[voltdb] Switch to native VoltDB Python client#23667
Conversation
Migrate from the deprecated HTTP/JSON interface to the native voltdbclient Python package, which speaks the binary protocol on the VoltDB client port. Instance configuration now takes 'host' and 'port' (default 21212) instead of 'url'. TLS is configured via 'use_ssl' and 'ssl_config_file'. The legacy 'url' option is still accepted with a deprecation warning so existing deployments keep working unchanged: the host is parsed from the URL and the default native client port is used. Statistics columns are now resolved by name against the VoltDB response metadata so the check tolerates VoltDB releases that add or drop columns to @statistics outputs.
|
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6abbd0a571
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
Confirming the major version bump is intentional. This PR swaps the wire protocol from HTTP/JSON to VoltDB's native binary protocol, which means:
A |
Addresses Codex review feedback: the previous code path passed procedure_timeout=None to FastSerializer when the option was omitted, which means 'wait indefinitely'. The old HTTP integration had a 10s default timeout, so leaving the native client with no default is a regression that could block a check run forever on a hung procedure. Default to 60s (matching the example we ship in conf.yaml.example). Setting procedure_timeout to 0 or any non-positive number restores the 'wait indefinitely' behavior for users who explicitly want it.
The CI 'Lint' step failed because ruff lint/isort is invoked from inside the integration directory with --config ../pyproject.toml, where the local package (datadog_checks.voltdb) is treated as first-party and needs its own import block. Running ruff from the repo root (as I did locally) didn't catch this. Apply the isort fix that ruff --fix produces under that working directory. Also regenerated LICENSE-3rdparty.csv via 'ddev validate licenses --sync' to add voltdbclient's MIT license entry, which CI flagged as missing. The other line changes in that file are ddev's current copyright-parser output for existing entries (no upstream changes), kept to satisfy the validator.
The previous commits replaced the HTTP/JSON transport with the native binary client. As feedback noted, some operators connect to VoltDB through the VoltDB Management Center (VMC) rather than directly to database nodes — those deployments need the HTTP transport. Make the transport choice config-driven instead of removing one of them: setting 'url' selects the HTTP client (talks to VMC), and setting 'host' selects the native binary client. Everything else (auth, statistics components, custom queries, tags) is shared. This restores full backwards compatibility for existing 'url'-based configs along with all their HTTP-only options (password_hashed, proxy, tls_cert / tls_ca_cert / tls_verify, headers, etc.) via the instances/http template, and reclassifies the changelog entry from 'changed' to 'added' since nothing is being removed. Common response shape: HttpClient wraps the JSON response so the check code reads response.tables[i].columns[j].name / .tuples on both paths, with no mode-specific branching in _execute_query_raw. New unit test 'test_http_mode_end_to_end' patches requests.Session.get and walks the HTTP code path against a fixture; existing native tests stay green. 27 unit tests pass; live native mode against VoltDB 14.2 still emits 44 metric families / 184 series cleanly.
Lets the Agent connect to whichever VoltDB cluster member is reachable instead of pinning to a single host. New `hosts` instance option takes a list of `hostname` or `hostname:port` strings; the native Client tries them in order on each (re)connect and surfaces the last error only when every endpoint refuses. Backwards compatible: single-`host:` configs keep working unchanged (they expand to a one-entry endpoint list). `hosts` takes precedence when both are set so users can opt into failover with a single add. Tested live against a local VoltDB 15.3 cluster: - `host: localhost` -> 44 metric families, 184 series (unchanged). - `hosts: [dead.example:21212, localhost:21212]` -> dead endpoint is skipped with a warning, real cluster picks up, active_endpoint correctly reflects 'localhost:21212'. Also tested live HTTP/VMC mode against the local VMC at port 8080: 44 metric families, 208 series, service check OK. Confirms the HTTP client wraps VMC's JSON response into the same shape `_execute_query_raw` expects and the unified code path works for both transports.
|
Updating my earlier confirmation: this is no longer a major version bump. After feedback that the HTTP transport needed to stay available for users connecting through VoltDB Management Center (VMC), I restored the HTTP client as a parallel transport rather than replacing it. Both transports now coexist, with Because nothing is removed and the wire-level behavior of existing Summary of what's preserved verbatim:
Summary of what's added (new options, all optional):
Live-tested against a local VoltDB 15.3 cluster: 44 metric families with |
CI runs ruff 0.11.10 under --config ../pyproject.toml from inside the integration directory. That older ruff is stricter about the isort boundary between third-party and first-party imports than the 0.15+ I had installed locally, so what passed on my machine still failed on the runner with 7 I001 errors across the tests/ tree. Pinned my local ruff to 0.11.10 to match CI exactly, ran `ruff check --fix --config ../pyproject.toml .`, and confirmed the resulting layout is what CI expects. 35 unit tests still pass. Also picks up the license-header fix on the new http_client.py (2020-present -> 2026-present, what ddev validate license-headers --fix expects for a newly added file) and the latest sync of LICENSE-3rdparty.csv.
Codecov Report❌ Patch coverage is Additional details and impacted files🚀 New features to boost your workflow:
|
voltdbclient supports pointing ssl_config_file at a PEM truststore without needing a Java keystore properties file. The compose fixtures already ship ca.pem; just reuse it for both host-side integration tests (tests/common.py:TLS_CONFIG_FILE) and the agent-container e2e path (tests/conftest.py:dd_environment). Fixes the FileNotFoundError for client_ssl.properties on the with-tls matrix variants.
Codecov flagged 53 lines without coverage on the patch. Most were in client.py's call_procedure retry-once path, type inference, and close() error swallowing, plus HttpClient's parameter encoding. New tests added (8): - test_client_requires_at_least_one_endpoint - test_client_call_procedure_returns_response - test_client_call_procedure_retries_once_on_stale_connection - test_client_raise_for_status (success + VoltDBError path) - test_client_close_is_idempotent (no-conn + open-then-close) - test_infer_volt_type_distinguishes_bool_int_float_string - test_http_client_serializes_list_params_as_json - test_http_client_raise_for_status Local unit-test coverage: client.py 53% -> 91% (+38pp) http_client.py 89% -> 94% (+5pp) total 88% -> 94% (+6pp) All 43 unit tests pass; ruff check and ruff format --diff --check both green under CI's pinned ruff 0.11.10.
Not a swap old protocol (http(s)) works as before allows native client to be used for those who disable vmc or is removed from voltdb servers in newer versions. |
Using |
|
Thanks for the approval, @brett0000FF! 🙏 Two remaining checks need someone with write access to the repo to finish — I tried both as the PR author from my fork and got
When you have a minute, would you mind running the Status snapshot:
|
What does this PR do?
Adds the native VoltDB Python client (binary protocol on the VoltDB client port, default
21212) as a new transport for the integration, alongside the existing HTTP/JSON transport that talks to the VoltDB Management Center (VMC).The transport is selected by which option is set in
voltdb.d/conf.yaml:url:→ HTTP/JSON via VMC (existing behavior, kept verbatim)host:/hosts:→ native binary client (new)For multi-node clusters, the new
hosts:option takes a list ofhostnameorhostname:portstrings; the Agent connects to the first reachable entry and silently fails over to the others if a node becomes unavailable:Statistics columns are now resolved by name against the VoltDB response metadata, so the check tolerates VoltDB releases that add or drop columns to
@Statisticsoutputs (verified against 14.2 and 15.3 locally).Not a breaking change
Existing
url:configurations continue to work without any modifications. This PR is additive:url,username,password,password_hashed,tls_cert,tls_ca_cert,tls_verify,tls_private_key,tls_ciphers,proxy,skip_proxy,headers,extra_headers,connect_timeout,read_timeout,timeout,request_size, etc.) is preserved via the sameinstances/httptemplate the integration used before.url: http://vmc.example:8080config (or pointing at the database HTTP port) does not need to change anything. The check picks the HTTP transport and emits the same metrics it always has against the same endpoint.<user>entries indeployment.xml.The changelog entry is
added(notchanged/removed), so this is a minor version bump (6.4.1 → 6.5.0).Motivation
Submitted on behalf of Volt Active Data. The native Python client is the supported integration surface for direct database-node connections and exposes the same
@Statisticsdata over a stable binary protocol. Adding it as an option (without removing HTTP/VMC) lets operators choose the right transport for their topology: native for direct connections to nodes, HTTP via VMC when only the management interface is reachable.Verified live (VoltDB 14.2 + 15.3, single-node)
host: localhosthosts: [dead.example:21212, localhost:21212]localhostpicks up, 44 families / 232 seriesurl: http://localhost:8080Plus 35 unit tests covering: URL→HTTP-mode dispatch, host→native-mode dispatch, hosts-list expansion,
hosts:precedence overhost:, port-parse validation, failover behavior in the native client, end-to-end HTTP request shape (viarequests.Session.getmock), defensive@Statisticscolumn-by-name lookup,procedure_timeoutdefault of 60s.Notes for reviewers
voltdb/tests/compose/docker-compose.yamlmatrix file changes its published port from8080(HTTP) to21212(native) so the integration tests can exercise the new transport. The existing TLS/non-TLS variants inhatch.tomlare kept.voltdb/datadog_checks/voltdb/config_models/instance.pyis autogenerated fromassets/configuration/spec.yaml; runddev validate config -s voltdb --syncandddev validate models -s voltdb --syncto regenerate if you tweak the spec.password_hashedconfig-side affordance only takes effect in HTTP mode (the nativevoltdbclientlibrary always hashes the cleartext password client-side, so a pre-hashed value can't be wired through that auth handshake). README and conf example call this out.Review checklist (to be filled by reviewers)
qa/skip-qalabel if the PR doesn't need to be tested during QA.backport/<branch-name>label to the PR and it will automatically open a backport PR once this one is merged