fix(databricks_zerobus sink): defer Unity Catalog schema fetch out of build()#25408

Open
flaviofcruz wants to merge 5 commits into vectordotdev:master from flaviofcruz:zerobus-defer-descriptor


@flaviofcruz flaviofcruz commented May 11, 2026

Summary

The databricks_zerobus sink's SinkConfig::build() synchronously calls Unity Catalog to fetch the table's protobuf descriptor. If the table doesn't exist or the credentials are wrong, the call fails inside build() and Vector exits before the sink starts, even when healthcheck.enabled = false.

This aligns the sink with the convention used by AWS S3, Kafka, etc.: build() does only local setup, the healthcheck does the remote probe, and runtime failures surface per-batch via the existing retry/event-status path.

Changes:

  • The UC descriptor, encoder, and stream mode are resolved lazily via tokio::sync::OnceCell on first use. A failed resolution leaves the cell unset, so the next use attempts it again.
  • Event encoding moves from ZerobusSink into ZerobusService::encode_batch, since the encoder now depends on the lazily-resolved descriptor.
  • ZerobusService::new only builds the SDK client and the HTTP client (both local).
  • ensure_stream (the healthcheck) is the natural gate: it triggers schema resolution + stream creation, and runs the same way on first ingest if the healthcheck is disabled.
  • Replaced the eagerly stashed ProxyConfig with an HttpClient built once in new, so fetch_table_schema takes &HttpClient and no longer rebuilds the client on every retry.
  • Added logic to classify schema fetch errors as retryable or non-retryable. If UC fails transiently, the fetch is retried during ingestion. This could also open the door to re-fetching the schema dynamically over time.
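
The lazy resolution described in the list above can be sketched as follows. This is a minimal, standard-library-only illustration and not the PR's code: the PR uses tokio::sync::OnceCell, while this sketch swaps in a Mutex<Option<..>> to show the same "cache on success, retry after failure" behaviour without an async runtime. The names ResolvedSchema, LazySchema, and get_or_try_resolve are hypothetical.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the state resolved from Unity Catalog
/// (descriptor, encoder, stream mode); not the sink's real type.
#[derive(Clone, Debug, PartialEq)]
struct ResolvedSchema {
    descriptor: String,
}

/// Caches the schema once resolution succeeds. A failed resolution stores
/// nothing, so the next caller runs the fetch again.
struct LazySchema {
    cell: Mutex<Option<ResolvedSchema>>,
}

impl LazySchema {
    fn new() -> Self {
        Self { cell: Mutex::new(None) }
    }

    /// Returns the cached schema, or runs `fetch` and caches its result
    /// only if it succeeded.
    fn get_or_try_resolve<F>(&self, fetch: F) -> Result<ResolvedSchema, String>
    where
        F: FnOnce() -> Result<ResolvedSchema, String>,
    {
        let mut guard = self.cell.lock().unwrap();
        if let Some(schema) = guard.as_ref() {
            return Ok(schema.clone());
        }
        // On Err the `?` returns early and the cell stays empty.
        let schema = fetch()?;
        *guard = Some(schema.clone());
        Ok(schema)
    }
}
```

With this shape, a healthcheck and the first ingested batch can both call get_or_try_resolve: whichever runs first pays for the fetch, and a transient failure in one does not poison the other.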

Vector configuration

  sinks:
    zb:
      type: databricks_zerobus
      inputs: [demo]
      ingestion_endpoint: "https://ingest.dev.databricks.com"
      unity_catalog_endpoint: "https://workspace.cloud.databricks.com"
      table_name: "main.default.does_not_exist"
      auth:
        strategy: oauth
        client_id: "${DATABRICKS_CLIENT_ID}"
        client_secret: "${DATABRICKS_CLIENT_SECRET}"
      healthcheck:
        enabled: false

Before: Vector exits at startup with Unity Catalog API returned error 404: ....
After: Vector starts; batches are rejected at ingest time with the same error logged per batch.
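
As an illustration of the retryable-or-not decision mentioned in the changes list, the sketch below shows one plausible classification. The PR description does not say which errors it treats as retryable, so the policy here (transport failures, 429, and 5xx are transient; other statuses such as the 404 above are permanent) is an assumption, and SchemaFetchError and is_retryable are hypothetical names rather than the sink's actual types.

```rust
/// Hypothetical error type for a schema fetch; not the sink's real type.
#[derive(Debug, PartialEq)]
enum SchemaFetchError {
    /// The request never produced a response (DNS, connect, timeout).
    Transport,
    /// Unity Catalog answered with a non-success HTTP status code.
    Status(u16),
}

/// Assumed policy: transport failures, 429, and 5xx are worth retrying;
/// every other status (for example 401 or 404) is treated as permanent.
fn is_retryable(err: &SchemaFetchError) -> bool {
    match err {
        SchemaFetchError::Transport => true,
        SchemaFetchError::Status(code) => *code == 429 || (500..=599).contains(code),
    }
}
```

Under this assumed policy, a missing table fails each batch immediately with the logged 404, while a brief UC outage would be retried on the existing retry path.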

How did you test this PR?

  • Manual smoke tests still to do:
    • Non-existent table with default settings → sink starts, events rejected.
    • Bad OAuth credentials → sink starts, events rejected.
    • healthcheck.enabled = false with unreachable UC → sink starts.
    • require_healthy = true → Vector exits (opt-in fail-fast preserved).

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR changes Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.

@github-actions github-actions Bot added the domain: sinks label May 11, 2026
@flaviofcruz flaviofcruz force-pushed the zerobus-defer-descriptor branch from 30bffb2 to 83ebd94 Compare May 11, 2026 16:45
@flaviofcruz flaviofcruz marked this pull request as ready for review May 11, 2026 19:46
@flaviofcruz flaviofcruz requested a review from a team as a code owner May 11, 2026 19:46