feat(codecs): Allow encoder to add avro schema id prefix#25351

Open
YaZasnyal wants to merge 2 commits into vectordotdev:master from YaZasnyal:avro-encoder-schema-id

@YaZasnyal

Summary

This change adds an optional schema_id setting to the Avro encoder. When it is set, each Avro-encoded message is prefixed with that schema ID.

Vector configuration

sources:
  in:
    type: "stdin"

sinks:
  out:
    inputs:
      - "in"
    type: "console"
    encoding:
      codec: "avro"
      avro:
        schema: "{ \"type\": \"record\", \"name\": \"log\", \"fields\": [{ \"name\": \"message\", \"type\": \"string\" }] }"
        schema_id: 42
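
To make the effect of this configuration concrete, here is a minimal, dependency-free sketch of the bytes one would expect for a single event. It assumes the prefix follows the Confluent wire format (a zero magic byte, then the schema ID as a 4-byte big-endian integer, then the Avro binary body). The function names are hypothetical and are not taken from the PR's code.

```rust
/// Hypothetical helper: Avro `long` encoding (zigzag, then base-128 varint).
fn encode_avro_long(n: i64, out: &mut Vec<u8>) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        // Emit the low 7 bits with the continuation bit set.
        out.push(((z & 0x7F) as u8) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Hypothetical helper: an Avro `string` is its byte length as a `long`,
/// followed by the UTF-8 bytes.
fn encode_avro_string(s: &str, out: &mut Vec<u8>) {
    encode_avro_long(s.len() as i64, out);
    out.extend_from_slice(s.as_bytes());
}

/// Assumed framing (Confluent wire format): magic byte 0, 4-byte big-endian
/// schema ID, then the Avro body.
fn confluent_frame(schema_id: i32, avro_body: &[u8]) -> Vec<u8> {
    let mut framed = Vec::with_capacity(5 + avro_body.len());
    framed.push(0u8);
    framed.extend_from_slice(&schema_id.to_be_bytes());
    framed.extend_from_slice(avro_body);
    framed
}

/// Encodes one record of the `log` schema above, which has a single string
/// field named `message`. An Avro record is the concatenation of its fields.
fn encode_log_record(message: &str, schema_id: i32) -> Vec<u8> {
    let mut body = Vec::new();
    encode_avro_string(message, &mut body);
    confluent_frame(schema_id, &body)
}
```

Under these assumptions, `encode_log_record("hi", 42)` yields the eight bytes `00 00 00 00 2A 04 68 69`: the magic byte, the ID 42, the zigzag-encoded length 2, and the two characters.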

How did you test this PR?

I added a unit test for this PR and tested it locally with a stdin-to-stdout pipeline, which produced properly encoded messages.
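
One way to inspect such output by hand is to split a captured message back into its prefix and Avro body. The sketch below assumes the Confluent wire-format layout (magic byte 0, 4-byte big-endian schema ID, payload). The function name `split_confluent_frame` is hypothetical and is not part of the PR's test suite.

```rust
/// Hypothetical checker: returns the schema ID and the remaining Avro body,
/// or `None` if the frame is too short or does not start with magic byte 0.
fn split_confluent_frame(frame: &[u8]) -> Option<(i32, &[u8])> {
    if frame.len() < 5 || frame[0] != 0 {
        return None;
    }
    // Bytes 1..5 hold the schema ID in big-endian order.
    let id = i32::from_be_bytes([frame[1], frame[2], frame[3], frame[4]]);
    Some((id, &frame[5..]))
}
```

For a message produced with schema_id: 42, the first element of the returned pair should be 42, and the second should decode with the configured schema alone.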

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Closes: #19872

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.

@YaZasnyal YaZasnyal requested review from a team as code owners May 1, 2026 20:52
@github-actions github-actions Bot added the "work in progress" and "domain: external docs" (Anything related to Vector's external, public documentation) labels and removed the "work in progress" label May 1, 2026
@github-actions
Contributor

github-actions Bot commented May 1, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@YaZasnyal
Author

I have read the CLA Document and I hereby sign the CLA

@YaZasnyal YaZasnyal changed the title from "feat(kafka): Allow encoder to add avro schema id prefix" to "feat(codecs): Allow encoder to add avro schema id prefix" May 1, 2026
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 20c1397ae3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

///
/// [wire_format]: https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format
#[configurable(metadata(docs::examples = "42"))]
pub schema_id: Option<i32>,
P2: Reject negative schema IDs in Avro encoder config

The new schema_id field is declared as Option<i32>, and encode writes whatever value is provided directly into the Confluent wire-format prefix. This permits negative IDs (for example -1), which do not correspond to valid Schema Registry IDs and will produce Avro payloads that downstream consumers or broker-side schema validation cannot resolve. This issue is triggered whenever a negative schema_id is configured, so the config should enforce non-negative values (for example u32 or explicit validation in build).
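
A minimal sketch of the explicit-validation option mentioned above, assuming the check runs once when the encoder is built from its config. The function name `validate_schema_id` and the error text are hypothetical and do not come from the PR.

```rust
/// Hypothetical build-time check: accept only non-negative schema IDs and
/// hand back an unsigned value that cannot represent a negative ID.
fn validate_schema_id(schema_id: Option<i32>) -> Result<Option<u32>, String> {
    match schema_id {
        // No prefix requested: nothing to validate.
        None => Ok(None),
        Some(id) if id < 0 => Err(format!(
            "schema_id must be non-negative, got {}",
            id
        )),
        Some(id) => Ok(Some(id as u32)),
    }
}
```

Without such a check, a configured value of -1 would be written as the four bytes `FF FF FF FF`, since `(-1i32).to_be_bytes()` is all ones. Declaring the field as `Option<u32>` would rule this out at the type level instead, at the cost of also accepting values above `i32::MAX`.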

Useful? React with 👍 / 👎.

@YaZasnyal
Copy link
Copy Markdown
Author

YaZasnyal commented May 4, 2026

@vectordotdev, hi

What do you think about adding this parameter? I saw another PR that tries to add Schema Registry support, but it would require a huge refactoring because it introduces async calls into the sync codecs code. This feature will probably satisfy some users, including me. It should be forward compatible with a future Schema Registry integration.

Labels

domain: external docs Anything related to Vector's external, public documentation work in progress

Development

Successfully merging this pull request may close these issues.

Kafka Sink: Support encoding Confluent Kafka wire format

1 participant