input_metric: split large metric contexts into chunk-sized batches#11554
thewillyhuman wants to merge 1 commit into `fluent:master`
Conversation
📝 Walkthrough

Adds internal batch-splitting logic to the input metrics path to handle oversized encoded metric batches by copying static labels/metadata and iteratively creating, encoding, and appending smaller batches; also adds unit tests exercising copy, splitting, mixed-type batching, and size calculations.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Input as Input Metrics
    participant Append as input_metrics_append
    participant Fast as Fast Path
    participant Split as input_metrics_split_and_append
    participant Builder as Batch Builder (tmp cmt)
    participant Encoder as Encoder (msgpack)
    participant Out as Output Context
    Input->>Append: request append metric batch
    Append->>Encoder: attempt encode whole batch
    alt encoded size <= chunk
        Encoder-->>Append: encoded data
        Append->>Fast: fast append to Out
        Fast->>Out: append chunk
        Out-->>Append: success
    else encoded size > chunk
        Encoder-->>Append: over limit
        Append->>Split: split and append
        Split->>Builder: compute families_per_batch & create batch contexts
        loop for each batch
            Builder->>Encoder: encode batch families
            Encoder-->>Builder: encoded chunk
            Builder->>Out: append encoded chunk (copy labels/metadata)
            Out-->>Builder: success/error (no rollback)
            Builder->>Builder: destroy batch context
        end
        Split-->>Append: return aggregated result
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b580c0b001
| "could not create batch cmt context"); \ | ||
| goto error; \ | ||
| } \ | ||
| ret = copy_static_labels(batch, src); \ |
Preserve metadata when splitting oversized metric contexts
Each split batch is created from `cmt_create()` and only static labels are copied (`copy_static_labels`), so the original `internal_metadata`/`external_metadata` are dropped whenever the slow-path splitter is used. That is a functional regression for large contexts because OTLP encoding relies on `cmt->external_metadata` (resource/scope/data-point metadata) to preserve grouping and attributes, so oversized OpenTelemetry payloads lose metadata that is preserved for non-split payloads.
Fixed. Added copy_metadata() helper that deep-copies both internal_metadata and external_metadata kvlists to each batch context. It iterates each kvpair and copies by variant type (string, int64, uint64, double, bool, bytes). This is now called alongside copy_static_labels() when creating each batch, so OTLP resource/scope metadata is preserved across splits.
Actionable comments posted: 3
🧹 Nitpick comments (1)

tests/internal/input_metric.c (1)

127-223: The split tests don't exercise the production path. Both tests hand-roll the batching loop instead of calling `flb_input_metrics_append()`, and they only count recovered families. That misses regressions in the real helper such as dropped static labels/metadata, append-path cleanup, or batches that still exceed `FLB_INPUT_CHUNK_FS_MAX_SIZE`. Please add at least one end-to-end test with uneven family sizes and assert every emitted chunk stays within the limit.

Also applies to: 231-355
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/internal/input_metric.c` around lines 127 - 223, The test currently hand-rolls batching logic in test_batch_split_preserves_families() instead of exercising the real production helper; update or add an end-to-end test that builds a src cmt with uneven family sizes and then calls flb_input_metrics_append() (the real append/splitting path) to produce chunks, verify every emitted chunk size is <= FLB_INPUT_CHUNK_FS_MAX_SIZE and that static labels/metadata from the original counters are preserved in each decoded chunk (use cmt_decode_msgpack_create to inspect chunks), and assert the total recovered families equals the source; target symbols: flb_input_metrics_append, FLB_INPUT_CHUNK_FS_MAX_SIZE, cmt_encode_msgpack_create/cmt_decode_msgpack_create, and the test function test_batch_split_preserves_families (or add a new test function) so the real append logic is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/flb_input_metric.c`:
- Around line 149-158: The new batch creation path (inside the batch == NULL
branch where cmt_create() is called and copy_static_labels(batch, src) is
invoked) currently drops cmt-level internal/external metadata present on src
before cmt_cat_*() is used, causing payload divergence with the fast path; fix
this by copying the src cmt-level KV lists into the new batch after cmt_create()
(or detect their presence and fail early) — add a call to copy the
internal/external cmt metadata into batch (or bail out with flb_plg_error and
goto error when such metadata exists) so that batch contains the same cmt-level
metadata as src prior to any cmt_cat_*() operations.
- Around line 117-125: The current heuristic computes families_per_batch from a
global average (numerator/total_encoded_size) which can still produce mt_size >
FLB_INPUT_CHUNK_FS_MAX_SIZE for skewed family sizes; replace that estimate with
an actual-encoded-size loop: start with families_per_batch = 0 (but ensure at
least 1), then iteratively encode/measure the batch payload (using the same
encoding path that produces mt_size) adding one family at a time until adding
the next family would exceed FLB_INPUT_CHUNK_FS_MAX_SIZE, and use that count for
the batch; reference and remove reliance on numerator/total_encoded_size and
ensure the mt_size check is done proactively before appending the chunk rather
than only warning after append.
In `@tests/internal/input_metric.c`:
- Around line 105-112: The test only asserts size_src == size_dst which doesn't
prove cmt_cat_counter() copied data correctly; instead decode and compare
payloads or decoded structures: after calling cmt_encode_msgpack_create(src,
&buf_src, &size_src) and cmt_encode_msgpack_create(dst, &buf_dst, &size_dst) use
the msgpack decoder used in tests (or a deterministic byte-for-byte comparison
if the encoder is deterministic) to decode buf_dst and buf_src and assert that
the decoded family/counter entries (labels, sample values, timestamps) are
equal; alternatively, decode buf_dst and validate the specific counter
family/metric values expected from the source to ensure cmt_cat_counter()
preserved labels and samples rather than relying on size equality.
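The alternative the reviewers suggest for the batch-size heuristic, growing each batch family by family until the next one would exceed the limit, can be sketched as follows. This is an illustrative stand-in that packs precomputed per-family encoded sizes; `pick_batch_count` and its parameters are hypothetical names, not the actual Fluent Bit code, which would re-encode the candidate batch to measure it.

```c
#include <stddef.h>

/* Greedy packing sketch: starting at index 'start', take families until
 * adding one more would push the batch past chunk_limit. Always takes at
 * least one family, mirroring the "minimum 1" rule (a single oversized
 * family cannot be split further). */
static size_t pick_batch_count(const size_t *family_sizes, size_t n_families,
                               size_t start, size_t chunk_limit)
{
    size_t count = 0;
    size_t total = 0;
    size_t i;

    for (i = start; i < n_families; i++) {
        if (count > 0 && total + family_sizes[i] > chunk_limit) {
            break; /* next family would exceed the limit */
        }
        total += family_sizes[i];
        count++;
    }
    return count;
}
```

Unlike the global-average estimate, this bounds every batch by measured size, so skewed family sizes cannot produce an over-limit chunk (except for a single family that is itself over the limit).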
ℹ️ Review info
📒 Files selected for processing (3): src/flb_input_metric.c, tests/internal/CMakeLists.txt, tests/internal/input_metric.c
Addresses fluent#9653

Large metric scrapes (e.g., 200K+ metrics from prometheus_scrape) were encoded as a single chunk exceeding the 2MB limit, breaking downstream plugins like out_opentelemetry. Add a size-aware split path to flb_input_metrics_append() that batches metric families into chunk-sized cmt contexts using cmt_cat_*() before encoding. Small payloads are unaffected (fast path preserved).

Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed b580c0b to eb84b31.
🧹 Nitpick comments (1)

src/flb_input_metric.c (1)

108-111: Consider logging when unsupported kvlist types are skipped. Silently skipping arrays and nested kvlists may cause subtle data loss if OTLP resource/scope metadata uses these types. A debug-level log could help with troubleshooting when expected metadata is missing.
💡 Suggested enhancement

```diff
     default:
-        /* Skip unsupported types (arrays, nested kvlists, references) */
-        ret = 0;
+        /* Skip unsupported types (arrays, nested kvlists, references).
+         * Log at trace level for debugging metadata preservation issues. */
+        flb_trace("[copy_kvlist] skipping unsupported type %d for key '%s'",
+                  pair->val->type, pair->key);
+        ret = 0;
         break;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/flb_input_metric.c` around lines 108 - 111, In the switch's default branch (where it currently does "ret = 0;" and breaks) add a debug-level log that notes an unsupported kvlist type was skipped, include context like the key name and the type encountered; update the handling in the same function in src/flb_input_metric.c (the default switch case that sets ret = 0) to call the module logger at debug level before returning so operators can trace missing OTLP resource/scope metadata without changing behavior.
ℹ️ Review info

📒 Files selected for processing (3): src/flb_input_metric.c, tests/internal/CMakeLists.txt, tests/internal/input_metric.c

🚧 Files skipped from review as they are similar to previous changes (1): tests/internal/CMakeLists.txt
Addresses #9653
Summary

Size-aware splitting is added to `flb_input_metrics_append()` so large metric payloads are automatically split into multiple chunks respecting the 2MB chunk size limit.

Problem
When using input plugins that collect a large number of metrics in a single operation (e.g., `prometheus_scrape` with 200K+ metrics), `flb_input_metrics_append()` encodes the entire CMetrics context into one monolithic msgpack buffer and writes it as a single chunk.

This chunk can far exceed the 2MB chunk size limit (`FLB_INPUT_CHUNK_FS_MAX_SIZE`), causing downstream plugins, notably `out_opentelemetry`, to fail when processing the oversized payload.

The existing 2MB chunk limit only locks a chunk after it is already oversized (preventing future writes to it), but does nothing to prevent a single write from exceeding it in the first place. There was simply no mechanism to split a large metric payload across multiple chunks.
This did not affect `in_prometheus_remote_write` because the sender controls batch size: each HTTP POST is a separate `flb_input_metrics_append()` call that naturally produces small chunks. But any input that collects all metrics in one shot (`prometheus_scrape`, `node_exporter_metrics`, etc.) is affected.

Root cause
`flb_input_metrics_append()` unconditionally:

- encodes the full `cmt` context into a single msgpack buffer via `cmt_encode_msgpack_create()`
- calls `flb_input_chunk_append_raw()` with the entire buffer as one unit
- passes `n_records = 0` for metrics, so the chunk system has no way to make record-level decisions

No size checking, no splitting, no batching existed.
Fix

A size-aware splitting path is added to `input_metrics_append()` in `src/flb_input_metric.c`:

Fast path (unchanged behavior): encode the full `cmt` context; if the encoded size is <= 2MB, append as-is. Zero behavioral change for small payloads.

Slow path (new): when the encoded size exceeds 2MB:

- count metric families across the six type lists (`counters`, `gauges`, `untypeds`, `histograms`, `exp_histograms`, `summaries`)
- compute `families_per_batch` proportionally: `total_families * chunk_limit / encoded_size` (minimum 1, using 64-bit arithmetic to avoid overflow on 32-bit platforms)
- use `cmt_cat_*()` to copy each family into a temporary `cmt` batch context, then encode and append each batch

Static labels are copied to each batch. The splitting granularity is at the metric family level; individual families are never broken apart. If a single family still exceeds the chunk limit, a warning is logged since no further splitting is possible.
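The proportional estimate above can be sketched in isolation. This is an illustrative version with hypothetical names, not the actual Fluent Bit symbols; it only shows the arithmetic and the minimum-1 clamp.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of families_per_batch = total_families * chunk_limit /
 * encoded_size, clamped to at least 1. The 64-bit numerator avoids
 * overflow on 32-bit platforms where size_t is 32 bits. */
static size_t families_per_batch(size_t total_families, size_t encoded_size,
                                 size_t chunk_limit)
{
    uint64_t numerator;
    size_t per_batch;

    if (total_families == 0 || encoded_size == 0) {
        return 1;
    }
    numerator = (uint64_t) total_families * (uint64_t) chunk_limit;
    per_batch = (size_t) (numerator / (uint64_t) encoded_size);
    return per_batch > 0 ? per_batch : 1;
}
```

For example, 200,000 families that encode to ~10MB with a 2MB chunk limit yield roughly 40,000 families per batch, i.e., about five batches.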
The fix is in `flb_input_metrics_append()` so it benefits all metric inputs, not just `prometheus_scrape`.

New helper functions

- `copy_static_labels()`: copies static labels from the source `cmt` context to a batch context
- `input_metrics_split_and_append()`: builds the batches, encodes them, and appends each as its own chunk

Design decisions
- 64-bit arithmetic (`uint64_t`) for the batch size numerator to avoid overflow on 32-bit platforms (e.g., embedded ARM where Fluent Bit also runs)
- `cfl_list_foreach_safe` is used for list iteration as a defensive measure in case `cmt_cat_*()` functions ever modify source list nodes
- `goto`-based centralized error cleanup instead of `return` from a macro, making resource management explicit and debugger-friendly
- metadata `kvlists` are not copied to batches to avoid complexity; the primary consumers do not depend on per-chunk metadata
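The `goto`-based centralized cleanup mentioned above is a common C idiom; a minimal self-contained sketch follows. The buffers stand in for the temporary batch contexts and encoded payloads, and `build_and_emit` is a hypothetical name, not the actual function.

```c
#include <stdlib.h>

/* Every failure path jumps to one label that releases whatever was
 * allocated so far; free(NULL) is a no-op, so partially constructed
 * state is safe to tear down. The success path also reaches the label,
 * since these are temporary resources either way. */
static int build_and_emit(size_t n)
{
    char *batch = NULL;
    char *encoded = NULL;
    int ret = -1;

    batch = malloc(n);
    if (batch == NULL) {
        goto error;
    }
    encoded = malloc(n * 2);
    if (encoded == NULL) {
        goto error;
    }
    /* ... encoding and appending would happen here ... */
    ret = 0;

error:
    free(encoded);
    free(batch);
    return ret;
}
```

Compared with a `return` inside a macro, this keeps every exit point visible in the function body, which is easier to audit and to step through in a debugger.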
Testing

All 81 internal tests pass (80 existing + 1 new test file).

New test file `tests/internal/input_metric.c` with 6 test cases:

- `static_labels_copy`: static labels are copied between `cmt` contexts
- `cat_single_counter`: `cmt_cat_counter` produces identical encoded output
- `batch_split_preserves_families`
- `batch_split_mixed_types`
- `empty_context`: an empty `cmt` has 0 families in all 6 lists
- `families_per_batch_calculation`

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.