
input_metric: split large metric contexts into chunk-sized batches#11554

Open
thewillyhuman wants to merge 1 commit into fluent:master from thewillyhuman:gfacundo/fix-metric-chunk-splitting

Conversation


@thewillyhuman thewillyhuman commented Mar 13, 2026

Addresses #9653

Summary

  • Add size-aware batch splitting to flb_input_metrics_append() so large metric payloads are automatically split into multiple chunks respecting the 2MB chunk size limit
  • Add unit tests for the splitting logic (6 test cases)

Problem

When using input plugins that collect a large number of metrics in a single operation (e.g., prometheus_scrape with 200K+ metrics), flb_input_metrics_append() encodes the entire CMetrics context into one monolithic msgpack buffer and writes it as a single chunk.

This chunk can far exceed the 2MB chunk size limit (FLB_INPUT_CHUNK_FS_MAX_SIZE), causing downstream plugins — notably out_opentelemetry — to fail when processing the oversized payload.

The existing 2MB chunk limit only locks a chunk after it is already oversized (preventing future writes to it), but does nothing to prevent a single write from exceeding it in the first place. There was simply no mechanism to split a large metric payload across multiple chunks.

This did not affect in_prometheus_remote_write because the sender controls batch size — each HTTP POST is a separate flb_input_metrics_append() call that naturally produces small chunks. But any input that collects all metrics in one shot (prometheus_scrape, node_exporter_metrics, etc.) is affected.

Root cause

flb_input_metrics_append() unconditionally:

  1. Encodes the full cmt context into a single msgpack buffer via cmt_encode_msgpack_create()
  2. Calls flb_input_chunk_append_raw() with the entire buffer as one unit
  3. Passes n_records = 0 for metrics, so the chunk system has no way to make record-level decisions

No size checking, no splitting, no batching existed.

Fix

A size-aware splitting path is added to input_metrics_append() in src/flb_input_metric.c:

Fast path (unchanged behavior): Encode the full cmt context. If the encoded size is <= 2MB, append as-is. Zero behavioral change for small payloads.

Slow path (new): When encoded size exceeds 2MB:

  1. Count total metric families across all 6 type lists (counters, gauges, untypeds, histograms, exp_histograms, summaries)
  2. Estimate families_per_batch proportionally: total_families * chunk_limit / encoded_size (minimum 1, using 64-bit arithmetic to avoid overflow on 32-bit platforms)
  3. Iterate all families using cmt_cat_*() to copy each into a temporary cmt batch context
  4. When a batch reaches the target family count, encode to msgpack and append as its own chunk
  5. Flush any remaining partial batch at the end

Static labels are copied to each batch. The splitting granularity is at the metric family level — individual families are never broken apart. If a single family still exceeds the chunk limit, a warning is logged since no further splitting is possible.

The fix is in flb_input_metrics_append() so it benefits all metric inputs, not just prometheus_scrape.

New helper functions

| Function | Purpose |
| --- | --- |
| copy_static_labels() | Copies static labels from the source to the batch cmt context |
| input_metrics_split_and_append() | Core splitting logic: counts families, estimates batch size, iterates and flushes batches |

Design decisions

  • 64-bit arithmetic (uint64_t) for the batch size numerator to avoid overflow on 32-bit platforms (e.g., embedded ARM where Fluent Bit also runs)
  • cfl_list_foreach_safe used for list iteration as a defensive measure in case cmt_cat_*() functions ever modify source list nodes
  • goto-based centralized error cleanup instead of return from macro, making resource management explicit and debugger-friendly
  • Partial-append-on-error: if a mid-batch failure occurs, previously appended batches are NOT rolled back. This is acceptable because metrics consumers (Prometheus, OTLP) handle partial scrapes gracefully
  • No metadata deep-copy: internal/external kvlists are not copied to batches to avoid complexity; the primary consumers do not depend on per-chunk metadata

Testing

All 81 internal tests pass (80 existing + 1 new test file).

New test file tests/internal/input_metric.c with 6 test cases:

| Test | What it verifies |
| --- | --- |
| static_labels_copy | Static labels are correctly copied between cmt contexts |
| cat_single_counter | cmt_cat_counter produces identical encoded output |
| batch_split_preserves_families | 20 families split into batches of 5 -> 4 batches, all 20 recovered via an encode-decode round-trip |
| batch_split_mixed_types | Mixed counters + gauges split correctly across batch boundaries |
| empty_context | An empty cmt has 0 families in all 6 lists |
| families_per_batch_calculation | Batch size formula correctness, including edge cases (single family, large counts) |

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Automatic splitting of oversized metric batches to stay within encoding limits, preventing dropped data.
    • Preservation of static labels and metric metadata across split batches so metrics remain consistent.
    • Robust per-batch processing that continues appending successfully encoded batches even if a later batch fails.
  • Tests

    • Comprehensive unit tests covering batch splitting, mixed metric types, label/metadata preservation, empty contexts, and batch-size calculations.


coderabbitai bot commented Mar 13, 2026

📝 Walkthrough


Adds internal batch-splitting logic to the input metrics path to handle oversized encoded metric batches by copying static labels/metadata and iteratively creating, encoding, and appending smaller batches; also adds unit tests exercising copy, splitting, mixed-type batching, and size calculations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core Batching Implementation: src/flb_input_metric.c | Introduces static helpers to copy static labels/metadata, adds input_metrics_split_and_append to build/encode/append multiple batches when encoded size exceeds the chunk limit, refactors input_metrics_append to select fast/slow paths, adds a PROCESS_METRIC_LIST macro, adjusts out_context lifecycle and logging. |
| Test Build: tests/internal/CMakeLists.txt | Adds tests/internal/input_metric.c to UNIT_TESTS_FILES so the new tests are built. |
| Unit Tests: tests/internal/input_metric.c | New tests for static label copying, single-family encode/decode, batch splitting (homogeneous and mixed types), empty context behavior, and families-per-batch size calculations using msgpack validation. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Input as Input Metrics
    participant Append as input_metrics_append
    participant Fast as Fast Path
    participant Split as input_metrics_split_and_append
    participant Builder as Batch Builder (tmp cmt)
    participant Encoder as Encoder (msgpack)
    participant Out as Output Context

    Input->>Append: request append metric batch
    Append->>Encoder: attempt encode whole batch
    alt encoded size <= chunk
        Encoder-->>Append: encoded data
        Append->>Fast: fast append to Out
        Fast->>Out: append chunk
        Out-->>Append: success
    else encoded size > chunk
        Encoder-->>Append: over limit
        Append->>Split: split and append
        Split->>Builder: compute families_per_batch & create batch contexts
        loop for each batch
            Builder->>Encoder: encode batch families
            Encoder-->>Builder: encoded chunk
            Builder->>Out: append encoded chunk (copy labels/metadata)
            Out-->>Builder: success/error (no rollback)
            Builder->>Builder: destroy batch context
        end
        Split-->>Append: return aggregated result
    end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

backport to v4.2.x

Suggested reviewers

  • edsiper
  • cosmo0920

"I’m a rabbit, nibbling code so neat,
batches split in tidy rows, a treat,
labels hop along, preserved and bright,
tests verify each little flight,
now metrics dance into the night." 🐇✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding batch splitting for oversized metric contexts to respect chunk size limits. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.91%, which is sufficient. The required threshold is 80.00%. |




@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b580c0b001


```c
                  "could not create batch cmt context"); \
    goto error; \
} \
ret = copy_static_labels(batch, src); \
```


P1: Preserve metadata when splitting oversized metric contexts

Each split batch is created from cmt_create() and only static labels are copied (copy_static_labels), so the original internal_metadata/external_metadata are dropped whenever the slow-path splitter is used. That is a functional regression for large contexts because OTLP encoding relies on cmt->external_metadata (resource/scope/data-point metadata) to preserve grouping and attributes, so oversized OpenTelemetry payloads lose metadata that is preserved for non-split payloads.


thewillyhuman (Author) replied:

Fixed. Added copy_metadata() helper that deep-copies both internal_metadata and external_metadata kvlists to each batch context. It iterates each kvpair and copies by variant type (string, int64, uint64, double, bool, bytes). This is now called alongside copy_static_labels() when creating each batch, so OTLP resource/scope metadata is preserved across splits.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
tests/internal/input_metric.c (1)

127-223: The split tests don't exercise the production path.

Both tests hand-roll the batching loop instead of calling flb_input_metrics_append(), and they only count recovered families. That misses regressions in the real helper such as dropped static labels/metadata, append-path cleanup, or batches that still exceed FLB_INPUT_CHUNK_FS_MAX_SIZE. Please add at least one end-to-end test with uneven family sizes and assert every emitted chunk stays within the limit.

Also applies to: 231-355

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/internal/input_metric.c` around lines 127 - 223, The test currently
hand-rolls batching logic in test_batch_split_preserves_families() instead of
exercising the real production helper; update or add an end-to-end test that
builds a src cmt with uneven family sizes and then calls
flb_input_metrics_append() (the real append/splitting path) to produce chunks,
verify every emitted chunk size is <= FLB_INPUT_CHUNK_FS_MAX_SIZE and that
static labels/metadata from the original counters are preserved in each decoded
chunk (use cmt_decode_msgpack_create to inspect chunks), and assert the total
recovered families equals the source; target symbols: flb_input_metrics_append,
FLB_INPUT_CHUNK_FS_MAX_SIZE,
cmt_encode_msgpack_create/cmt_decode_msgpack_create, and the test function
test_batch_split_preserves_families (or add a new test function) so the real
append logic is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/flb_input_metric.c`:
- Around line 149-158: The new batch creation path (inside the batch == NULL
branch where cmt_create() is called and copy_static_labels(batch, src) is
invoked) currently drops cmt-level internal/external metadata present on src
before cmt_cat_*() is used, causing payload divergence with the fast path; fix
this by copying the src cmt-level KV lists into the new batch after cmt_create()
(or detect their presence and fail early) — add a call to copy the
internal/external cmt metadata into batch (or bail out with flb_plg_error and
goto error when such metadata exists) so that batch contains the same cmt-level
metadata as src prior to any cmt_cat_*() operations.
- Around line 117-125: The current heuristic computes families_per_batch from a
global average (numerator/total_encoded_size) which can still produce mt_size >
FLB_INPUT_CHUNK_FS_MAX_SIZE for skewed family sizes; replace that estimate with
an actual-encoded-size loop: start with families_per_batch = 0 (but ensure at
least 1), then iteratively encode/measure the batch payload (using the same
encoding path that produces mt_size) adding one family at a time until adding
the next family would exceed FLB_INPUT_CHUNK_FS_MAX_SIZE, and use that count for
the batch; reference and remove reliance on numerator/total_encoded_size and
ensure the mt_size check is done proactively before appending the chunk rather
than only warning after append.

In `@tests/internal/input_metric.c`:
- Around line 105-112: The test only asserts size_src == size_dst which doesn't
prove cmt_cat_counter() copied data correctly; instead decode and compare
payloads or decoded structures: after calling cmt_encode_msgpack_create(src,
&buf_src, &size_src) and cmt_encode_msgpack_create(dst, &buf_dst, &size_dst) use
the msgpack decoder used in tests (or a deterministic byte-for-byte comparison
if the encoder is deterministic) to decode buf_dst and buf_src and assert that
the decoded family/counter entries (labels, sample values, timestamps) are
equal; alternatively, decode buf_dst and validate the specific counter
family/metric values expected from the source to ensure cmt_cat_counter()
preserved labels and samples rather than relying on size equality.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8bcc8f49-d66f-4815-bd0c-29549da5f423

📥 Commits

Reviewing files that changed from the base of the PR and between acad02a and b580c0b.

📒 Files selected for processing (3)
  • src/flb_input_metric.c
  • tests/internal/CMakeLists.txt
  • tests/internal/input_metric.c

Addresses fluent#9653

Large metric scrapes (e.g., 200K+ metrics from prometheus_scrape)
were encoded as a single chunk exceeding the 2MB limit, breaking
downstream plugins like out_opentelemetry. Add a size-aware split
path to flb_input_metrics_append() that batches metric families
into chunk-sized cmt contexts using cmt_cat_*() before encoding.
Small payloads are unaffected (fast path preserved).

Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
thewillyhuman force-pushed the gfacundo/fix-metric-chunk-splitting branch from b580c0b to eb84b31 on March 13, 2026 at 17:42.

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/flb_input_metric.c (1)

108-111: Consider logging when unsupported kvlist types are skipped.

Silently skipping arrays and nested kvlists may cause subtle data loss if OTLP resource/scope metadata uses these types. A debug-level log could help with troubleshooting when expected metadata is missing.

💡 Suggested enhancement
```diff
         default:
-            /* Skip unsupported types (arrays, nested kvlists, references) */
-            ret = 0;
+            /* Skip unsupported types (arrays, nested kvlists, references).
+             * Log at trace level for debugging metadata preservation issues. */
+            flb_trace("[copy_kvlist] skipping unsupported type %d for key '%s'",
+                      pair->val->type, pair->key);
+            ret = 0;
             break;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input_metric.c` around lines 108 - 111, In the switch's default
branch (where it currently does "ret = 0;" and breaks) add a debug-level log
that notes an unsupported kvlist type was skipped, include context like the key
name and the type encountered; update the handling in the same function in
src/flb_input_metric.c (the default switch case that sets ret = 0) to call the
module logger at debug level before returning so operators can trace missing
OTLP resource/scope metadata without changing behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e3355b91-6230-4e5a-9f96-09fb91a72661

📥 Commits

Reviewing files that changed from the base of the PR and between b580c0b and eb84b31.

📒 Files selected for processing (3)
  • src/flb_input_metric.c
  • tests/internal/CMakeLists.txt
  • tests/internal/input_metric.c
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/internal/CMakeLists.txt
