Use linear search within unique label validation #887

Open
geekswaroop wants to merge 1 commit into prometheus:main from geekswaroop:startLabel-linear-search

geekswaroop commented Mar 15, 2026

@roidelapluie @gotjosh

Unique label validation was added in #263.

A couple of our Prometheus scraper services were spending a lot of time in this function, mostly on map rehashing and growth. Profiling showed that the map used in the unique label validation accounts for most of this CPU usage.

The duplicate label check in startLabelName allocates a new map[string]struct{} on every call, and startLabelName is called once per label, not once per metric line. For a metric with N labels, this creates N throwaway maps, resulting in O(N²) map operations per metric line.

This PR replaces the map-based check with a linear scan of currentLabelPairs before appending. Because the check runs incrementally on each label addition, the labels already in the slice are known to be unique, so only the new label needs to be compared against the previous ones. This preserves the same early-return error behavior.
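The idea can be sketched in isolation as follows. This is an illustration, not the diff in this PR: `labelPair` and `appendUnique` are hypothetical stand-ins for the parser's label type and for the check performed before appending to currentLabelPairs.

```go
package main

import "fmt"

// labelPair is a hypothetical stand-in for the parser's label type.
type labelPair struct {
	Name  string
	Value string
}

// appendUnique scans the existing pairs for name and appends a new pair only
// if no earlier pair uses that name. The existing pairs are assumed to be
// unique already, because every earlier append went through the same check.
func appendUnique(pairs []labelPair, name string) ([]labelPair, error) {
	for _, p := range pairs {
		if p.Name == name {
			// Early return, mirroring the error-on-first-duplicate behavior.
			return pairs, fmt.Errorf("duplicate label name %q", name)
		}
	}
	return append(pairs, labelPair{Name: name}), nil
}

func main() {
	var pairs []labelPair
	var err error
	for _, name := range []string{"job", "instance", "job"} {
		pairs, err = appendUnique(pairs, name)
		if err != nil {
			fmt.Println("error:", err)
			break
		}
	}
	fmt.Println("labels kept:", len(pairs))
}
```

The scan is still O(N) per label, so O(N²) string comparisons per line in the worst case, but it needs no hashing and no per-call map, which is why it tends to win for the small label counts typical of a metric line.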

Benchmark

The current test data here has 2-3 labels, and no regression was observed. I added a separate benchmark with 7+ labels (representative of real-world metrics emitted), and I see the following.

% benchstat many_labels_before.txt many_labels_after.txt
goos: linux
goarch: amd64
pkg: github.com/prometheus/common/expfmt
cpu: AMD EPYC 7B13
                      │ many_labels_before.txt │        many_labels_after.txt        │
                      │         sec/op         │   sec/op     vs base                │
ParseTextManyLabels-48              671.0µ ± 2%   535.3µ ± 2%  -20.23% (p=0.000 n=10)

                      │ many_labels_before.txt │     many_labels_after.txt      │
                      │          B/op          │     B/op      vs base          │
ParseTextManyLabels-48             219.5Ki ± 0%   219.5Ki ± 0%  ~ (p=0.209 n=10)

                      │ many_labels_before.txt │      many_labels_after.txt      │
                      │       allocs/op        │  allocs/op   vs base            │
ParseTextManyLabels-48              8.276k ± 0%   8.276k ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal
