Add proposal for per-tenant cardinality API #7335
Open
CharlieTLe wants to merge 6 commits into cortexproject:master
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Charlie Le <charlie_le@apple.com>
yeya24
reviewed
Mar 9, 2026
message TSDBStatusResponse {
  uint64 num_series = 1;
  int64 min_time = 2;
  int64 max_time = 3;
Contributor
Do we need min/max times? How do we aggregate them in the final response: min(min_t) and max(max_t)?
| `labelValueCountByLabelName` | No | Portable to block storage |
| `seriesCountByLabelValuePair` | No | Portable to block storage |
| `memoryInBytesByLabelName` | **Yes** | In-memory byte usage has no analogue in object storage |
| `minTime` / `maxTime` | **Yes** | Reflects head time range, not total storage |
Contributor
Do we need to add those head specific fields?
…ore gateways
Add source=blocks query parameter to analyze cardinality from compacted blocks in object storage. The blocks path fans out to store gateways, which compute statistics from block index headers (cheap label value counts) and posting list expansion (exact series counts per metric). Results are cached per immutable block.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

…plify
Address feedback from PR cortexproject#7335 review:
- Rename endpoint from /api/v1/status/tsdb to /api/v1/cardinality
- Drop Prometheus compatibility as a goal
- Add start/end time range query parameters
- Drop head-specific fields (numLabelPairs, memoryInBytesByLabelName, minTime, maxTime) to unify response across both sources
- Remove API Compatibility and Field Portability sections
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

…limit
Make start/end required for source=blocks to prevent unbounded block scanning. Add cardinality_max_query_range per-tenant limit (default 24h) to give operators control over the blast radius.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Critical:
- Fix blocks path aggregation: no SG RF division since GetClientsFor
routes each block to exactly one store gateway
Significant:
- Add min_time, max_time, block_ids to store gateway CardinalityRequest
- Specify MaxErrors=0 for head path with availability implications
- Add consistency check and retry logic for blocks path
- Document RF division as best-effort approximation
Moderate:
- Wrap responses in standard {status, data} Prometheus envelope
- Change HTTP 422 to HTTP 400 for limit violations
- Add Error Responses section with all validation scenarios
- Add approximated field for block overlap and partial results
- Add Observability section with metrics
- Add per-tenant concurrency limit and query timeout
- Reject start/end for source=head instead of silently ignoring
Low:
- Add Rollout Plan with phased approach and feature flag
- Document rolling upgrade compatibility (Unimplemented handling)
- Document Query Frontend bypass
- Improve caching: full results keyed by ULID, limit at response time
- Add missing files to implementation section
- Move shared proto to pkg/cortexpb/cardinality.proto
- Rename TSDBStatus* to Cardinality* throughout
- Add limit upper bound (max 512)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
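A minimal sketch of the two aggregation modes named above, assuming per-source results arrive as maps from metric name to series count. `mergeCounts` and its map-based input are hypothetical, not taken from the proposal.

```go
package main

import "fmt"

// mergeCounts is a hypothetical helper (not from the PR). It sums series
// counts reported by each source and divides by divisor.
//   - Head path: divisor is the ingester replication factor, since each
//     series is expected to be held by that many ingesters.
//   - Blocks path: divisor is 1, since each block is routed to exactly one
//     store gateway and is therefore counted once.
func mergeCounts(perSource []map[string]uint64, divisor uint64) map[string]uint64 {
	merged := make(map[string]uint64)
	for _, counts := range perSource {
		for name, n := range counts {
			merged[name] += n
		}
	}
	if divisor > 1 {
		for name, n := range merged {
			// Integer division: this is a best-effort approximation.
			merged[name] = n / divisor
		}
	}
	return merged
}

func main() {
	ingesters := []map[string]uint64{
		{"http_requests_total": 100},
		{"http_requests_total": 100},
		{"http_requests_total": 100},
	}
	fmt.Println(mergeCounts(ingesters, 3)) // head path, RF=3

	gateways := []map[string]uint64{
		{"http_requests_total": 40},
		{"http_requests_total": 60},
	}
	fmt.Println(mergeCounts(gateways, 1)) // blocks path, no division
}
```

The division is only exact when every series is present on exactly RF ingesters. If a replica is missing some series, the sum falls short and the quotient undercounts, which is presumably why the commit calls it a best-effort approximation.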
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Charlie Le <charlie_le@apple.com>
friedrichg
approved these changes
Mar 16, 2026
Member
friedrichg
left a comment
Looks like a great feature. Thanks for working on this!
- Zero read-time cost: statistics are available immediately from block metadata.
- The compactor already reads the full block index during compaction and validation (`GatherIndexHealthStats`).

**Cons:**
Member
I think one extra con here is that we would need to align with block ranges.
Summary
Proposal for a per-tenant cardinality API (`GET /api/v1/cardinality`) that exposes cardinality statistics (top metrics by series count, top labels by value count, top label-value pairs by series count) across two data sources:

- `source=head`: Fans out to ingesters via the distributor, aggregates TSDB head stats with RF-based deduplication.
- `source=blocks`: Fans out to store gateways via `BlocksFinder` + `GetClientsFor`, computes cardinality from block indexes with per-block caching.

Key design points:

- `start`/`end` required for blocks path, rejected for head path (head cannot sub-filter)
- `cardinality_api_enabled`, `cardinality_max_query_range`, `cardinality_max_concurrent_requests`, `cardinality_query_timeout`
- `{status, data}` Prometheus response envelope with `approximated` field for block overlap / partial results

🤖 Generated with Claude Code