feat: allow regex patterns in promql/aggregate keep and strip parameters#1690
feat: allow regex patterns in promql/aggregate keep and strip parameters#1690Nepomuk5665 wants to merge 2 commits intocloudflare:mainfrom
Conversation
This change implements issue cloudflare#509, allowing users to specify regex patterns in the keep and strip parameters for the promql/aggregate check. Changes: - Modified AggregationCheck to accept a regex pattern for label matching instead of a literal string - Added extractLiteralLabels helper to extract individual labels from simple alternation patterns like 'job|instance' - Updated config validation to verify regex patterns in keep/strip lists - For simple literal patterns, the behavior is fully backward compatible - For regex patterns, all labels in src.Labels that match the pattern are checked Example usage: aggregate ".+" { keep = ["job|instance"] # Preserve either job or instance labels } Fixes cloudflare#509
| // labelLiteralPattern matches simple label patterns that are either: | ||
| // - A single literal label name (e.g., "job") | ||
| // - An alternation of literal label names (e.g., "job|instance") | ||
| var labelLiteralPattern = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*(\|[a-zA-Z_][a-zA-Z0-9_]*)*$`) |
There was a problem hiding this comment.
I'm not sure if I understand what this is trying to achieve.
This check already allows a list of names, so making values that are regexp which allows multiple values don't add anything.
This:
aggregate ".+" {
strip = ["job|instance"]
}
can already be done using:
aggregate ".+" {
strip = ["job", "instance"]
}
the ticket this references asks for:
aggregate ".+" {
strip = ["job_.+", ".*instance.*"]
}
There was a problem hiding this comment.
Thanks for the feedback! You're absolutely right - the job|instance alternation pattern was redundant since ["job", "instance"] already achieves the same result.
I've updated the implementation to:
- Remove the
extractLiteralLabelsfunction that handled alternation patterns - For true regex patterns like
job_.+or.*instance.*, iterate oversrc.Labelsand match against the regex - this enables the use case from Allow regex to keep for promql/aggregate #509 - For literal label names (no regex metacharacters), maintain backward compatibility by also checking
CanHaveLabelfor labels not explicitly in the query
The new tests demonstrate the regex matching behavior:
keep = ["job_.+"]will catch labels likejob_eu,job_usbeing removedstrip = [".*instance.*"]will catch labels likeenv_instance_idbeing present
Addresses review feedback by removing the redundant extractLiteralLabels function that handled patterns like 'job|instance'. Such alternation patterns don't add value since users can already specify multiple labels as a list ['job', 'instance']. Changes: - Remove extractLiteralLabels and labelLiteralPattern that handled alternation patterns - Add literalLabelName regex to detect single literal label names for backward compatibility with the original CanHaveLabel behavior - For true regex patterns (like 'job_.+' or '.*instance.*'), only match against labels explicitly mentioned in the query - For literal label names, maintain backward compatibility by also checking CanHaveLabel for labels not in src.Labels This enables the feature requested in cloudflare#509: using regex patterns like 'job_.+' to match multiple labels dynamically, while maintaining full backward compatibility for existing literal label configurations.
| // isLiteralLabelName returns true if the pattern is a valid Prometheus label name | ||
| // (contains no regex metacharacters). This is used to determine whether to use | ||
| // the original CanHaveLabel behavior for backward compatibility. | ||
| var literalLabelName = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`) |
There was a problem hiding this comment.
I don't see how this is needed.
But I also feel that it's slightly awkward for me to comment here just so that you can paste that into whatever AI models/tools you're using to generate this PR. I can ask AI myself and cut the middle man out.
|
Hey, I apologize for the confusion and frustration. You're right to call this out. I'm a real person (Nepomuk), but I have been using AI tools to help me find and fix bugs across open source projects. I should have been more upfront about that. Looking at this PR more honestly - I think I misunderstood the issue #509. The original request was for regex matching against label names dynamically, but I'm not sure my implementation actually solves the real use case properly. If you'd prefer, I'm happy to close this PR. Or if you could point me in the right direction for what the implementation should actually look like, I'd be glad to take another crack at it - but I'll make sure I actually understand the problem first before submitting changes. Again, sorry for wasting your time with back-and-forth that wasn't productive. I appreciate you maintaining pint - it's a great tool. |
|
Closing this PR as the implementation doesn't properly address the issue. Thanks for your patience and feedback. If I have a better understanding of what's needed in the future, I may open a new PR. |
Summary
This PR implements issue #509, allowing users to specify regex patterns in the
keepandstripparameters for thepromql/aggregatecheck.Changes
AggregationCheckto accept a regex pattern for label matching instead of requiring literal stringsextractLiteralLabelshelper to extract individual labels from simple alternation patterns likejob|instancesrc.Labelsthat match the pattern are checkedExample Usage
Fixes #509