[addon-operator] add queue head info metric and critical flag to module info #771
Draft
diyliv wants to merge 2 commits into
2b91642 to 14ed834
Signed-off-by: diyliv <onlogn081@gmail.com>
Signed-off-by: diyliv <onlogn081@gmail.com>
14ed834 to 580232f
What this PR does

Adds two metrics that let us replace the flat `D8DeckhouseQueueIsHung` alert with severity-differentiated alerts.

New metric: `tasks_queue_head_info`

A gauge (value=1) with labels `queue`, `module`, `task_type`, `hook`. Published every 5 seconds for each non-empty queue. Old series are expired when the head changes, so no phantom metrics remain.

Label cleanup: synthetic `ParallelModuleRun` names like "Parallel run for a, b, c" are normalized to an empty string; they would otherwise produce a bad join with `deckhouse_mm_module_info`.

New label: `critical` on `deckhouse_mm_module_info`

Value `"true"` or `"false"` from `BasicModule.GetCritical()` (the `critical: true` property in `module.yaml`). Added additively, so existing queries are unaffected.

Why it's needed

The old `D8DeckhouseQueueIsHung` alert had two problems:

With these two metrics, we can create three separate alerts:

- `D8DeckhouseQueueIsHungCritical`: `critical="true"` modules
- `D8DeckhouseQueueIsHung`: `critical="false"` modules
- `D8DeckhouseQueueIsHungGlobal`: `module=""`
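
For reviewers, here is a minimal sketch of the head-series expiry and module-name normalization described above. It is not the code from this PR. It uses only the standard library, with a plain map standing in for the real gauge vector. The names `headLabels`, `headInfoGauge`, `normalizeModule`, `Publish`, and `Clear` are hypothetical. The `"Parallel run for "` prefix check is an assumption taken from the example name in the description, and dropping the series when a queue becomes empty is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// headLabels mirrors the label set described for tasks_queue_head_info.
// The struct itself is hypothetical and exists only for this sketch.
type headLabels struct {
	Queue, Module, TaskType, Hook string
}

// parallelRunPrefix is an assumption based on the example name
// "Parallel run for a, b, c" given in the PR description.
const parallelRunPrefix = "Parallel run for "

// normalizeModule maps synthetic ParallelModuleRun names to the empty
// string so they cannot be joined against deckhouse_mm_module_info.
func normalizeModule(name string) string {
	if strings.HasPrefix(name, parallelRunPrefix) {
		return ""
	}
	return name
}

// headInfoGauge is a stand-in for a gauge vector: one series per label set.
type headInfoGauge struct {
	series map[headLabels]float64 // currently exported series
	last   map[string]headLabels  // last published head, keyed by queue
}

func newHeadInfoGauge() *headInfoGauge {
	return &headInfoGauge{
		series: make(map[headLabels]float64),
		last:   make(map[string]headLabels),
	}
}

// Publish records the current head of a queue. If the head differs from
// the previously published one, the old series is deleted first, so at
// most one series exists per queue.
func (g *headInfoGauge) Publish(h headLabels) {
	h.Module = normalizeModule(h.Module)
	if prev, ok := g.last[h.Queue]; ok && prev != h {
		delete(g.series, prev)
	}
	g.series[h] = 1
	g.last[h.Queue] = h
}

// Clear drops the series of a queue that has become empty
// (assumed behavior, not confirmed by the description).
func (g *headInfoGauge) Clear(queue string) {
	if prev, ok := g.last[queue]; ok {
		delete(g.series, prev)
		delete(g.last, queue)
	}
}

func main() {
	g := newHeadInfoGauge()
	// Illustrative values only; the module name is made up.
	g.Publish(headLabels{Queue: "main", Module: "Parallel run for a, b, c", TaskType: "ParallelModuleRun"})
	g.Publish(headLabels{Queue: "main", Module: "example-module", TaskType: "ModuleRun"})
	for labels, value := range g.series {
		fmt.Printf("%+v %v\n", labels, value)
	}
}
```

In this sketch a periodic publisher (every 5 seconds per the description) would call `Publish` for each non-empty queue. Because the previous label set is remembered per queue, a changed head replaces its series instead of leaving a stale one behind.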