Fix job execution duration when runner assign time is not set#4472
Fix job execution duration when runner assign time is not set#4472nikola-jokic wants to merge 2 commits intomasterfrom
Conversation
|
Hello! Thank you for your contribution. Please review our contribution guidelines to understand the project's testing and code conventions. |
There was a problem hiding this comment.
Pull request overview
Fixes an edge case in gha_job_execution_duration_seconds where jobs completed before runner assignment could yield an epoch-sized duration.
Changes:
- Add a
RunnerAssignTime.IsZero()check when computing job execution duration. - Record an execution duration of
0when the runner was never assigned.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| var executionDuration int64 | ||
| if !msg.RunnerAssignTime.IsZero() { | ||
| executionDuration = msg.FinishTime.Unix() - msg.RunnerAssignTime.Unix() | ||
| } | ||
|
|
||
| e.observeHistogram(MetricJobExecutionDurationSeconds, l, float64(executionDuration)) |
There was a problem hiding this comment.
This change fixes a production-facing metric edge case, but there are currently no unit tests covering RecordJobCompleted. Please add a test for the zero RunnerAssignTime scenario to ensure we don’t regress back to recording an epoch-sized duration (and to lock in the intended behavior: either skipping the observation or recording a specific value).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if !msg.RunnerAssignTime.IsZero() { | ||
| return | ||
| } | ||
|
|
||
| l := e.completedJobLabels(msg) | ||
| e.incCounter(MetricCompletedJobsTotal, l) |
There was a problem hiding this comment.
The RunnerAssignTime.IsZero() guard is inverted and causes RecordJobCompleted to return for the normal (non-zero) case while still observing the huge epoch-based duration for the zero-value case. Also, because the return happens before completedJobLabels/gha_completed_jobs_total increment, this change would stop counting completed jobs entirely when RunnerAssignTime is set.
Flip the condition to return only when RunnerAssignTime is zero, and move the early return to after incrementing MetricCompletedJobsTotal so canceled-before-assignment jobs are still counted but duration observation is skipped.
| if !msg.RunnerAssignTime.IsZero() { | |
| return | |
| } | |
| l := e.completedJobLabels(msg) | |
| e.incCounter(MetricCompletedJobsTotal, l) | |
| l := e.completedJobLabels(msg) | |
| e.incCounter(MetricCompletedJobsTotal, l) | |
| if msg.RunnerAssignTime.IsZero() { | |
| return | |
| } |
| if !msg.RunnerAssignTime.IsZero() { | ||
| return | ||
| } | ||
|
|
||
| l := e.completedJobLabels(msg) | ||
| e.incCounter(MetricCompletedJobsTotal, l) |
There was a problem hiding this comment.
This change introduces new branching behavior in RecordJobCompleted (skip observation when RunnerAssignTime is unset) but there are no unit tests asserting the expected metric effects. Please add a test that covers both cases:
- non-zero
RunnerAssignTime: incrementsgha_completed_jobs_totaland observes a reasonablegha_job_execution_duration_secondsvalue - zero
RunnerAssignTime: still incrementsgha_completed_jobs_totalbut does not observegha_job_execution_duration_seconds
This will prevent regressions like an accidentally inverted IsZero() check.
| if !msg.RunnerAssignTime.IsZero() { | |
| return | |
| } | |
| l := e.completedJobLabels(msg) | |
| e.incCounter(MetricCompletedJobsTotal, l) | |
| l := e.completedJobLabels(msg) | |
| e.incCounter(MetricCompletedJobsTotal, l) | |
| if msg.RunnerAssignTime.IsZero() { | |
| return | |
| } |
83f6619 to
74bce34
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fixes #4444
Fixes #3731