kola/harness: fix rerun success logic#4559
Conversation
When tests pass on rerun but lack allow-rerun-success tags, preserve the original error. When rerun fails, use the rerun error. Only reset to success when rerun passes AND tests have required tags. Issue: coreos#4546
There was a problem hiding this comment.
Code Review
This pull request refactors the test rerun logic in mantle/kola/harness.go to explicitly handle four scenarios based on rerun results and success tags. The feedback suggests adding a log message for the scenario where a rerun passes but the original failure is preserved due to missing tags, which helps avoid user confusion when seeing a 'PASS' followed by an overall 'FAIL'.
| if reRunErr != nil { | ||
| // Scenarios 2 & 4: Rerun failed - use the rerun error regardless of tags | ||
| runErr = reRunErr | ||
| } else if allTestsAllowRerunSuccess(testsToRerun, rerunSuccessTags) { | ||
| // Scenario 1: Rerun passed AND tests have tags to allow rerun success | ||
| runErr = nil // reset to success since all tests allowed rerun success | ||
| numFailedTests = 0 // zero out the tally of failed tests | ||
| } else { | ||
| runErr = reRunErr | ||
| } | ||
|
|
||
| // else: Scenario 3 - keep original runErr (test passed on rerun but doesn't allow rerun success) |
There was a problem hiding this comment.
When Scenario 3 occurs (tests pass on rerun but lack the necessary tags), the output might be confusing to users because they will see a PASS for the rerun followed by an overall FAIL for the test suite. Adding an explicit log message here would clarify why the original failure is being preserved despite the successful rerun.
if reRunErr != nil {
// Scenarios 2 & 4: Rerun failed - use the rerun error regardless of tags
runErr = reRunErr
} else if allTestsAllowRerunSuccess(testsToRerun, rerunSuccessTags) {
// Scenario 1: Rerun passed AND tests have tags to allow rerun success
runErr = nil // reset to success since all tests allowed rerun success
numFailedTests = 0 // zero out the tally of failed tests
} else {
// Scenario 3: Rerun passed but tags don't allow success.
fmt.Printf("Tests passed on rerun but lack tags to allow rerun success; original failure stands.\n")
}| runErr = reRunErr | ||
| } | ||
|
|
||
| // else: Scenario 3 - keep original runErr (test passed on rerun but doesn't allow rerun success) |
There was a problem hiding this comment.
Why would we want this behaviour ? That feels like a waste to throw an error if a rerun worked.
This scenario would likely just end up in someone notice it and go add the rerun tag.
There was a problem hiding this comment.
That's how this feature is supposed to work: return SUCCESS only when the tagged test(s) pass on rerun. When this feature was added, mostly the needs-internet tests were flaky, so this tag is hardcoded in the pipeline. But we do rerun any failed test, as far as I remember, to gather some statistics on which other tests and tags are also flaky. If they succeed on rerun, we would otherwise not be notified about failures from the first run.
When tests pass on rerun but lack allow-rerun-success tags, preserve the original error. When rerun fails, use the rerun error. Only reset to success when rerun passes AND tests have required tags.
Issue: #4546