[DX-3716] Fixes Flaky VRFv2 Tests + CI Core Timeouts #22324

Open
kalverra wants to merge 1 commit into develop from fixVRFv2Flakes


@kalverra kalverra commented May 6, 2026

Results of using the chainlink-test-diagnosis skill on the core/services/vrf/v2/ package.

I primarily used Claude with sonnet-4.6 at high effort. It took 2 sessions totaling ~1 hour (not including diagnose runtime) and ~500k tokens, costing approximately $7.50.

Initial Findings

go -C ./tools/test run . diagnose --iterations 100 --parallel-iterations 5 -- --timeout 10m ./core/services/vrf/v2/...

20% Flake Rate

Flaky Tests Identified & Fixed

I verified that these tests no longer flake across 200 iterations. The nasty CI Core timeouts I discovered earlier also did not recur.

  • TestMaliciousConsumer
  • TestStartHeartbeats/bhs_feeder_startheartbeats_happy_path
  • TestVRFV2Integration_CanceledSubForceFulfillmentRevertedTxn_Retry
  • TestVRFV2Integration_ReplayOldRequestsOnStartUp | CRE-2332
  • TestVRFV2Integration_SingleConsumer_BlockHeaderFeeder
  • TestVRFV2Integration_SingleConsumer_HappyPath_BatchFulfillment | DX-1745
  • TestVRFV2Integration_SingleConsumer_NeedsTrustedBlockhashStore
  • TestVRFV2PlusIntegration_Migration
  • TestVRFV2PlusIntegration_ReplayOldRequestsOnStartUp
  • TestVRFV2PlusIntegration_SingleConsumer_EIP150_Revert
  • TestVRFV2PlusIntegration_SingleConsumer_HappyPath_BatchFulfillment/link_payment | CRE-3205
  • TestVRFV2PlusIntegration_SingleConsumer_HappyPath_BatchFulfillment/native_payment | CRE-4094
  • TestVRFV2PlusIntegration_SingleConsumer_NeedsBlockhashStore/link_payment

What Was the Fix?

  • Increasing timeouts for some tests. This makes them a little slower, but slower is better than flaky.
  • Many of the tests used Filter functions to inspect on-chain events. The Filter function is async and can hit odd race conditions when used improperly, as many of the tests did. We either switched to more deterministic ways of examining on-chain events (transaction receipts) or added retry loops.
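
To illustrate the retry-loop approach, here is a minimal sketch, not the code from this PR. It assumes the racy event query can be wrapped in a function that reports whether the event is visible yet. The names pollUntil, eventVisible, and demo are hypothetical, and the stand-in check only simulates an event that appears on the third poll.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollUntil calls check immediately and then once per interval until check
// reports true, check returns an error, or ctx is done.
func pollUntil(ctx context.Context, interval time.Duration, check func() (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		done, err := check()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("condition not met before deadline: %w", ctx.Err())
		case <-ticker.C:
		}
	}
}

// demo simulates an event that only becomes visible on the third query and
// returns how many polls were needed. It is a hypothetical stand-in for a
// real event filter call.
func demo() (int, error) {
	calls := 0
	eventVisible := func() (bool, error) {
		calls++
		return calls >= 3, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err := pollUntil(ctx, 10*time.Millisecond, eventVisible)
	return calls, err
}

func main() {
	polls, err := demo()
	if err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println("event found after", polls, "polls")
}
```

Bounding the loop with a context deadline means a genuinely missing event still fails the test with a clear message instead of hanging until the package-level timeout.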

Why Don't We Have Flaky Test Tickets for All of These?

  • Timeouts: This package has a nasty, intermittent timeout bug that can mask errors from other tests, hiding their failures from our Trunk.io monitoring system.
  • Sample Size: We run the CI Core pipeline ~20-50 times a day. For my exploration and verification, I ran the tests 100+ times per round. We're bound to catch more rare flakes this way.
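
The sample-size point can be made concrete with a rough sketch. It assumes every run is independent and that a flaky test fails with a fixed per-run probability. The 1% rate below is a hypothetical value chosen for illustration, not a measurement from this package, and detectionProbability is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"math"
)

// detectionProbability returns the chance of seeing at least one failure in
// `runs` independent runs when each run fails with probability flakeRate.
func detectionProbability(flakeRate float64, runs int) float64 {
	return 1 - math.Pow(1-flakeRate, float64(runs))
}

func main() {
	const flakeRate = 0.01 // hypothetical 1% flake rate, for illustration only
	for _, runs := range []int{20, 50, 100, 200} {
		fmt.Printf("runs=%3d  P(at least one failure)=%.2f\n",
			runs, detectionProbability(flakeRate, runs))
	}
}
```

Under these assumptions, a rare flake is much more likely to show up in a batch of 100+ runs than in a single day of 20-50 CI runs.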

Copilot AI review requested due to automatic review settings May 6, 2026 15:35
@kalverra kalverra requested review from a team as code owners May 6, 2026 15:35

github-actions Bot commented May 6, 2026

👋 kalverra, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!


github-actions Bot commented May 6, 2026

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in its text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.


github-actions Bot commented May 6, 2026

✅ No conflicts with other open PRs targeting develop


trunk-io Bot commented May 6, 2026

@kalverra kalverra requested review from Fletch153, jmank88 and mchain0 May 6, 2026 17:02
@kalverra kalverra enabled auto-merge May 6, 2026 17:28
@kalverra kalverra requested review from Copilot and removed request for Copilot May 6, 2026 17:29