Guard against late-arriving polls after worker shutdown #9330

Draft
rkannan82 wants to merge 5 commits into main from kannan/shutdown-worker-poll-guard


@rkannan82 rkannan82 commented Feb 13, 2026

What changed?

When CancelOutstandingWorkerPolls is called, the WorkerInstanceKey is stored in a TTL cache (70s default). Any subsequent poll that arrives with this key returns an empty response immediately, which prevents task dispatch to a shutting-down worker.
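
For illustration only, the guard described above can be sketched in Go as a small TTL map keyed by worker instance key. This is a minimal sketch under stated assumptions, not the code in this PR: `shutdownWorkerGuard`, `MarkShutdown`, `IsShutdown`, `poll`, and `fakeClock` are hypothetical names, the real change uses the server's own cache and poll handlers, and the empty string stands in for an empty poll response.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultShutdownTTL mirrors the 70s default quoted in the description.
const defaultShutdownTTL = 70 * time.Second

// fakeClock is a manually advanced clock so the example is deterministic.
type fakeClock struct{ t time.Time }

func newFakeClock() *fakeClock                { return &fakeClock{t: time.Unix(0, 0)} }
func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// shutdownWorkerGuard is a hypothetical stand-in for the TTL cache.
type shutdownWorkerGuard struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time
	expires map[string]time.Time // worker instance key -> expiry time
}

func newShutdownWorkerGuard(ttl time.Duration, now func() time.Time) *shutdownWorkerGuard {
	return &shutdownWorkerGuard{ttl: ttl, now: now, expires: make(map[string]time.Time)}
}

// MarkShutdown records the key, as the cancellation path would.
func (g *shutdownWorkerGuard) MarkShutdown(key string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.expires[key] = g.now().Add(g.ttl)
}

// IsShutdown reports whether the key was marked within the TTL.
// Expired entries are dropped lazily on lookup.
func (g *shutdownWorkerGuard) IsShutdown(key string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	exp, ok := g.expires[key]
	if !ok {
		return false
	}
	if !g.now().Before(exp) {
		delete(g.expires, key)
		return false
	}
	return true
}

// poll checks the guard before any dispatch work. It returns "" (an empty
// response) for a shut-down worker and a placeholder task otherwise.
func poll(g *shutdownWorkerGuard, key string) string {
	if g.IsShutdown(key) {
		return ""
	}
	return "task"
}

func main() {
	clock := newFakeClock()
	g := newShutdownWorkerGuard(defaultShutdownTTL, clock.Now)

	g.MarkShutdown("worker-a")
	fmt.Printf("late poll: %q\n", poll(g, "worker-a")) // guarded while the entry is live

	clock.Advance(defaultShutdownTTL)
	fmt.Printf("after TTL: %q\n", poll(g, "worker-a")) // entry expired, poll proceeds
}
```

Placing the check at the top of the poll path keeps the late poll from ever being registered as a waiting poller, which is the behavior the description asks for.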

Why?

This handles the edge case where a poll request was already in flight (sent by the SDK) when ShutdownWorker was called and reaches the server after the cancellation logic has completed. Without this guard, such a poll could receive tasks that would never be processed.

How did you test it?

  • built
  • covered by existing tests
  • added new unit test(s)

Potential risks

  • Memory usage: The cache stores up to 50K entries (~10 MB, based on ~200 bytes per entry). The TTL is 70s (long poll timeout + buffer). When the cache is full, LRU eviction removes the least recently used entries first.
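
The memory bound and the eviction behavior can be checked with a short Go sketch. It is an illustration under assumptions, not the cache used in this PR: `boundedKeys` and `estimatedBytes` are hypothetical names, TTL handling is omitted, and the 200-byte entry size is the rough estimate quoted above.

```go
package main

import (
	"container/list"
	"fmt"
)

const (
	maxEntries    = 50_000 // capacity quoted in the description
	bytesPerEntry = 200    // rough per-entry size quoted in the description
)

// estimatedBytes returns the worst-case footprint of a full cache.
func estimatedBytes(entries, perEntry int) int { return entries * perEntry }

// boundedKeys is a hypothetical LRU set of worker instance keys.
type boundedKeys struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newBoundedKeys(capacity int) *boundedKeys {
	if capacity < 1 {
		capacity = 1 // keep the sketch well-defined for tiny capacities
	}
	return &boundedKeys{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Add inserts key and returns the evicted key, or "" if nothing was evicted.
func (b *boundedKeys) Add(key string) (evicted string) {
	if el, ok := b.items[key]; ok {
		b.order.MoveToFront(el) // re-adding refreshes recency
		return ""
	}
	if b.order.Len() >= b.capacity {
		oldest := b.order.Back()
		evicted = oldest.Value.(string)
		b.order.Remove(oldest)
		delete(b.items, evicted)
	}
	b.items[key] = b.order.PushFront(key)
	return evicted
}

// Contains reports membership and refreshes the key's recency.
func (b *boundedKeys) Contains(key string) bool {
	el, ok := b.items[key]
	if ok {
		b.order.MoveToFront(el)
	}
	return ok
}

func main() {
	// 50,000 entries * 200 bytes = 10,000,000 bytes, i.e. about 10 MB.
	fmt.Println(estimatedBytes(maxEntries, bytesPerEntry))

	keys := newBoundedKeys(2)
	keys.Add("worker-a")
	keys.Add("worker-b")
	keys.Contains("worker-a")         // worker-a becomes most recently used
	fmt.Println(keys.Add("worker-c")) // evicts worker-b
}
```

One consequence of a hard cap is that an evicted key loses its guard before the TTL elapses, so a late poll for that worker would no longer be short-circuited.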

When CancelOutstandingWorkerPolls is called, the WorkerInstanceKey is
cached in a TTL cache (60s default). Any subsequent poll arriving with
this key returns empty immediately, preventing task dispatch to a
shutting-down worker.

This handles the edge case where a poll request was in-flight (already
sent by SDK) when ShutdownWorker was called, arriving at the server
after the cancellation logic has completed.

- Add ShutdownWorkerCacheTTL dynamic config (60s default)
- Add shutdownWorkers TTL cache to matchingEngineImpl
- Check cache early in PollWorkflowTaskQueue/PollActivityTaskQueue
- Add unit tests for cache behavior
@rkannan82 rkannan82 requested review from a team as code owners February 13, 2026 23:18
@rkannan82 rkannan82 marked this pull request as draft February 13, 2026 23:26
@rkannan82 rkannan82 force-pushed the kannan/shutdown-worker-poll-guard branch from 441d5b6 to 5c1df05 Compare February 13, 2026 23:27