Skip to content

fix: judge concurrency safety#329

Open
gsonntag wants to merge 4 commits into
hackutd:masterfrom
gsonntag:fix/judge-concurrency-safety
Open

fix: judge concurrency safety#329
gsonntag wants to merge 4 commits into
hackutd:masterfrom
gsonntag:fix/judge-concurrency-safety

Conversation

@gsonntag
Copy link
Copy Markdown
Contributor

Description

Fixes concurrency bugs in judge assignment and scoring.
This updates /judge/next to re-read the judge inside the transaction before assigning a new project, so concurrent requests cannot assign multiple projects from a stale middleware snapshot. It also preserves the existing behavior of returning an already-assigned current project even when judging is paused or deliberation mode is enabled.
This updates /judge/finish to re-read the judge inside the transaction before scoring, so concurrent finish requests cannot duplicate the same project in seen_projects or double-increment project seen counts.
It also preserves Mongo transaction retry behavior by wrapping transaction errors with %w in UpdateAfterSeen, so transient transaction labels survive error wrapping.

Fixes

N/A

Type of Change

Delete options that do not apply:

  • Bug fix (change which fixes an issue)

Is this a breaking change?

  • Yes
  • No

@gsonntag gsonntag changed the title Fix/judge concurrency safety fix: judge concurrency safety Apr 28, 2026
@MichaelZhao21 MichaelZhao21 self-requested a review May 22, 2026 22:25
Comment thread server/router/judge.go
Comment on lines +443 to +445
state.Clock.Mutex.Lock()
running := state.Clock.State.Running
state.Clock.Mutex.Unlock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's necessary to lock the clock mutex here. Because the running state is a boolean, it shouldn't need to be locked for reading.

Comment thread server/router/judge.go
Comment on lines +731 to +735
// IMPORTANT: do NOT call ctx.JSON inside the callback. Mongo's
// session.WithTransaction retries the callback on
// TransientTransactionError (which includes WriteConflict). If we
// wrote the response on a doomed first attempt, a successful retry
// would still leave the client looking at the 500.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we don't need to explain all of this here; perhaps just including the explanation in the PR/commit is good enough and just adding a note to not call ctx.JSON in the callback. The same comment goes for the top of the callback for /judge/next.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants