fix(posixage/flock): close ghost-lock race via inode verify + heartbeat#529
Merged
Conversation
When a stale lock file is recovered, the unlinking process leaves any existing flock holder pointing at an orphaned inode while the next caller flocks a fresh inode created at the same path. Both callers then believe they hold exclusive access. recoverStaleLock also lost the race between its Stat and Remove calls, surfacing a spurious lock-acquisition failure when another caller had already unlinked the file. acquireOnce now opens, flocks, compares the locked fd's inode to the inode currently at the lock-file path (via os.SameFile), and only then truncates to refresh modtime. On mismatch it releases the lock and returns errStaleInode so the retry loop picks up a fresh fd against whatever file is now at the path. retryLock loops acquireOnce instead of retrying flock against a stale fd. recoverStaleLock is now path-based and treats ENOENT on Stat or Remove as success, closing the stat/remove TOCTOU window. This does not protect a long-running holder whose modtime ages past 30s without a heartbeat — that race is documented in skipped tests and would need a holder-side modtime refresh to fully close. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Assessment: 🔴 CRITICAL
Note: Verifier returned an empty response (anti-loop rule applied). Findings below are from the drafter, manually reviewed against source. High-confidence issues only are reported.
This PR correctly fixes the core ghost-lock race on the acquiring side using /. However, there is a residual TOCTOU window in itself and a silently swallowed truncate error that can re-expose a valid lock to stale-recovery removal.
Close the holder-side half of the ghost-lock race. The prior commit verified the locked inode at acquisition time, which protected new acquirers but left long-running holders vulnerable: any caller whose critical section exceeded the 30s stale threshold would be hijacked when another process unlinked the lock file and created a fresh inode at the same path. A goroutine started by tryLock now re-truncates the lock file every 10s for the duration of the hold, keeping its modtime young enough that concurrent recovery callers never misclassify it as abandoned. Each tick also re-verifies that the fd's inode still matches the path; a mismatch indicates the heartbeat itself was starved past the stale threshold and we have been hijacked, which is logged via log/slog. The UnlockFunc stops the goroutine and waits for it to exit before closing the fd, so the heartbeat can never operate on a closed descriptor. Also swap the order of truncate and inode verification in acquireOnce so the modtime refresh is committed before the inode check, narrowing the window between a passing check and a concurrent unlink. Tests: - Drop both t.Skip() placeholder tests; the limitations they documented are now fixed by the heartbeat. - Drop the "two waiters cannot both acquire after recovery" test; with a live raw-flock holder it never exercised the ghost-lock code path, so it gave false confidence. - Add a heartbeat smoke test that shortens both intervals via the package-private vars and verifies recoverStaleLock refuses to recover a heartbeating holder after several stale-thresholds of wall time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document that the returned UnlockFunc owns both the file descriptor and the background heartbeat goroutine, and that callers must invoke it exactly once (typically via defer) to avoid leaking either. Add idiomatic examples on TryLock and TryRLock, and note the heartbeat-stop-before- close ordering on UnlockFunc itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace log/slog with the project's logging.Logger interface so the library no longer writes to a default sink. The heartbeat goroutine resolves its logger via logging.FromContext, falling back to a noop when the caller hasn't set one. posixage's fileStore plumbs its configured logger through with logging.WithLogger so the existing WithLogger option on store.New propagates transparently without any new parameters on TryLock/TryRLock. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joe0BAB
previously approved these changes
May 28, 2026
Signed-off-by: Alano Terblanche <18033717+Benehiko@users.noreply.github.com>
joe0BAB
approved these changes
May 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #530.
Summary
The
posixageflock layer had a race where two callers could simultaneously believe they held the exclusive lock. Stale-recovery unlinked the lock file while another process still heldflock(2)on the now-orphaned inode, then created a fresh inode at the same path and flocked that.The race has two halves:
This PR closes both halves and fixes a TOCTOU in
recoverStaleLockbetween itsStatandRemovecalls. There are now two commits:flock, verify (viaos.SameFile) that the locked descriptor still refers to the file at the lock-file path. If not, drop the bad lock and retry. This closes the acquirer-side half and the recovery TOCTOU.tryLockre-truncates the lock file every 10s and re-verifies the locked inode each tick. This closes the holder-side half by keeping the lock file young enough that concurrent recovery callers never see it as stale.The bug
recoverStaleLockdeletes the lock file based purely on its modification time being >30s old:flockon inode₁ (legitimate, slow operation).root.Remove(lockFileName). On Unix this unlinks the dirent but keeps the inode alive for any open fd — A's flock is still kernel-valid.The 30s threshold was always a kludge:
flockis automatically released by the kernel on process death, so a held flock by definition belongs to a live holder. Removing the file does nothing to that holder; it only creates the inode-swap hazard.A secondary bug lived in the recovery TOCTOU: two callers reaching
recoverStaleLockconcurrently both saw a stale modtime, both calledRemove, the loser gotENOENT, andtryLocksurfaced that as a hard lock-acquisition failure even though recovery had succeeded.Fix part 1 — inode verification on acquire
acquireOnce(root, exclusive)runs the full open → flock → truncate → verify sequence atomically:isCurrentLockFilereturnsfalsewhen the path resolves to a different inode than the locked fd, or when the path no longer exists. Either condition means a concurrent recovery unlinked the file between ouropenand ourflock— we drop the bad lock and let the retry loop start over against whatever is now at the path.retryLocknow loopsacquireOnceinstead of retryingflockagainst a stale fd, so each retry gets a fresh descriptor.recoverStaleLocklost its*os.Fileparameter (it now works purely by path) and treatsENOENTon bothStatandRemoveas success — closing the TOCTOU window without a mutex.To make the release-without-close step possible,
releaseLockwas split out fromunlockFileon both Unix and Windows.unlockFilestill releases + closes, so external callers' contracts are unchanged.Fix part 2 — modtime heartbeat for holders
tryLocknow starts a goroutine on successful acquisition that re-truncates the lock file everyheartbeatInterval(10s) until unlock. Each tick also re-verifies the locked descriptor's inode against the path; a mismatch indicates the heartbeat itself was starved paststaleThresholdand the lock has been hijacked, which is logged vialog/slog.The
UnlockFuncreturned to the caller cancels the goroutine's context and waits for it to exit before closing the fd, so the heartbeat can never operate on a closed descriptor.heartbeatInterval(10s) andstaleThreshold(30s) are package-privatevars rather than consts so tests can shorten them. Production callers cannot tune them; the 3× safety ratio is fixed.Hijack detection without in-band failure
When the heartbeat detects an inode mismatch, the holder's outstanding operation has no way to be cancelled in-band —
UnlockFuncis the only callback we hold. The mismatch is therefore logged but the holder keeps running. This is a deliberate trade-off: the heartbeat fires every 10s vs. a 30s threshold, so on a non-pathological system the hijack window is effectively closed. The log line is the safety net for SIGSTOP / extreme GC pause / scheduler starvation.Test plan
go test ./store/posixage/... -race -count=1passesgo vet ./store/posixage/...cleangolangci-lint run ./store/posixage/...reports 0 issuesflock_race_test.gocovers:isCurrentLockFilecorrectly identifies an unlinked pathisCurrentLockFilecorrectly identifies inode replacementrecoverStaleLockcalls both return success (TOCTOU fix)TestFlockandTestRecoverLocksuites still passWhat was dropped during review
t.Skip()placeholder tests that documented the now-fixed holder-side limitations.flockon the unlinked orphan always returnsEWOULDBLOCK, so the test never exercised the ghost-lock path it claimed to cover. It gave false confidence and is removed rather than left as theatre.🤖 Generated with Claude Code