Fix a task watching cancellation bug and a task fingerprinting bug #2740
Open
mkeeler wants to merge 2 commits into go-task:main
added 2 commits
March 12, 2026 12:38
…he watched directory
Previously, if a generated file was placed alongside the source file, the fsnotify event would be seen and trigger cancellation of all outstanding task runs. If the task was executing a for loop over its sources and calling other tasks, this would prevent the child tasks from being executed. The fix moves the filtering of ignorable fsnotify events earlier, so the context is cancelled only when we know the tasks need to be executed again anyway.
Task Fingerprinting Bug
The first commit in this PR fixes a bug where two invocations of the same task (such as in a for loop) were inadvertently writing the task's checksum or timestamp files to the same location, even though the invocations were executed with different arguments and therefore had different sources.
Reproduction
1. echo 1 >1.in && echo 2 >2.in
2. Run task copy. This runs copy:single once for each *.in file.
3. echo 2.2 > 2.in
4. Run the copy task again. It runs the copy:single task twice again, with neither showing as up to date.
Because only 2.in was changed, I was expecting the task to show one copy:single task as up to date and then re-copy 2.in to 2.out.
Fix
Instead of writing out the checksum/timestamps to a single file within the respective directory, the task is first fingerprinted. So instead of the copy:single task here recording the checksum/timestamp in a single copy-single file, it will take a hash of the normalized task name, the working directory of the task, and the declared sources/generates, and store the checksum in copy-single-<hash>. This allows each distinct invocation of the sub-task with different arguments to independently manage whether it is up to date.
Task Watch Cancellation Bug
The pre-existing task watching code had a bug: once an event occurred, it would spawn goroutines to process all tasks in the background and continue the loop. If another event occurred, it would cancel the context used to run those previous goroutines and restart everything. In some scenarios this works fine, such as when the generated files do not reside within the directory being watched. When the generated files do reside in the same directory, the first task generating its output triggers an fsnotify event, which then cancels the context. This is racy, but if the tasks are longer running it can eventually cancel the task, resulting in other sub-tasks not being executed. This doesn't result in an infinite loop because, prior to executing the task, the fingerprint is checked and updated to prevent subsequent runs.
The root cause of all of the bad behavior of not running the tasks to completion is that the context is cancelled when it shouldn't be (an fsnotify event comes in for something that is not one of the sources).
Reproduction
1. echo 1 >1.in && echo 2 >2.in
2. Run task -w copy. This runs copy:single only once. It never gets around to executing the copy for the 2.in file.
3. echo 2.2 > 2.in. This runs copy:single only once again.
I would have expected step 2 to run copy:single twice, but it doesn't because the context is cancelled while the sleep command of the first copy:single invocation is executing. I would also have expected step 3 to cause copying to take place again. With the fix for the fingerprinting bug included, the first invocation should show as up to date and the second one would then run.
Fix
The fix was to move the logic that checks the event against the sources out of the goroutines spawned to execute the tasks and into the place where event handling first starts. Because we check the event's file against the list of sources before the context is cancelled, we can toss out irrelevant events and keep processing of the tasks going.