Reuse fileloader task data when atomic store fails #3191

Open
jakebailey wants to merge 1 commit into main from jabaile/faster-parse-task-allocs

@jakebailey
Member

We use an atomic store to create a file's loader data. If this fails, we throw away the struct and map. This is a waste! We can totally use that for the next file. Stick these into a global sync.Pool, clearing the map (basically free for this small of a map) for later reuse.

The results for the testrunner package:

Summary (flat)

| Metric | Before | After | Reduction |
|---|---|---|---|
| alloc_objects | 4,264,897 | 879,354 | -79% |
| alloc_space | 443 MB | 80.5 MB | -82% |

Line-level breakdown (flat)

| Line | Before (objects) | After (objects) | Before (bytes) | After (bytes) |
|---|---|---|---|---|
| LoadOrStore / struct alloc | 1,054,132 | 0 | 96.5 MB | 0 |
| map[string]*parseTask{...} | 2,162,094 | 0 | 250.5 MB | 0 |
| wg.Queue closure | 1,048,671 | 879,354 | 96 MB | 80.5 MB |

Copilot AI review requested due to automatic review settings March 21, 2026 19:44
Contributor

Copilot AI left a comment

Pull request overview

This PR reduces allocations in the compiler’s file parsing pipeline by reusing parseTaskData instances (and their small tasks maps) when SyncMap.LoadOrStore loses the race, instead of discarding the newly-allocated candidate.

Changes:

  • Added a global sync.Pool for parseTaskData to reuse candidate task data objects on failed LoadOrStore.
  • Updated filesParser.start to allocate candidates from the pool and return them when LoadOrStore indicates an existing entry was already present.

func getParseTaskData(task *parseTask) *parseTaskData {
	td := parseTaskDataPool.Get().(*parseTaskData)
	td.tasks[task.normalizedFilePath] = task
	td.lowestDepth = math.MaxInt

Copilot AI Mar 21, 2026

parseTaskData instances are reused from a sync.Pool, but getParseTaskData/putParseTaskData don't reset all fields. If a pooled instance previously had startedSubTasks=true or a non-empty packageId, and it later becomes the stored value (LoadOrStore returns loaded==false), those stale values will change behavior (e.g. subtasks may be started too early, and packageId propagation can be skipped). Reset startedSubTasks and packageId (and any other stateful fields) when taking from the pool (or before putting back).

Suggested change
- td.lowestDepth = math.MaxInt
+ td.lowestDepth = math.MaxInt
+ td.startedSubTasks = false
+ td.packageId = module.PackageId{}

Member Author

Either the store succeeds and we don't put it back, or it fails and we put it back without having changed anything, so I do not believe this is true.
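One way to read this argument is as an invariant: an instance only returns to the pool if it lost the store, and a losing instance is never mutated after it is taken out. The sketch below checks that invariant under stated assumptions; it is not the PR's code. `entry`, `started` (playing the role of a field such as startedSubTasks), `visit`, and `run` are hypothetical, and the sketch assumes stateful fields are only written on the instance that was actually stored.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// entry is a hypothetical stand-in for parseTaskData.
type entry struct {
	mu      sync.Mutex
	started bool
}

var (
	entryPool = sync.Pool{New: func() any { return new(entry) }}
	// staleGets counts entries that come out of the pool with started set.
	staleGets atomic.Int64
)

// getEntry takes an entry from the pool and records whether it is stale.
func getEntry() *entry {
	e := entryPool.Get().(*entry)
	if e.started {
		staleGets.Add(1)
	}
	return e
}

// visit publishes an entry for key if none exists, then mutates only the
// published entry. A losing candidate goes back to the pool untouched.
func visit(m *sync.Map, key string) {
	candidate := getEntry()
	actual, loaded := m.LoadOrStore(key, candidate)
	if loaded {
		entryPool.Put(candidate)
	}
	e := actual.(*entry)
	e.mu.Lock()
	e.started = true
	e.mu.Unlock()
}

// run calls visit from many goroutines and reports how many stale entries
// were observed coming out of the pool.
func run(workers, keys int) int64 {
	var m sync.Map
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for k := 0; k < keys; k++ {
				visit(&m, fmt.Sprintf("/file%d.ts", k))
			}
		}()
	}
	wg.Wait()
	return staleGets.Load()
}

func main() {
	fmt.Println("stale entries seen:", run(8, 100))
}
```

Under these assumptions `run` reports zero stale entries, because stored entries never re-enter the pool. If some code path did write to a candidate before returning it, the reset suggested in the review would become necessary.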
