[Scripts] Share TOGSim binary and add codegen cache helper for worktrees #232
Merged
TOGSim is a standalone C++ simulator that rarely changes alongside Python frontend work, so building it once per worktree is wasteful (~10 minutes per fresh worktree). setup_worktree.sh now symlinks $REPO_ROOT/TOGSim/build/bin/Simulator into the newly created worktree when the source is present and executable, falling back to a hint if the source is absent so the user notices instead of silently producing a half-set-up worktree. The link is resolved with readlink -f so it points at the real binary, not a chain of worktree-to-worktree symlinks. Simulator path resolution in Simulator/simulator.py is rooted at TORCHSIM_DIR already, so each worktree still has its own resolution and a worktree that does change TOGSim C++ can drop the symlink and rebuild in place. docs/worktrees.md gets the matching section.
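The symlink step described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual contents of setup_worktree.sh: the function name link_togsim, its two arguments, and the wording of the hint are assumptions; only the relative binary path, the use of readlink -f, and the "Symlinked TOGSim binary from ..." message come from the description.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the TOGSim sharing step (not the real script).
# Usage: link_togsim SRC_ROOT NEW_WORKTREE
link_togsim() {
  local src_root="$1" new_wt="$2"
  local rel="TOGSim/build/bin/Simulator"
  local src="$src_root/$rel"
  local real

  if [ -x "$src" ]; then
    # Resolve to the real binary so the new link is a single hop,
    # even when $src is itself a symlink created for another worktree.
    real="$(readlink -f "$src")"
    mkdir -p "$new_wt/$(dirname "$rel")"
    ln -sfn "$real" "$new_wt/$rel"
    echo "Symlinked TOGSim binary from $real"
  else
    # Assumed wording: the point is to be loud rather than leave a
    # half-set-up worktree behind without saying so.
    echo "hint: no executable at $src; build TOGSim in the new worktree" >&2
  fi
}
```

A call might look like link_togsim "$REPO_ROOT" "$NEW_WORKTREE", where NEW_WORKTREE is a placeholder for the path just created. A worktree that changes TOGSim C++ can remove the link and build in place, as the description notes.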
After editing anything that affects emitted MLIR or wrapper code (PyTorchSimFrontend/mlir/*, lowering rules, codegen backend), the next torch.compile run replays the previously cached compile from $TORCHSIM_DUMP_PATH/<hash>/, so the change silently does not take effect. The new helper wipes the Inductor compile cache (.torchinductor/, pinned inside DUMP_PATH by extension_config.py) and the per-source hash dirs, which are identified by the 11-char prefix from extension_codecache.hash_prefix. togsim_results/ run logs and any unrelated files under outputs/ are preserved. setup_worktree.sh's post-creation message now points at the helper so that anyone working in a newly created worktree knows what to run between iterations. docs/worktrees.md gets the matching "Iterating on codegen inside a worktree" section plus a diagnostic for the related "I forgot to source .envrc" gotcha (the traceback path is under the canonical worktree while you are editing elsewhere).
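The cache-clearing behaviour described above might look roughly like this. It is a sketch under stated assumptions, not the real scripts/clear_codegen_cache.sh: the function name and argument are hypothetical, and the hash directories are assumed to be named with exactly 11 alphanumeric characters (the description only says "11-char prefix", so the real alphabet may differ).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the codegen cache helper (not the real script).
# Usage: clear_codegen_cache DUMP_PATH
clear_codegen_cache() {
  local dump="$1"
  local entry name

  if [ ! -d "$dump" ]; then
    echo "nothing to clear: $dump" >&2
    return 0
  fi

  # Inductor compile cache, pinned inside the dump path.
  rm -rf -- "$dump/.torchinductor"

  for entry in "$dump"/*/; do
    [ -d "$entry" ] || continue        # unmatched glob stays literal
    name="$(basename "$entry")"
    case "$name" in
      *[!A-Za-z0-9]*) continue ;;      # assumption: hash dirs are alphanumeric
      ???????????) rm -rf -- "$entry" ;;  # exactly 11 characters
      *) continue ;;                   # togsim_results/ and others are kept
    esac
  done
}
```

Under these assumptions, clear_codegen_cache "$TORCHSIM_DUMP_PATH" would be run between codegen edits. An unrelated directory that happens to have an 11-character alphanumeric name would also be removed by this sketch, so the real helper may well use a stricter test.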
8c4256e to ba01db3
Summary
Two small worktree-flow papercuts uncovered while verifying #228:
- setup_worktree.sh now symlinks TOGSim/build/bin/Simulator from the worktree the script was invoked in into the newly created one. TOGSim is a standalone C++ binary whose source rarely changes alongside Python frontend work, so a fresh feature/ or bugfix/ worktree no longer needs a ~10-minute rebuild just to run any test that goes through TOGSimulator. The link is resolved with readlink -f so it points at the real binary, not a chain of worktree symlinks. If the source binary is absent the script falls back to a hint instead of silently producing a half-set-up worktree.
- scripts/clear_codegen_cache.sh (new) wipes the codegen cache between iterations on PyTorchSimFrontend/mlir/* or any code that affects emitted MLIR. It removes $TORCHSIM_DUMP_PATH/.torchinductor (Inductor's compile cache, pinned there by extension_config.py:139) and the per-source-hash dirs (<11-char-hash>/, the prefix from extension_codecache.hash_prefix). togsim_results/ run logs and unrelated files under outputs/ are left alone.
- docs/worktrees.md gets two new sections (TOGSim sharing, codegen iteration) plus the "I forgot to source .envrc" diagnostic: if a traceback points at the canonical worktree path while you are editing in a different one, that is the signal.

Motivation: while preparing the #228 fix I hit (a) "the fix did not take" because the previous broken compile was being replayed from outputs/<hash>/ and (b) "TOGSim binary not found" in the fresh worktree. Both were ~5-10 minutes of confusion. The CLAUDE.md gotcha in #230 covers the rule; this PR adds the helper script that implements it and removes the TOGSim setup step entirely.

Test plan
- bash -n scripts/setup_worktree.sh (syntax)
- setup_worktree.sh: creates worktree, symlinks TOGSim binary as a single hop pointing at the real path (verified with ls -la), emits the "Symlinked TOGSim binary from ..." line and the cache-clear hint.
- scripts/clear_codegen_cache.sh on a synthetic $TORCHSIM_DUMP_PATH containing one 11-char hash dir, one .torchinductor dir, and one unrelated dir: removes the first two, preserves the third.
- tests/test_expert_mask.py (from [Frontend] Fix index/i64 type mismatch in expert-mask codegen (issue #228) #231) ran the full Gem5/Spike/TOGSim pipeline against the canonical TOGSim binary from a sibling worktree.

🤖 Generated with Claude Code
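The "I forgot to source .envrc" signal mentioned in the summary can also be checked before a traceback appears. The sketch below is an illustration only: the function name is hypothetical, and it assumes that each worktree's .envrc exports TORCHSIM_DIR pointing at that worktree's root, which the description implies but does not state outright.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the PR).
# Usage: check_torchsim_dir EXPECTED_ROOT
check_torchsim_dir() {
  local expected actual

  if [ -z "${TORCHSIM_DIR:-}" ]; then
    echo "warning: TORCHSIM_DIR is unset; did you source .envrc?" >&2
    return 1
  fi

  # Compare fully resolved paths so symlinked checkouts still match.
  expected="$(readlink -f "$1")"
  actual="$(readlink -f "$TORCHSIM_DIR")"

  if [ "$expected" = "$actual" ]; then
    echo "ok: TORCHSIM_DIR matches $expected"
  else
    echo "warning: TORCHSIM_DIR=$actual but you are working in $expected; source .envrc?" >&2
    return 1
  fi
}
```

Inside a worktree it could be invoked as check_torchsim_dir "$(git rev-parse --show-toplevel)", which compares the environment against the checkout currently being edited.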