chore(benchmarks): drop orphaned pre-loader-refactor results/forcefields (~14 MB) #7
Merged
…ent recurrence
Removes 5 systems' results/ and forcefields/ directories that were
generated speculatively by the old "full matrix" benchmark CLI but
never wired into any documentation or test. Total: 100 files,
~14 MB. Removed paths:
benchmarks/rh-enamide/{results,forcefields}/
benchmarks/pd-allyl-amination/{results,forcefields}/
benchmarks/rh-1,4-conjugate-addition/{results,forcefields}/
benchmarks/heck-relay/{results,forcefields}/
benchmarks/pd-1,4-conjugate-addition/{results,forcefields}/
Kept (still actively referenced):
- benchmarks/*/convergence/ — current pipeline, all per-system doc pages
- benchmarks/ch3f/{results,forcefields}/ — small-molecules.md matrix table
- benchmarks/heck-relay/diagnostic/ — heck-relay.md three-baseline diagnostic
- qfuerza-zenodo/ — qfuerza-validation.md
Prevention — without these the same mistake repeats:
1. README rewrite
The previous README documented every system as having
results/forcefields/ directories — i.e. it codified the orphan
layout. Future contributors reading it would think speculative
per-system results/forcefields/ were the standard. Updated to:
- Reflect actual current layout (only ch3f has the matrix output;
everything else uses convergence/, plus heck-relay has
diagnostic/)
- Document the two standard directory shapes (convergence/ vs the
legacy matrix layout) and explicitly note that the legacy
layout should NOT be added for new systems without a paired doc
PR in q2mm.
- Add a "Stewardship rule" section that cites
ericchansen/q2mm AGENTS.md §2 and codifies the three
committer-facing rules: (a) trace every commit to a doc/test
reference, (b) paired-PRs when deleting a producing code path,
(c) no speculative or exploratory output here.
2. scripts/audit-orphans.sh
Catches the issue mechanically rather than relying on humans
reading the README. Walks benchmarks/<system>/<subdir>/, greps
a sibling q2mm checkout (docs/ test/ q2mm/ scripts/ examples/)
for the path, and reports any subdirectory with zero references.
Exits 1 on orphans found, 0 otherwise — suitable for periodic
manual runs or future CI wire-up.
Verified after the deletion: ✅ No orphaned directories.
History note: the orphaned data was committed in 5d210c2 ("feat:
initial population") and never updated since. It pre-dates the
loader API refactor (ericchansen/q2mm#281), so the optimization
trajectories it captured are no longer reproducible from current
q2mm (pre-refactor loader was overwriting published OPT values with
raw QFUERZA projections). If anyone ever needs them back they're
in git history at 5d210c2.
Per ericchansen/q2mm AGENTS.md §2 "Every file earns its place" +
"No deprecated artifacts".
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5eb3665 to 2998036
Summary
Two parts:
1. Removes ~14 MB of orphaned data: 5 systems' results/ and forcefields/ directories that were generated speculatively by the old "full matrix" benchmark CLI but never wired into any documentation or test.
2. Prevents recurrence: without these changes the same mistake would just happen again. README rewrite + automated audit script.
What gets deleted (100 files)
- benchmarks/rh-enamide/{results,forcefields}/
- benchmarks/pd-allyl-amination/{results,forcefields}/
- benchmarks/rh-1,4-conjugate-addition/{results,forcefields}/
- benchmarks/heck-relay/{results,forcefields}/
- benchmarks/pd-1,4-conjugate-addition/{results,forcefields}/
Prevention
README rewrite
The old README codified the orphan layout: it documented every system as having results/forcefields/ directories, making them look canonical. New version:
- Reflects the actual current layout (only ch3f has the matrix output; everything else uses convergence/, plus heck-relay has diagnostic/)
- Documents the two standard directory shapes (convergence/ vs the legacy matrix layout) and explicitly notes the legacy layout should not be added for new systems without a paired doc PR in q2mm
- Adds a "Stewardship rule" section that cites ericchansen/q2mm AGENTS.md §2
scripts/audit-orphans.sh
Catches the issue mechanically rather than relying on humans reading the README.
Walks every benchmarks/<system>/<subdir>/, greps a sibling q2mm checkout (docs/ test/ q2mm/ scripts/ examples/) for the path, and reports any subdirectory with zero references. Exits 1 if orphans are found, 0 otherwise, so it is suitable for periodic manual runs or future CI wire-up.
Verified post-deletion: ✅ No orphaned directories.
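For reviewers who want a feel for the check, the walk-and-grep behaviour of the audit could be sketched roughly as below. This is not the contents of scripts/audit-orphans.sh. The function name audit_orphans, its two arguments, the "ORPHAN:" output prefix, and the choice of a literal fixed-string match on benchmarks/<system>/<subdir> are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch only; NOT the actual scripts/audit-orphans.sh.
#
# audit_orphans DATA_ROOT Q2MM_ROOT
#   DATA_ROOT  root of the benchmark-data repo (contains benchmarks/)
#   Q2MM_ROOT  root of a q2mm checkout to search for references
# Prints every benchmarks/<system>/<subdir> that is never mentioned in the
# q2mm checkout. Returns 1 if any orphan was found, 0 otherwise.
audit_orphans() {
    local data_root q2mm_root orphans dir rel sub found
    data_root=$1
    q2mm_root=$2
    orphans=0

    for dir in "$data_root"/benchmarks/*/*/; do
        [ -d "$dir" ] || continue        # glob matched nothing
        rel=${dir#"$data_root"/}         # benchmarks/<system>/<subdir>/
        rel=${rel%/}                     # drop the trailing slash

        found=0
        for sub in docs test q2mm scripts examples; do
            # -r recurse, -q stop at first hit, -F literal match
            # (assumed: references spell out the full relative path)
            if [ -d "$q2mm_root/$sub" ] &&
               grep -rqF -- "$rel" "$q2mm_root/$sub"; then
                found=1
                break
            fi
        done

        if [ "$found" -eq 0 ]; then
            echo "ORPHAN: $rel"
            orphans=1
        fi
    done

    if [ "$orphans" -eq 0 ]; then
        echo "No orphaned directories."
    fi
    return "$orphans"
}
```

Assuming a sibling checkout at ../q2mm, a manual run from the data repo root would then be `audit_orphans . ../q2mm`. A literal match is the simplest choice but would miss references written as globs (for example benchmarks/*/convergence/), so a real audit may need a looser pattern.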
What stays
- benchmarks/*/convergence/ (current pipeline, all per-system doc pages)
- benchmarks/ch3f/{results,forcefields}/ (docs/systems/small-molecules.md matrix table)
- benchmarks/heck-relay/diagnostic/ (docs/systems/heck-relay.md)
- qfuerza-zenodo/ (docs/benchmarks/qfuerza-validation.md)
Risk
Zero: no code references the deleted files, they are not in any test fixture, and the regen script writes to fresh paths anyway. Git history preserves them at 5d210c2 if ever needed.