refactor: regenerate baselines for the loader API refactor (q2mm#281 companion) #4

Merged: ericchansen merged 1 commit into main from refactor/loader-api-baselines on May 26, 2026
@ericchansen
Owner

Companion to ericchansen/q2mm#281 (loader API refactor).

TL;DR

Regenerates every convergence baseline against the refactored q2mm loaders. The headline result is that the ratio gate now passes for four of five published-FF systems (was two). Two systems, pd-conjugate and rh-conjugate (Pd 1,4-conj and Rh 1,4-conj in the tables below), were previously misclassified as gate-failures because the pre-refactor loader was silently overwriting their published Wahlers OPT values with raw QFUERZA projections.

What changed

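| System         | Ratio before | Ratio after | Gate after |
|----------------|-------------:|------------:|:----------:|
| ch3f           |        1.000 |       1.000 |     ✓      |
| Rh-enamide     |        1.05  |       1.07  |     ✓      |
| Pd-allyl       |        1.09  |       1.10  |     ✓      |
| Heck relay     |        1.30  |       1.30  |     ✗      |
| Pd 1,4-conj    |        1.20  |     **0.96**|   **✓**    |
| Rh 1,4-conj    |  ~4 × 10³    |     **1.04**|   **✓**    |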
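
As a rough illustration of how a tolerance-based gate could produce the pass/fail column above, here is a minimal sketch. Both the rule (`ratio <= 1 + ratio_tol`) and the tolerance value are assumptions made for illustration; this PR does not state the actual gate definition or the `ratio_tol` used, and `passes_ratio_gate` is a hypothetical helper, not a q2mm function.

```python
# Hypothetical sketch: neither the gate rule nor the tolerance value is
# stated in this PR. 0.15 is chosen only because it is consistent with the
# pass/fail pattern in the table above.
RATIO_TOL = 0.15


def passes_ratio_gate(ratio: float, ratio_tol: float = RATIO_TOL) -> bool:
    """Return True when the ratio does not exceed 1 + ratio_tol (assumed rule)."""
    return ratio <= 1.0 + ratio_tol


# "Ratio after" values copied from the table above.
ratios_after = {
    "ch3f": 1.000,
    "Rh-enamide": 1.07,
    "Pd-allyl": 1.10,
    "Heck relay": 1.30,
    "Pd 1,4-conj": 0.96,
    "Rh 1,4-conj": 1.04,
}

gate_after = {name: passes_ratio_gate(r) for name, r in ratios_after.items()}

for name, ok in gate_after.items():
    print(f"{name:12s} {'pass' if ok else 'fail'}")
```

Under these assumptions only Heck relay fails, matching the table. Whether the real gate also bounds the ratio from below is not stated here.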

Per-category R² of the published OPT values, evaluated by the q2mm JAX engine:

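| System         | R²(bond_len) | R²(bond_ang) | R²(eig_diag) |
|----------------|:------------:|:------------:|:------------:|
| Rh-enamide     |  0.987       |  0.918       |  0.963       |
| Heck relay     |  0.980       |  0.781       | −12.6        |
| Pd-allyl       |  0.042       |  0.330       | −2.82        |
| Pd 1,4-conj    |  0.939       | −0.177       | −10.06       |
| Rh 1,4-conj    |  0.891       |  0.454       | −7.86        |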

Bond-length reproduction is now strong (bond_length R² ≥ 0.89 for the published-OPT systems, except pd-allyl at 0.042). Bond-angle R² is more mixed, ranging from −0.177 to 0.918. The eigenmatrix R² is negative for every system except Rh-enamide (0.963); that is the real cross-engine MM3* ↔ JAX-engine gap, not a loader artifact.
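
For readers surprised by R² values far below zero, the sketch below shows how that arises, assuming the standard coefficient of determination R² = 1 − SS_res / SS_tot. Whether q2mm computes its per-category R² exactly this way is an assumption, and the numbers are toy data, not benchmark values.

```python
def r_squared(reference, predicted):
    """Standard coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Toy data (illustrative only, not taken from any benchmark).
reference = [1.0, 2.0, 3.0]
close_fit = [1.1, 1.9, 3.0]   # small scatter around the reference
scaled_fit = [3.0, 6.0, 9.0]  # systematically off by a factor of 3

print(r_squared(reference, close_fit))   # close to 1
print(r_squared(reference, scaled_fit))  # -27.0
```

R² drops below zero whenever the predictions are worse than simply predicting the mean of the reference values, so a systematic offset or scale mismatch between two engines can push it strongly negative even when the trend is reproduced.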

What got removed

  • benchmarks/rh-enamide/convergence/rh-enamide_optimized.fld
  • benchmarks/pd-allyl-amination/convergence/pd-allyl_optimized.fld

Both .fld files were produced by the pre-refactor loader (which QFUERZA-overwrote the published OPT values) and are no longer reproducible against the current loader. Fresh _optimized.fld files for all four gate-passing systems will land in a follow-up PR that runs end-to-end optimization against the refactored loader (alongside q2mm#275).

Provenance

Every regenerated JSON carries the standard provenance block:

  • q2mm git SHA (from the refactor branch — will be updated to the merged SHA once q2mm#281 lands)
  • q2mm-data git SHA
  • JAX/OpenMM device names
  • ratio_tol setting
  • ISO-8601 timestamp
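
A provenance block with the fields listed above could be assembled along these lines. This is only a sketch: the key names, the `build_provenance` helper, and the placeholder values are all hypothetical, and the actual JSON schema used in this repo may differ.

```python
import json
from datetime import datetime, timezone


def build_provenance(q2mm_sha, q2mm_data_sha, jax_device, openmm_device, ratio_tol):
    """Assemble a provenance dict; all key names here are hypothetical."""
    return {
        "q2mm_git_sha": q2mm_sha,
        "q2mm_data_git_sha": q2mm_data_sha,
        "jax_device": jax_device,
        "openmm_device": openmm_device,
        "ratio_tol": ratio_tol,
        # Timezone-aware UTC timestamp in ISO-8601 format.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Placeholder values for illustration only.
example = build_provenance(
    q2mm_sha="<q2mm-sha>",
    q2mm_data_sha="<q2mm-data-sha>",
    jax_device="cpu",
    openmm_device="CPU",
    ratio_tol=0.15,
)
print(json.dumps(example, indent=2))
```

In practice the SHAs would come from the checked-out repositories and the device names from the running JAX and OpenMM sessions; they are passed in as arguments here to keep the sketch self-contained.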

Companion PR

ericchansen/q2mm#281 — must merge before this PR's data fully reflects committed q2mm SHAs (the convergence JSONs reference the refactor branch's HEAD).

Companion to ericchansen/q2mm PR.  The loader API refactor (frozen-as-
invariant + qfuerza_fresh/qfuerza_into split + load_system dispatch
through q2mm/models/loaders.py) changes the contract every loader
exposes: published OPT values are now preserved as-published instead
of silently being overwritten by raw QFUERZA projections.

That means every convergence baseline in this repo needed to be
regenerated against the new loaders.  Headline result: the ratio gate
now passes for four of five published-FF systems (was two).

| System         | Ratio before | Ratio after | Gate after |
|----------------|-------------:|------------:|:----------:|
| ch3f           |        1.000 |       1.000 |     ✓      |
| Rh-enamide     |        1.05  |       1.07  |     ✓      |
| Pd-allyl       |        1.09  |       1.10  |     ✓      |
| Heck relay     |        1.30  |       1.30  |     ✗      |
| Pd 1,4-conj    |        1.20  |     **0.96**|   **✓**    |
| Rh 1,4-conj    |  ~4 × 10³    |     **1.04**|   **✓**    |

Pd 1,4-conj and Rh 1,4-conj were previously misclassified as
gate-failures because the QFUERZA overwrite was corrupting their
published Wahlers OPT values, sending JaxLoss's inner geometry
minimization into pathological regions.  The pre-refactor
rh-conjugate "non-determinism" tracked in q2mm#278 (ratios of 0.46 /
0.96 / ~4 × 10³ across sessions) is fully explained by this — the
overwrite was chaotic.

Per-category R² (published OPT values evaluated by the JAX engine,
no QFUERZA):

| System         | R²(bond_len) | R²(bond_ang) | R²(eig_diag) |
|----------------|:------------:|:------------:|:------------:|
| Rh-enamide     |  0.987       |  0.918       |  0.963       |
| Heck relay     |  0.980       |  0.781       | −12.6        |
| Pd-allyl       |  0.042       |  0.330       | −2.82        |
| Pd 1,4-conj    |  0.939       | −0.177       | −10.06       |
| Rh 1,4-conj    |  0.891       |  0.454       | −7.86        |

Removed:
  - benchmarks/rh-enamide/convergence/rh-enamide_optimized.fld
  - benchmarks/pd-allyl-amination/convergence/pd-allyl_optimized.fld

Both `.fld` files were produced by the pre-refactor loader (which
QFUERZA-overwrote the published OPT values) and are no longer
reproducible against the current loader.  Fresh `_optimized.fld`
files for all four gate-passing systems will land in a follow-up PR
that runs end-to-end optimization against the refactored loader
(tracked alongside ericchansen/q2mm#275).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ericchansen ericchansen merged commit 65faf95 into main May 26, 2026
@ericchansen ericchansen deleted the refactor/loader-api-baselines branch May 26, 2026 20:03