TL;DR
Default embed_dim=64, n_heads=4, n_layers=4 (~14k params) is catastrophically undersized for systems ≥30Q. Historical job 163794 used 256/8/8 (~14.7M params, 1000× more) for 40Q to reach chem-acc. Default config should auto-scale per system size.
Empirical evidence (5+ data points)
| System | Default arch (now) | Achieved E, default arch (Ha) | Required arch (history) | Achieved E, required arch (Ha) |
|---|---|---|---|---|
| BeH2 14Q | 64/4/4 | -15.595 (FCI) ✅ | (same fine) | – |
| C2H2 24Q | 64/4/4 | ~chem-acc (history) | (small fine) | – |
| N2 40Q | 64/4/4 | -109.207 (5 mHa off) | 256/8/8 | -109.215 (chem-acc) |
| N2 52Q | 64/4/4 | -109.245 (~25 mHa off) | (256/8/8 untested by us) | (predicted lower) |
| Cr2 24-52Q | 64/4/4 | (untested) | (likely needs 256/8/8+) | – |
Theoretical support: NQS scaling laws (5+ sources)
- LLM scaling laws for NQS (arXiv:2509.12679) — Chinchilla-like data/param relationship
- Solving many-electron Schrödinger eq with Transformer (Nature Comms 2025) — large arch needed for >30 spin orbitals
- QiankunNet (ACM HPC 2025) — 256/8/8+ for 120 spin orbitals
- Physics-informed Transformers (Nature Comms 2025)
- GTNN-SCI (JCTC 2026) — Transformer self-attention for long-range correlations
Proposal
def _auto_arch(n_qubits: int) -> tuple[int, int, int]:
    """Returns (embed_dim, n_heads, n_layers) per Chinchilla-aligned scaling."""
    if n_qubits <= 14: return (64, 4, 4)     # ~14k params, 1k samples enough
    if n_qubits <= 30: return (128, 6, 6)    # ~500k params
    if n_qubits <= 50: return (256, 8, 8)    # ~14M params
    if n_qubits <= 80: return (384, 8, 12)   # ~50M params
    return (512, 12, 16)                     # >80Q
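One thing worth checking before adopting these tiers: conventional multi-head attention splits embed_dim evenly across heads, so embed_dim has to be divisible by n_heads. Whether the NQS model in this repo enforces that is an assumption here, not something confirmed from the code. If it does, a quick check like the sketch below flags two of the proposed tiers. The tier values are copied from _auto_arch above, and `check_tiers` is a hypothetical helper.

```python
# Tiers copied from the proposed _auto_arch: (embed_dim, n_heads, n_layers).
PROPOSED_TIERS = [
    (64, 4, 4),
    (128, 6, 6),
    (256, 8, 8),
    (384, 8, 12),
    (512, 12, 16),
]


def check_tiers(tiers):
    """Return the tiers whose embed_dim is not divisible by n_heads.

    Assumes conventional multi-head attention, where each head gets
    embed_dim // n_heads dimensions.
    """
    return [t for t in tiers if t[0] % t[1] != 0]


for embed_dim, n_heads, n_layers in check_tiers(PROPOSED_TIERS):
    remainder = embed_dim % n_heads
    print(f"{embed_dim}/{n_heads}/{n_layers}: {embed_dim} % {n_heads} = {remainder}")
```

If divisibility is required, 128/8/6 and 512/16/16 would be nearby valid choices to consider. That is a suggestion only; neither has been tested here.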
Add to HINQSSQDConfig:
- auto_arch: bool = True (auto-scale based on mol_info["n_qubits"])
- Override available via explicit embed_dim/n_heads/n_layers
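A minimal sketch of how the flag and the explicit overrides could interact. The real HINQSSQDConfig has fields not shown in this issue, so the dataclass below is a hypothetical stand-in (`ArchConfigSketch`, `resolve_arch`, and `_LEGACY_ARCH` are illustrative names). It also assumes an unset override is represented as None.

```python
from dataclasses import dataclass
from typing import Optional


def _auto_arch(n_qubits: int) -> tuple[int, int, int]:
    """Tier table from the proposal: (embed_dim, n_heads, n_layers)."""
    if n_qubits <= 14: return (64, 4, 4)
    if n_qubits <= 30: return (128, 6, 6)
    if n_qubits <= 50: return (256, 8, 8)
    if n_qubits <= 80: return (384, 8, 12)
    return (512, 12, 16)


# Current small defaults, used when auto_arch is disabled.
_LEGACY_ARCH = (64, 4, 4)


@dataclass
class ArchConfigSketch:
    """Hypothetical stand-in for the architecture part of HINQSSQDConfig."""
    auto_arch: bool = True
    embed_dim: Optional[int] = None  # None means "not set by the caller"
    n_heads: Optional[int] = None
    n_layers: Optional[int] = None

    def resolve_arch(self, n_qubits: int) -> tuple[int, int, int]:
        """Explicit values win; otherwise use the auto tier or legacy default."""
        base = _auto_arch(n_qubits) if self.auto_arch else _LEGACY_ARCH
        explicit = (self.embed_dim, self.n_heads, self.n_layers)
        return tuple(b if e is None else e for b, e in zip(base, explicit))


mol_info = {"n_qubits": 40}  # illustrative value; the proposal reads this key
print(ArchConfigSketch().resolve_arch(mol_info["n_qubits"]))                 # (256, 8, 8)
print(ArchConfigSketch(n_layers=12).resolve_arch(mol_info["n_qubits"]))      # (256, 8, 12)
print(ArchConfigSketch(auto_arch=False).resolve_arch(mol_info["n_qubits"]))  # (64, 4, 4)
```

In this sketch each override is resolved per field, so a caller can pin n_layers alone and still get the auto-scaled embed_dim and n_heads. Callers that rely on the implicit 64/4/4 would need auto_arch=False to keep the old behavior.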
Sample size + iter should scale together (Chinchilla-style):
- n_samples_per_iter ≈ 100 × n_arch_params^0.5
- pt2_top_k ≈ n_samples / 5
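Worked through in code, the two rules of thumb look like the sketch below. It assumes that n_samples in the second rule means n_samples_per_iter, rounds to whole numbers, and uses a hypothetical helper name `sampling_budget`. The parameter counts passed in are the approximate ones quoted in this issue.

```python
import math


def sampling_budget(n_arch_params: int) -> tuple[int, int]:
    """Return (n_samples_per_iter, pt2_top_k) from the proposed heuristics.

    n_samples_per_iter ~ 100 * sqrt(n_arch_params)
    pt2_top_k          ~ n_samples_per_iter / 5  (assumes the same n_samples)
    """
    n_samples_per_iter = round(100 * math.sqrt(n_arch_params))
    pt2_top_k = n_samples_per_iter // 5
    return n_samples_per_iter, pt2_top_k


# Approximate parameter counts quoted in this issue for the two tested tiers.
for label, n_params in [("64/4/4", 14_000), ("256/8/8", 14_700_000)]:
    n_samples, top_k = sampling_budget(n_params)
    print(f"{label}: n_samples_per_iter={n_samples}, pt2_top_k={top_k}")
```

For ~14k parameters the formula gives on the order of 10^4 samples per iteration, which is more than the "1k samples enough" comment in _auto_arch suggests, so the constant 100 may need tuning for the smallest tier.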
Why removed/missing now
The shipped defaults are tuned for H2/BeH2 smoke tests; there is no upscaling logic for production systems.
Impact
- Without this: pipeline 010 systematically loses 5-30 mHa on 30Q+ systems (depending on size)
- With this: closer to literature SOTA without the user tuning the architecture manually
Effort: 2 days
- Auto-scaling function
- Migration of existing HINQSSQDConfig callers
- Document Chinchilla rationale in docstring + ADR