engineering: HINQSSQDConfig should auto-scale NQS architecture to system size (Chinchilla-style) #49

@thc1006

Description

TL;DR

The default architecture (embed_dim=64, n_heads=4, n_layers=4; ~14k params) is badly undersized for systems of ≥30Q. Historical job 163794 needed 256/8/8 (~14.7M params, roughly 1000× more) to reach chemical accuracy on 40Q. The default config should auto-scale with system size.

Empirical evidence (5+ data points)

| System | Default arch (now) | Achieved E (Ha) | Required arch (history) | Achieved E (Ha) |
|---|---|---|---|---|
| BeH2 14Q | 64/4/4 | -15.595 (FCI) ✅ | (same; default is fine) | |
| C2H2 24Q | 64/4/4 | ~chem-acc (history) | (small arch is fine) | |
| N2 40Q | 64/4/4 | -109.207 (5 mHa off) | 256/8/8 | -109.215 (chem-acc) |
| N2 52Q | 64/4/4 | -109.245 (~25 mHa off) | (256/8/8 untested by us) | (predicted lower) |
| Cr2 24-52Q | 64/4/4 | (untested) | (likely needs 256/8/8+) | |

Theoretical support: NQS scaling laws (5+ sources)

  1. LLM scaling laws for NQS (arXiv:2509.12679) — Chinchilla-like data/param relationship
  2. Solving many-electron Schrödinger eq with Transformer (Nature Comms 2025) — large arch needed for >30 spin orbitals
  3. QiankunNet (ACM HPC 2025) — 256/8/8+ for 120 spin orbitals
  4. Physics-informed Transformers (Nature Comms 2025)
  5. GTNN-SCI (JCTC 2026) — Transformer self-attention for long-range correlations

Proposal

def _auto_arch(n_qubits: int) -> tuple[int, int, int]:
    """Return (embed_dim, n_heads, n_layers) per Chinchilla-aligned scaling."""
    if n_qubits <= 14:  return (64, 4, 4)    # ~14k params, 1k samples enough
    if n_qubits <= 30:  return (128, 8, 6)   # ~500k params; 8 heads so embed_dim % n_heads == 0 (assumes standard multi-head attention)
    if n_qubits <= 50:  return (256, 8, 8)   # ~14M params
    if n_qubits <= 80:  return (384, 8, 12)  # ~50M params
    return (512, 16, 16)                       # >80Q; 16 heads so embed_dim % n_heads == 0 (assumes standard multi-head attention)

Add to HINQSSQDConfig:

  • auto_arch: bool = True (auto-scale based on mol_info["n_qubits"])
  • Override available via explicit embed_dim/n_heads/n_layers
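One way the flag and the explicit overrides could interact is sketched below. This is only an illustration: `ArchConfigSketch` and `resolve_arch` are hypothetical names, the real HINQSSQDConfig has other fields, and `_auto_arch` here is an abbreviated stand-in that reproduces just two tiers of the table above. It assumes "not set by the user" is represented as `None`.

```python
from dataclasses import dataclass
from typing import Optional


def _auto_arch(n_qubits: int) -> tuple[int, int, int]:
    # Abbreviated stand-in for the proposed tier table (only two tiers shown).
    if n_qubits <= 14:
        return (64, 4, 4)
    return (256, 8, 8)


@dataclass
class ArchConfigSketch:
    # Hypothetical subset of HINQSSQDConfig, for illustration only.
    auto_arch: bool = True
    embed_dim: Optional[int] = None  # None means "not set by the user"
    n_heads: Optional[int] = None
    n_layers: Optional[int] = None

    def resolve_arch(self, n_qubits: int) -> tuple[int, int, int]:
        """Return (embed_dim, n_heads, n_layers); explicit values win over auto ones."""
        # With auto_arch off, fall back to the current 64/4/4 defaults.
        base = _auto_arch(n_qubits) if self.auto_arch else (64, 4, 4)
        explicit = (self.embed_dim, self.n_heads, self.n_layers)
        # Take each explicit value if given, otherwise the auto-scaled one.
        resolved = tuple(e if e is not None else b for e, b in zip(explicit, base))
        return resolved  # type: ignore[return-value]


# Usage: auto-scaling for 40Q, then the same with one explicit override.
print(ArchConfigSketch().resolve_arch(40))
print(ArchConfigSketch(n_layers=12).resolve_arch(40))
```

Resolving per field rather than all-or-nothing lets a caller pin a single value (say, n_layers) while still getting auto-scaled values for the rest, which may ease migration of existing callers.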

Sample size and iteration count should scale together with the architecture (Chinchilla-style):

  • n_samples_per_iter ≈ 100 × n_arch_params^0.5
  • pt2_top_k ≈ n_samples / 5
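The two rules of thumb above can be written as a small helper. This is a minimal sketch: `scaled_sampling` is a hypothetical name, and the rounding and the floor of 1 on pt2_top_k are assumptions that the issue does not specify.

```python
import math


def scaled_sampling(n_arch_params: int) -> tuple[int, int]:
    """Return (n_samples_per_iter, pt2_top_k) from the rules of thumb above."""
    if n_arch_params <= 0:
        raise ValueError("n_arch_params must be positive")
    # n_samples_per_iter ≈ 100 × n_arch_params^0.5
    n_samples_per_iter = round(100 * math.sqrt(n_arch_params))
    # pt2_top_k ≈ n_samples / 5 (integer division; at least 1 is an assumption)
    pt2_top_k = max(1, n_samples_per_iter // 5)
    return n_samples_per_iter, pt2_top_k


# Usage: evaluate the rule for the two parameter counts quoted in the TL;DR.
for n_params in (14_000, 14_700_000):
    print(n_params, scaled_sampling(n_params))
```

Taken literally, the rule gives roughly 11.8k samples per iteration for the ~14k-parameter default and roughly 383k for the ~14.7M-parameter 256/8/8 architecture.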

Why this is missing now

The shipped defaults were tuned for H2/BeH2 smoke tests. There is no upscaling logic for production-size systems.

Impact

  • Without this: pipeline 010 systematically loses 5 to 30 mHa on 30Q+ systems (depending on size)
  • With this: results move closer to literature SOTA without manual tuning by the user

Effort: 2 days

  • Auto-scaling function
  • Migration of existing HINQSSQDConfig callers
  • Document Chinchilla rationale in docstring + ADR
