engineering: HINQSSQDConfig should auto-scale NQS architecture to system size (Chinchilla-style) #49

## TL;DR
The default `embed_dim=64, n_heads=4, n_layers=4` (~14k params) is **catastrophically undersized** for systems ≥30Q. Historical job 163794 used 256/8/8 (~14.7M params, roughly 1000× more) to reach chemical accuracy on a 40Q system. **The default config should auto-scale with system size.**

## Empirical evidence (5+ data points)
| System | Default arch (now) | E with default arch (Ha) | Required arch (history) | E with required arch (Ha) |
|---|---|---|---|---|
| BeH2 14Q | 64/4/4 | -15.595 (FCI) ✅ | (default is sufficient) | – |
| C2H2 24Q | 64/4/4 | ~chem-acc (history) | (small arch is sufficient) | – |
| **N2 40Q** | 64/4/4 | **-109.207** (5 mHa off) | **256/8/8** | **-109.215** (chem-acc) |
| **N2 52Q** | 64/4/4 | **-109.245** (~25 mHa off) | (256/8/8 untested by us) | (predicted lower) |
| Cr2 24-52Q | 64/4/4 | (untested) | (likely needs 256/8/8+) | – |
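
For scale, the gap between the two N2 40Q runs in the table can be compared with the chemical-accuracy threshold. This is a minimal sketch, assuming the energies are in Hartree and taking chemical accuracy as 1 kcal/mol (about 1.6 mHa). The helper `gap_mha` is hypothetical, and the number it prints is the gap between the two runs, not the error against a reference energy, which the table does not give.

```python
CHEM_ACC_MHA = 1.594  # 1 kcal/mol in milli-Hartree (approximate)


def gap_mha(e_a: float, e_b: float) -> float:
    """Return e_a - e_b in milli-Hartree (positive if e_a is the higher energy)."""
    return (e_a - e_b) * 1000.0


# N2 40Q rows from the table: default 64/4/4 vs. historical 256/8/8.
gap = gap_mha(-109.207, -109.215)
print(f"gap = {gap:.1f} mHa ({gap / CHEM_ACC_MHA:.1f}x the chemical-accuracy threshold)")
```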

## Theoretical support: NQS scaling laws (5+ sources)
1. [LLM scaling laws for NQS (arXiv:2509.12679)](https://arxiv.org/abs/2509.12679) — Chinchilla-like data/param relationship
2. [Solving many-electron Schrödinger eq with Transformer (Nature Comms 2025)](https://www.nature.com/articles/s41467-025-63219-2) — large arch needed for >30 spin orbitals
3. [QiankunNet (ACM HPC 2025)](https://dl.acm.org/doi/10.1145/3581784.3607061) — 256/8/8+ for 120 spin orbitals
4. [Physics-informed Transformers (Nature Comms 2025)](https://www.nature.com/articles/s41467-025-66844-z)
5. [GTNN-SCI (JCTC 2026)](https://pubs.acs.org/doi/10.1021/acs.jctc.5c01429) — Transformer self-attention for long-range correlations

## Proposal
```python
def _auto_arch(n_qubits: int) -> tuple[int, int, int]:
    """Return (embed_dim, n_heads, n_layers) per Chinchilla-aligned scaling."""
    # n_heads must divide embed_dim for standard multi-head attention.
    if n_qubits <= 14:
        return (64, 4, 4)      # ~14k params, 1k samples enough
    if n_qubits <= 30:
        return (128, 8, 6)     # ~500k params
    if n_qubits <= 50:
        return (256, 8, 8)     # ~14M params
    if n_qubits <= 80:
        return (384, 8, 12)    # ~50M params
    return (512, 16, 16)       # >80Q
```

Add to `HINQSSQDConfig`:
- `auto_arch: bool = True` (auto-scale based on `mol_info["n_qubits"]`)
- Override available via explicit `embed_dim/n_heads/n_layers`
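
One way the override could work is to make the three architecture fields optional, so that "not set" is distinguishable from "set to the old default". The sketch below is an assumption about the design, not the existing `HINQSSQDConfig`: the class name `HINQSSQDConfigSketch` and the method `resolve_arch` are hypothetical, and the tier table uses head counts that divide `embed_dim`.

```python
from dataclasses import dataclass
from typing import Optional

# Tiers follow the `_auto_arch` proposal: (max_qubits, embed_dim, n_heads, n_layers),
# with n_heads chosen so that it divides embed_dim.
_TIERS = [(14, 64, 4, 4), (30, 128, 8, 6), (50, 256, 8, 8), (80, 384, 8, 12)]
_LARGEST = (512, 16, 16)
_LEGACY_DEFAULT = (64, 4, 4)


def _auto_arch(n_qubits: int) -> tuple[int, int, int]:
    for max_qubits, dim, heads, layers in _TIERS:
        if n_qubits <= max_qubits:
            return (dim, heads, layers)
    return _LARGEST


@dataclass
class HINQSSQDConfigSketch:  # hypothetical stand-in for HINQSSQDConfig
    auto_arch: bool = True
    embed_dim: Optional[int] = None   # None means "not set by the user"
    n_heads: Optional[int] = None
    n_layers: Optional[int] = None

    def resolve_arch(self, n_qubits: int) -> tuple[int, int, int]:
        """Explicit fields win; unset fields fall back to the auto or legacy values."""
        base = _auto_arch(n_qubits) if self.auto_arch else _LEGACY_DEFAULT
        explicit = (self.embed_dim, self.n_heads, self.n_layers)
        dim, heads, layers = (e if e is not None else b for e, b in zip(explicit, base))
        if dim % heads != 0:
            raise ValueError(f"embed_dim={dim} is not divisible by n_heads={heads}")
        return dim, heads, layers


print(HINQSSQDConfigSketch().resolve_arch(40))             # auto-scaled
print(HINQSSQDConfigSketch(n_layers=12).resolve_arch(40))  # partial override
```

Using `None` as the unset marker keeps existing callers that pass explicit values working unchanged, while callers that relied on the implicit 64/4/4 default would pick up the auto-scaled values.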

Sample size and iteration count should scale together with the architecture (Chinchilla-style):
- `n_samples_per_iter` ≈ 100 × n_arch_params^0.5
- `pt2_top_k` ≈ n_samples / 5
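
The two rules above can be sketched as a small helper. This is an illustration only: the function name `sampling_budget` is hypothetical, the rounding choices are assumptions, and the parameter counts in the loop are the approximate figures quoted in this issue.

```python
import math


def sampling_budget(n_arch_params: int) -> tuple[int, int]:
    """Return (n_samples_per_iter, pt2_top_k) from the square-root rule above."""
    n_samples = round(100 * math.sqrt(n_arch_params))
    pt2_top_k = n_samples // 5
    return n_samples, pt2_top_k


# Approximate parameter counts quoted in this issue for the two reference archs.
for label, n_params in [("64/4/4", 14_000), ("256/8/8", 14_700_000)]:
    n_samples, top_k = sampling_budget(n_params)
    print(f"{label}: n_samples_per_iter={n_samples}, pt2_top_k={top_k}")
```

For ~14k parameters this rule gives roughly 12k samples per iteration, so the constant 100 may need adjusting if about 1k samples is the intended budget for the smallest tier.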

## Why removed/missing now
The shipped defaults were tuned for H2/BeH2 smoke tests, and there is no logic to scale them up for production-size systems.

## Impact
- Without this: pipeline 010 systematically loses 5-30 mHa on 30Q+ systems, with the loss growing with system size
- With this: results move closer to literature SOTA without manual tuning by the user

## Effort: 2 days
- Auto-scaling function
- Migration of existing `HINQSSQDConfig` callers
- Document Chinchilla rationale in docstring + ADR
