research: H-connection augmentation as S-CORE-(B) replacement (verified +4-8 mHa on 40Q/52Q) #45

@thc1006

Background

Commit `f62bc29` (2026-04-02) removed `recover_configurations` (S-CORE) from the `run_hi_nqs_sqd` IBM solver path, giving this reasoning:

Remove recover_configurations (S-CORE) from IBM path — designed for noisy quantum hardware, not clean NQS samples.

The commit verified H2O and NH3 as FCI exact and C2H2 to within 0.004 mHa, but explicitly noted an "N₂ 40Q 13.1 mHa" regression. The cause: S-CORE was doing two things, only one of which is specific to hardware noise:

  1. (A) Bit-flip recovery for non-PC quantum samples: irrelevant for classical NQS samples, so correctly removed.
  2. (B) Occupancy-guided basis augmentation: adds HF-adjacent configs based on the prior occupancy distribution. This has value regardless of noise level and was removed by accident.

Empirical confirmation (2026-04-30 to 2026-05-01)

Added a `connection_augment_max_new` config field, an S-CORE-(B) replacement built on `expand_basis_via_connections`:

  • Each iteration, after the PT2 filter, expand `unique_new` by its Slater-Condon H-connections (singles and doubles), rank the candidates by `|H_ij|`, and keep the top `max_new`.
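The augmentation step can be sketched roughly as follows. This is a toy illustration, not the actual `expand_basis_via_connections` code, whose signature is not shown in this issue. The names `singles_and_doubles`, `augment_by_connections`, and `coupling` are hypothetical, configs are plain occupation tuples, and spin conservation is ignored for brevity. Each candidate is scored by its largest `|H_ij|` to any source config, which is one reasonable reading of "rank by `|H_ij|`".

```python
from itertools import combinations
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Config = Tuple[int, ...]  # occupation vector, one 0/1 entry per spin orbital


def singles_and_doubles(config: Config) -> Iterable[Config]:
    """Yield every config reachable by moving one or two electrons."""
    occ = [i for i, bit in enumerate(config) if bit]
    virt = [i for i, bit in enumerate(config) if not bit]
    for n_exc in (1, 2):
        for holes in combinations(occ, n_exc):
            for particles in combinations(virt, n_exc):
                new = list(config)
                for h in holes:
                    new[h] = 0
                for p in particles:
                    new[p] = 1
                yield tuple(new)


def augment_by_connections(
    unique_new: Sequence[Config],
    known: Sequence[Config],
    coupling: Callable[[Config, Config], float],  # assumed to return H_ij
    max_new: int,
) -> List[Config]:
    """Return up to max_new unseen configs, strongest coupling first."""
    if max_new <= 0:  # 0 means the feature is disabled
        return []
    seen = set(known) | set(unique_new)
    best: Dict[Config, float] = {}
    for src in unique_new:
        for cand in singles_and_doubles(src):
            if cand in seen:
                continue
            score = abs(coupling(src, cand))
            # Keep the largest |H_ij| per candidate; zero couplings are dropped.
            if score > best.get(cand, 0.0):
                best[cand] = score
    # Ties are broken by the config tuple so the result is deterministic.
    ranked = sorted(best, key=lambda c: (-best[c], c))
    return ranked[:max_new]


# Toy usage: a made-up coupling that favours singles over doubles.
toy_coupling = lambda a, b: 1.0 / sum(x != y for x, y in zip(a, b))
added = augment_by_connections([(1, 1, 0, 0)], [], toy_coupling, max_new=2)
```

In the real pipeline the coupling would come from the Slater-Condon rules on the molecular integrals; the toy function above only stands in for it.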

Results vs pure no-augment baseline:

| System | No-augment baseline | With H-connection augment | Δ |
|---|---|---|---|
| 40Q (N2-CAS(10,20)) | -109.20224 Ha (BASELINE) | -109.20658 Ha (AUGMENT, max_new=500) | +4.3 mHa |
| 52Q (N2-CAS(10,26)) | -109.23563 Ha (BASELINE, iter 11 best @ TIMEOUT) | -109.24300 Ha (AUGMENT, max_new=1000, iter 13) | +7.4 mHa |
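For reference, the Δ column is just the energy lowering converted from Ha to mHa. A small sketch that recomputes it from the energies listed above (the helper name `improvement_mha` is made up for this example):

```python
HARTREE_TO_MHA = 1000.0

# (baseline, augmented) energies in Ha, copied from the results above.
results = {
    "40Q": (-109.20224, -109.20658),
    "52Q": (-109.23563, -109.24300),
}


def improvement_mha(baseline_ha: float, augmented_ha: float) -> float:
    """Positive when the augmented run reaches a lower energy."""
    return (baseline_ha - augmented_ha) * HARTREE_TO_MHA


for system, (baseline, augmented) in results.items():
    print(f"{system}: {improvement_mha(baseline, augmented):+.1f} mHa")
# prints:
# 40Q: +4.3 mHa
# 52Q: +7.4 mHa
```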

40Q reference (historical, with full S-CORE): -109.20979 Ha (0.544 mHa from FCI)

  • Without S-CORE: -109.202 Ha (8 mHa from FCI); the commit message reports this as the "13 mHa regression"
  • With H-connection augment: -109.207 Ha (3.2 mHa from FCI), which closes 53% of the S-CORE removal gap

Status of work

Implemented as an opt-in flag (default 0 = disabled, which preserves the `f62bc29` behavior):

  • `HINQSSQDConfig.connection_augment_max_new: int = 0` (in `hi_nqs_sqd.py`)
  • Wired into the iteration loop (line ~485, between the PT2 filter and the `cumulative_basis` cat)
  • Logged as `ITER_METRICS_SAMPLE.n_connection_augment_added` and in `connection-augment` info logs

Currently exposed via `experiments/pipelines/diagnostic/nqs_vs_random.py --augment-connections N`.
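A minimal sketch of how the opt-in flag and the CLI option could fit together. The dataclass below is a stand-in with only the one field from this issue. The real `HINQSSQDConfig` has other fields, and the argument type and default of `--augment-connections` are assumptions. `config_from_args` and `augmentation_enabled` are hypothetical helpers.

```python
import argparse
from dataclasses import dataclass
from typing import List


@dataclass
class HINQSSQDConfig:
    # Stand-in for the real config class; all other fields are omitted.
    connection_augment_max_new: int = 0  # 0 = disabled (f62bc29 behavior)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Flag name taken from the issue; type and default are assumed here.
    parser.add_argument("--augment-connections", type=int, default=0, metavar="N")
    return parser


def config_from_args(argv: List[str]) -> HINQSSQDConfig:
    args = build_parser().parse_args(argv)
    return HINQSSQDConfig(connection_augment_max_new=args.augment_connections)


def augmentation_enabled(cfg: HINQSSQDConfig) -> bool:
    """The iteration loop would skip augmentation when this is False."""
    return cfg.connection_augment_max_new > 0


cfg = config_from_args(["--augment-connections", "500"])
```

Keeping 0 as the default means existing runs are unchanged unless the flag is passed.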

TODO for productionization

  1. Integrate into pipeline 010 production config as default behavior
  2. Tune max_new automatically based on system size (40Q ≈ 500, 52Q ≈ 1000, larger systems scale up)
  3. Compare with full SQD results across CAS(10,20) → CAS(10,32) to characterize gap-closure curve
  4. Benchmark against a reintroduced S-CORE (modified to skip bit-flip recovery and do only occupancy-guided augmentation) to see whether our cheaper proxy matches it
  5. Update ADR-005 (or write a new ADR) to document the S-CORE-(A)/(B) split rationale
  6. Cite in user guide as recommended setting for >24Q systems
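As a placeholder for item 2, one possible starting heuristic is a straight line through the two settings used so far (40Q → 500, 52Q → 1000). This is only a sketch: two points do not establish a scaling law, those values were the settings tried rather than tuned optima, and `suggest_max_new` is a hypothetical helper, not existing code.

```python
def suggest_max_new(n_qubits: int, floor: int = 0) -> int:
    """Linear interpolation/extrapolation through (40Q, 500) and (52Q, 1000)."""
    (q0, m0), (q1, m1) = (40, 500), (52, 1000)
    slope = (m1 - m0) / (q1 - q0)  # about 41.7 extra configs per qubit
    # Clamp at `floor` so small systems never get a negative budget.
    return max(floor, round(m0 + slope * (n_qubits - q0)))


for n in (40, 52, 64):
    print(n, suggest_max_new(n))
```

The gap-closure comparison in item 3 would show whether a linear rule is adequate or whether the budget needs to grow faster with system size.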

Out of scope

  • Reintroducing full S-CORE (different issue if needed)
  • Higher-order excitations (triples) — would extend augment beyond S+D
