Merged
22 commits
cb4eddd
docs: add ADR-005 PT2 configuration selection for HI-NQS v3
thc1006 Apr 1, 2026
6581325
feat: add PT2 config fields + standalone helpers (P0+P1, ADR-005)
thc1006 Apr 1, 2026
4c87876
feat: 3-term NQS teacher loss (teacher + energy + entropy) (P2, ADR-005)
thc1006 Apr 1, 2026
4dbb8db
feat: wire PT2 selection + eviction + annealing into run_hi_nqs_sqd (…
thc1006 Apr 1, 2026
68a2c05
feat: add sparse diag fallback to CIPSI solver (P4, ADR-005)
thc1006 Apr 1, 2026
de5041c
fix: negative-stride array bug + AGENTS/CHANGELOG update (P5, ADR-005)
thc1006 Apr 1, 2026
febb247
fix: address 9 Copilot review items on PR #31
thc1006 Apr 1, 2026
461a3df
fix: support 64+ qubit hashes in PT2 scoring
thc1006 Apr 1, 2026
6b502bb
fix: Copilot round 2 + 64-72Q registry entries
thc1006 Apr 1, 2026
e0423b3
fix: docstring defaults 0.1→0.0, remove dead test code (Copilot PR #31)
thc1006 Apr 1, 2026
df38dd1
fix: eviction uses full-basis diag (ASCI pattern) + CIPSI spy test
thc1006 Apr 1, 2026
5bde8f9
fix: device mismatch, monkeypatch fixture, sparse vs dense test (Copi…
thc1006 Apr 1, 2026
614b714
fix: CUDA tensor in CIPSI dense path + eviction PT2 state consistency
thc1006 Apr 1, 2026
e20a7a8
fix: add torch.manual_seed for deterministic PT2 temperature test
thc1006 Apr 1, 2026
88e8541
fix: IBM solve_fermion path — marginals teacher + nuclear repulsion
thc1006 Apr 2, 2026
4744640
feat: auto-enable IBM solve_fermion when qiskit_addon_sqd available
thc1006 Apr 2, 2026
f62bc29
fix: skip S-CORE for NQS samples + cleanup IBM solver path
thc1006 Apr 2, 2026
e09a398
fix: address Copilot review + update docs for 40Q benchmark
thc1006 Apr 2, 2026
ded1e88
docs: update AGENTS.md CI/CD section for PR #34 changes
thc1006 Apr 2, 2026
3e52c72
docs: add SCI validation results to ADR-005
thc1006 Apr 2, 2026
fe0e9ce
docs: fix advantage docstring to match implementation
thc1006 Apr 2, 2026
c961e0b
fix: address 3 code review findings
thc1006 Apr 2, 2026
30 changes: 22 additions & 8 deletions AGENTS.md
@@ -257,7 +257,7 @@ qvartools/
│ │ └── lucj_sampler.py # LUCJSampler (Qiskit + ffsim LUCJ circuit)
│ │
│ ├── molecules/ # Molecular system registry
│ │ └── registry.py # MOLECULE_REGISTRY (24 molecules: 12 full-space + 12 CAS), get_molecule, list_molecules
│ │ └── registry.py # MOLECULE_REGISTRY (26 molecules: 12 full-space + 14 CAS), get_molecule, list_molecules
│ │
│ ├── _ext/ # Experimental GPU extensions
│ │ ├── __init__.py
@@ -268,8 +268,9 @@ qvartools/
│ │ └── nqs/
│ │ ├── nqs_sqd.py # NQSSQDConfig, run_nqs_sqd
│ │ ├── nqs_skqd.py # NQSSKQDConfig, run_nqs_skqd
│ │ ├── hi_nqs_sqd.py # HINQSSQDConfig, run_hi_nqs_sqd (initial_basis warm-start)
│ │ └── hi_nqs_skqd.py # HINQSSKQDConfig, run_hi_nqs_skqd (initial_basis warm-start)
│ │ ├── hi_nqs_sqd.py # HINQSSQDConfig, run_hi_nqs_sqd (initial_basis, PT2 selection)
│ │ ├── hi_nqs_skqd.py # HINQSSKQDConfig, run_hi_nqs_skqd (initial_basis warm-start)
│ │ └── _pt2_helpers.py # compute_pt2_scores, evict_by_coefficient, compute_temperature
│ │
│ └── _utils/ # Internal utilities
│ ├── scaling/
@@ -454,7 +455,7 @@ Stage 1: Train Flow + NQS Stage 2: Basis Selection Stage 3: Sub

## 4. Molecule Registry

24 pre-configured molecular benchmarks (12 full-space + 12 CAS active-space) accessible via `get_molecule(name)`:
26 pre-configured molecular benchmarks (12 full-space + 14 CAS active-space) accessible via `get_molecule(name)`:

**Full-space molecules (4--28 qubits)**

@@ -473,7 +474,7 @@ Stage 1: Train Flow + NQS Stage 2: Basis Selection Stage 3: Sub
| H2S | 26 | sto-3g | bent |
| C2H4 | 28 | sto-3g | planar |

**CAS active-space molecules (24--58 qubits)**
**CAS active-space molecules (24--72 qubits)**

| Name | Qubits | Basis Set | Active Space |
|------|--------|-----------|--------------|
@@ -489,6 +490,8 @@ Stage 1: Train Flow + NQS Stage 2: Basis Selection Stage 3: Sub
| Cr2-CAS(12,26) | 52 | cc-pvdz | 12e, 26o |
| Cr2-CAS(12,28) | 56 | cc-pvdz | 12e, 28o |
| Cr2-CAS(12,29) | 58 | cc-pvdz | 12e, 29o |
| Cr2-CAS(12,32) | 64 | cc-pvdz | 12e, 32o |
| Cr2-CAS(12,36) | 72 | cc-pvdz | 12e, 36o |

---

@@ -763,16 +766,27 @@ The `_ext/` subpackage is **experimental and optional**. `sbd_subprocess` requir

When `SQDConfig.use_cartesian_product=True` (default), SQD splits sampled configs into alpha/beta spin strings via `split_spin_strings()`, then enumerates all alpha×beta pairs via `cartesian_product_configs()`. This dramatically improves basis coverage for molecular Hamiltonians.
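
The α×β enumeration described above can be sketched in a few lines of NumPy. This is an illustration only, not the code behind `split_spin_strings()` or `cartesian_product_configs()`: the function name `split_and_product` is hypothetical, and the column layout (alpha occupations first, beta occupations second) is an assumption.

```python
import numpy as np


def split_and_product(configs: np.ndarray, n_orb: int) -> np.ndarray:
    """Enumerate every alpha x beta pairing of the spin strings seen in `configs`."""
    # Assumed layout: columns [0, n_orb) hold alpha occupations, the rest beta.
    alpha = np.unique(configs[:, :n_orb], axis=0)
    beta = np.unique(configs[:, n_orb:], axis=0)
    # Repeat each alpha row once per beta row, and tile the beta block to match.
    alpha_rep = np.repeat(alpha, len(beta), axis=0)
    beta_tiled = np.tile(beta, (len(alpha), 1))
    return np.concatenate([alpha_rep, beta_tiled], axis=1)


# Two sampled configs on 4 qubits (2 spatial orbitals).
sampled = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])
expanded = split_and_product(sampled, n_orb=2)
```

Two samples with two distinct alpha strings and two distinct beta strings expand to four configurations, which is where the extra basis coverage comes from.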

### IBM `solve_fermion` Energy Convention

IBM's `qiskit_addon_sqd.fermion.solve_fermion` returns **electronic energy only** (no nuclear repulsion). Always add `hamiltonian.integrals.nuclear_repulsion` to the result. Its `sci_state.amplitudes` is **2D** (n_alpha_strs × n_beta_strs), not 1D — use α/β marginals for NQS teacher weights.
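
A minimal sketch of both conventions, using plain NumPy arrays in place of the real solver output. The helper names `total_energy` and `spin_marginals` are hypothetical, and the numeric values are arbitrary placeholders rather than results from any molecule.

```python
import numpy as np


def total_energy(e_electronic: float, nuclear_repulsion: float) -> float:
    """Electronic energy plus nuclear repulsion gives the total energy."""
    return e_electronic + nuclear_repulsion


def spin_marginals(amplitudes: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Alpha and beta marginals of a 2D (n_alpha_strs, n_beta_strs) amplitude array."""
    probs = np.abs(amplitudes) ** 2
    probs = probs / probs.sum()
    # Summing over the beta axis leaves the alpha marginal, and vice versa.
    return probs.sum(axis=1), probs.sum(axis=0)


# Placeholder inputs standing in for solver output and integrals.
e_total = total_energy(e_electronic=-1.85, nuclear_repulsion=0.71)
amps = np.array([[0.8, 0.0],
                 [0.0, 0.6]])
p_alpha, p_beta = spin_marginals(amps)
```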

### S-CORE is for Quantum Hardware Only

`recover_configurations` (S-CORE) in `qiskit_addon_sqd` is a noise-recovery technique for noisy quantum hardware samples. **Do not use it for classical NQS samples**: it adds massive overhead (the NH₃ run drops from 1.5 hr to 5 s once S-CORE is skipped) with no accuracy benefit on clean samples.

---

## 11. CI/CD

### GitHub Actions

**CI Pipeline** (`.github/workflows/ci.yml`):
- **Lint job:** `ruff format --check` + `ruff check` on Python 3.11
- **Test job:** `pytest` on Python 3.10, 3.11, 3.12 with `[dev,pyscf]` extras
- Excludes `gpu` marker tests
- **Lint job:** `ruff format --check` + `ruff check` on Python 3.11, pip cached
- **Typecheck job:** `mypy` on core modules (informational)
- **Smoke job:** Verifies 26+ molecules registered + all public modules importable
- **Test job:** `pytest` on Python 3.10, 3.11, 3.12 with `[dev,pyscf]` extras; coverage only on 3.11 (`--cov-fail-under=40`); excludes `gpu` marker
- **Docs job:** Sphinx build check on PRs (warns but doesn't block)
- **Global:** `concurrency: cancel-in-progress` cancels superseded runs; `fail-fast: false`

**Docs Pipeline** (`.github/workflows/docs.yml`):
- Sphinx build on push to main
13 changes: 12 additions & 1 deletion CHANGELOG.md
@@ -18,12 +18,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added
- `compute_molecular_integrals` now accepts `cas` and `casci` parameters for CAS active-space reduction
- 12 new CAS molecules in registry: N₂-CAS(10,12/15/17/20/26), Cr₂ + variants, Benzene CAS(6,15)
- 14 new CAS molecules in registry (26 total): N₂-CAS(10,12/15/17/20/26), Cr₂ + variants up to 72Q, Benzene CAS(6,15)
- IBM `solve_fermion` auto-enabled when `qiskit_addon_sqd` is installed (α×β Cartesian product, dramatically better accuracy)
- `_train_nqs_teacher` raises `ValueError` when `energy_weight > 0` without `hamiltonian`
- `_compute_cas_integrals` helper with auto-CASCI fallback for large active spaces
- `MolecularHamiltonian.build_sparse_hamiltonian()` for O(nnz) sparse H construction
- Sparse eigenvalue dispatch in `gpu_solve_fermion` for basis > 8K configs
- CAS-aware `FCISolver` using active-space integrals directly (no full molecule rebuild)
- FCI-free pipeline support: 25 experiment scripts gracefully handle `exact_energy=None`
- PT2 configuration selection for HI+NQS+SQD (`use_pt2_selection=True`, ADR-005)
- `_pt2_helpers.py`: EN-PT2 scoring, ASCI coefficient eviction, temperature annealing
- 3-term NQS teacher loss (teacher KL + energy REINFORCE + entropy)
- CIPSI sparse fallback for basis > 10K via `build_sparse_hamiltonian`
- `TransformerAsNQS` adapter: enables `AutoregressiveTransformer` in NF training pipeline
- `NQSWithSampling` adapter: enables any `NeuralQuantumState` in HI training pipeline
- `qvartools._logging` module with `configure_logging()` and `get_logger()`
@@ -38,10 +44,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- ADR-002 decision record (deferred: torch/numpy roundtrip not a bottleneck)
- ADR-003 decision record (GPU-native SBD integration via r-ccs-cms/sbd)

### Removed
- S-CORE (`recover_configurations`) from HI-NQS-SQD IBM path — designed for quantum hardware noise, not needed for classical NQS samples (NH₃ 1.5 hr → 5 s)

### Fixed
- `TransformerNFSampler._build_nqs()` used wrong parameter name `hidden_dim` instead of `hidden_dims`
- `hi_nqs_sqd.py` passed tensors instead of numpy arrays to `vectorized_dedup`
- Groups 07/08 pipelines discarded NF+DCI basis when calling iterative NQS solvers (Issue #10)
- IBM `solve_fermion` returns electronic energy only; now correctly adds `nuclear_repulsion`
- CIPSI sparse path: `h_matrix.detach().cpu().numpy()` instead of `np.asarray` for CUDA tensors

## [0.0.0] - 2026-03-26

4 changes: 3 additions & 1 deletion README.md
@@ -18,7 +18,7 @@ qvartools consolidates normalizing-flow-guided neural quantum states (NF-NQS), s
- **Unified solver interface** covering FCI, CCSD, SQD, SKQD, and iterative NF variants -- all returning a common `SolverResult`
- **Automatic system-size scaling** that adapts network architectures and sampling budgets to the Hilbert-space dimension
- **YAML-based experiment configuration** with CLI overrides for reproducible experiments
- **Molecule registry** with pre-configured benchmarks from H2 (4 qubits) to C2H4 (28 qubits)
- **Molecule registry** with 26 pre-configured benchmarks from H₂ (4 qubits) to Cr₂-CAS(12,36) (72 qubits)

## Installation

@@ -169,6 +169,8 @@ Each subpackage is self-contained with a clean public API. Lower-level modules h
| Cr2-CAS(12,26) | 52 | cc-pvdz | 12e, 26 orb |
| Cr2-CAS(12,28) | 56 | cc-pvdz | 12e, 28 orb |
| Cr2-CAS(12,29) | 58 | cc-pvdz | 12e, 29 orb |
| Cr2-CAS(12,32) | 64 | cc-pvdz | 12e, 32 orb |
| Cr2-CAS(12,36) | 72 | cc-pvdz | 12e, 36 orb |

## Documentation

16 changes: 15 additions & 1 deletion docs/api_reference.md
@@ -762,7 +762,21 @@ Fast integer hash for a single configuration tensor.

### `run_hi_nqs_sqd(hamiltonian, mol_info, config=None, *, initial_basis=None)`

Iterative HI+NQS+SQD pipeline with self-consistent eigenvector feedback. Config: `HINQSSQDConfig`. The `initial_basis` kwarg accepts a `torch.Tensor` of shape `(n_configs, n_qubits)` to warm-start the cumulative basis.
Iterative HI+NQS+SQD pipeline with self-consistent eigenvector feedback. Config: `HINQSSQDConfig`. The `initial_basis` kwarg accepts a `torch.Tensor` of shape `(n_configs, n_qubits)` to warm-start the cumulative basis. Auto-enables IBM `solve_fermion` (α×β Cartesian product) when `qiskit_addon_sqd` is installed.

### `_pt2_helpers` (Internal PT2 Selection Helpers)

#### `compute_pt2_scores(candidates, basis, coeffs, hamiltonian, e0) -> np.ndarray`

Score candidate configs by Epstein-Nesbet PT2 importance: `score(x) = |⟨x|H|Φ₀⟩|² / |E₀ - H_xx|`. Returns non-negative scores, shape `(n_cand,)`.
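
The formula can be illustrated on precomputed matrix elements. The sketch below is not the library function: `en_pt2_scores` is a hypothetical name, it takes a dense candidate-by-basis coupling block instead of configs and a Hamiltonian object, and the `eps` floor on the denominator is an added assumption to avoid division by zero.

```python
import numpy as np


def en_pt2_scores(h_cand_basis, h_diag_cand, coeffs, e0, eps=1e-12):
    """score(x) = |<x|H|Phi0>|^2 / |E0 - H_xx| for each candidate row x."""
    # <x|H|Phi0> = sum_i H_xi c_i, one value per candidate.
    coupling = h_cand_basis @ coeffs
    # Floor the denominator so a near-degenerate candidate cannot divide by zero.
    denom = np.maximum(np.abs(e0 - h_diag_cand), eps)
    return np.abs(coupling) ** 2 / denom


# Two candidates coupled to a two-config basis (placeholder numbers).
h_cand_basis = np.array([[0.10, 0.20],
                         [0.00, 0.05]])
h_diag_cand = np.array([-0.5, 0.5])
coeffs = np.array([0.6, 0.8])
scores = en_pt2_scores(h_cand_basis, h_diag_cand, coeffs, e0=-1.0)

# Indices of the k best candidates, highest score first.
top_k = np.argsort(-scores)[:1]
```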

#### `evict_by_coefficient(basis, coeffs, max_size) -> tuple[Tensor, ndarray]`

Keep only the highest-|c_i|² configs (ASCI pattern). Returns trimmed basis and coefficients.
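
A minimal NumPy sketch of this truncation, under stated assumptions: `evict_by_weight` is a hypothetical name, both inputs are NumPy arrays (the documented function returns a `Tensor` basis), the surviving rows keep their original order, and the kept coefficients are not renormalised.

```python
import numpy as np


def evict_by_weight(basis, coeffs, max_size):
    """Keep the `max_size` configs with the largest |c_i|^2."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if len(coeffs) <= max_size:
        return basis, coeffs  # nothing to evict
    weights = np.abs(coeffs) ** 2
    # argpartition finds the top max_size entries without a full sort.
    keep = np.argpartition(weights, -max_size)[-max_size:]
    keep.sort()  # restore the original row order among survivors
    return basis[keep], coeffs[keep]


basis = np.arange(20).reshape(5, 4) % 2          # 5 placeholder configs on 4 qubits
coeffs = np.array([0.1, -0.7, 0.05, 0.6, 0.3])
kept_basis, kept_coeffs = evict_by_weight(basis, coeffs, max_size=3)
```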

#### `compute_temperature(iteration, max_iterations, t_init, t_final) -> float`

Linear temperature annealing from `t_init` to `t_final` over iterations.
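
One way to write such a schedule is shown below. The name `linear_temperature` is hypothetical, and the choice that `t_final` is reached at `iteration == max_iterations - 1` is an assumption; the real helper may place the endpoint differently.

```python
def linear_temperature(iteration: int, max_iterations: int,
                       t_init: float, t_final: float) -> float:
    """Linearly interpolate from t_init (first iteration) to t_final (last)."""
    if max_iterations <= 1:
        return t_final
    # Fraction of the schedule completed, clamped to [0, 1].
    frac = min(max(iteration / (max_iterations - 1), 0.0), 1.0)
    return (1.0 - frac) * t_init + frac * t_final


# Schedule for 3 iterations with the ADR-005 default endpoints 1.0 and 0.3.
schedule = [linear_temperature(i, 3, 1.0, 0.3) for i in range(3)]
```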

### `run_hi_nqs_skqd(hamiltonian, mol_info, config=None, *, initial_basis=None)`

226 changes: 226 additions & 0 deletions docs/decisions/005-pt2-selection-hi-nqs-v3.md
@@ -0,0 +1,226 @@
# ADR-005: PT2 Configuration Selection for HI-NQS v3

- **Status**: Proposed
- **Date**: 2026-04-02
- **Author**: George Chang, Jen-Yu Chang
- **Relates to**: Issue #25 (adaptive sampling RFC), PR #30 (original proposal)

---

## Context

The current `run_hi_nqs_sqd` adds ALL unique NQS samples to the
cumulative basis each iteration, relying on random sampling to find
important configurations. At 40+ qubits, NQS sampling covers < 0.01%
of the Hilbert space, and most samples are uninformative.

PR #30 (leo07010) proposed adding PT2-based perturbative selection to
filter NQS samples before adding them to the basis. The algorithmic
concept is sound but the implementation has critical issues: 3 API
crash bugs, hard-imported optional dependencies, deleted backward
compatibility (initial_basis, CAS FCI, logging), and a mean-field
approximation in the teacher signal that loses correlation information.

This ADR documents the design decisions for a correct reimplementation.

---

## Decisions

### D1: PT2 scoring formula — Epstein-Nesbet

**Options:** Epstein-Nesbet (EN), Møller-Plesset (MP), Heat-Bath CI (HCI)

**Choice: Epstein-Nesbet**

```
score(x) = |⟨x|H|Φ₀⟩|² / |E₀ - H_xx|
```

- EN uses the actual diagonal element `H_xx`, which naturally captures
correlation effects in the denominator
- MP uses orbital energy sums, which requires a Fock operator (not
always available in our framework)
- HCI uses `max_i |H_{xi} c_i|` without a denominator — simpler but
gives no PT2 energy correction estimate
- EN is the standard in CIPSI/Quantum Package and our existing
`SelectedCIExpander`

**Source:** Quantum Package docs, QMCPACK Selected CI docs, Holmes et
al. JCTC 2016.
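
For reference, the link between the EN score and an energy estimate can be written out. This is the standard Epstein-Nesbet second-order correction, stated here as background rather than taken from the ADR; the symbol $\mathcal{B}$ for the current basis is notation introduced for this note.

```latex
\Delta E_{\mathrm{PT2}}
  = \sum_{x \notin \mathcal{B}}
    \frac{\lvert \langle x | H | \Phi_0 \rangle \rvert^{2}}{E_0 - H_{xx}},
\qquad
\mathrm{score}(x)
  = \frac{\lvert \langle x | H | \Phi_0 \rangle \rvert^{2}}{\lvert E_0 - H_{xx} \rvert}.
```

When `E₀ < H_xx` for every external configuration, each term of the sum is negative and the correction equals minus the sum of the scores, so ranking by score also ranks configurations by their contribution to the estimated energy lowering.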

### D2: NQS teacher signal — full |c_x|² joint distribution

**Options:** Full `|c_x|²`, α/β marginal product, uniform

**Choice: Full |c_x|²**

PR #30 used `alpha_marginal[a] × beta_marginal[b]` as teacher weights.
This is a mean-field approximation that loses alpha-beta correlation —
for strongly correlated molecules (Cr₂, bond-breaking), the joint
distribution `|c_{ab}|²` has off-diagonal structure that the product
approximation misses entirely.

The original `_train_nqs_teacher` used full `|c_x|²`, which is correct.
We preserve this approach.

**Source:** Lanczos-NQS paper (arXiv:2502.01264), Thompson & Gunlycke
(arXiv:2603.24728).
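
A two-by-two toy case makes the loss of correlation concrete. The joint distribution below is a made-up example in which alpha and beta strings are perfectly correlated; it is not data from any molecule.

```python
import numpy as np

# Joint |c_ab|^2 over (alpha string, beta string): only matched pairs carry weight.
joint = np.array([[0.5, 0.0],
                  [0.0, 0.5]])

p_alpha = joint.sum(axis=1)            # alpha marginal
p_beta = joint.sum(axis=0)             # beta marginal
product = np.outer(p_alpha, p_beta)    # mean-field (marginal product) teacher

# Total-variation distance between the true joint and the product approximation.
tv_distance = 0.5 * np.abs(joint - product).sum()
```

The product teacher spreads weight 0.25 onto every pair, including the two pairs the wavefunction never occupies, so half of the probability mass ends up in the wrong place.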

### D3: Basis eviction — coefficient-based (ASCI pattern)

**Options:** PT2 score from insertion time, |c_i|² after diag, random

**Choice: |c_i|² after each diagonalisation**

PR #30 stored PT2 scores from the iteration when each config was added
and used these for eviction. This is methodologically flawed: scores
from different iterations use different eigenvectors, making cross-
iteration comparison meaningless.

The Adaptive Sampling CI (ASCI) method by Tubman et al. uses the
correct approach: after each diagonalisation, keep the configs with
largest `|c_i|²` (squared CI coefficient magnitude). This naturally discards
configs whose weight in the current eigenvector has become negligible.

**Source:** Tubman et al. JCTC 2020, Quantum Package CIPSI truncation.

### D4: Diag backend — gpu_solve_fermion (preserve optional dep guard)

**Options:** `qiskit_addon_sqd.solve_fermion` (hard import), `gpu_solve_fermion` (guarded)

**Choice: gpu_solve_fermion**

PR #30 hard-imported `solve_fermion` from `qiskit_addon_sqd`, which
broke CI (the package is optional). We preserve the existing pattern:
`gpu_solve_fermion` with the try/except guard for `qiskit_addon_sqd`.
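
The guard pattern referred to here generally looks like the sketch below. The import path `qiskit_addon_sqd.fermion.solve_fermion` is the one named in AGENTS.md; the flag `HAS_QISKIT_SQD` and the helper `describe_backend` are hypothetical names, and how `gpu_solve_fermion` actually consumes the guard is not shown in this PR.

```python
# Optional dependency: import at module load, but never fail if it is absent.
try:
    from qiskit_addon_sqd.fermion import solve_fermion
    HAS_QISKIT_SQD = True
except ImportError:
    solve_fermion = None
    HAS_QISKIT_SQD = False


def describe_backend() -> str:
    """Hypothetical helper: report which diagonalisation path is usable."""
    return "ibm_solve_fermion" if HAS_QISKIT_SQD else "builtin_fallback"


backend = describe_backend()
```

Because the import error is caught at module load, CI environments without the package still import the module and take the fallback branch.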

### D5: Backward compatibility — extend, don't replace

PR #30 deleted `initial_basis` (PR #20), CAS FCI support (PR #24),
logging, `__all__`, frozen dataclass, and NumPy-style docstrings.

**Choice:** Preserve ALL existing API. Add PT2 selection as new
parameters on the existing `HINQSSQDConfig`:

```python
@dataclass(frozen=True)
class HINQSSQDConfig:
# ... existing fields preserved ...
# New PT2 selection fields
use_pt2_selection: bool = False # opt-in, backward compatible
pt2_top_k: int = 2000 # configs kept per iteration
max_basis_size: int = 10_000 # eviction threshold
convergence_window: int = 3 # consecutive converged iters
initial_temperature: float = 1.0 # annealing start
final_temperature: float = 0.3 # annealing end
```

When `use_pt2_selection=False` (default), behavior is identical to
current code. This ensures all existing tests pass unchanged.
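
Because the config is a frozen dataclass, opting in means building a new instance rather than mutating one. The sketch below uses a hypothetical stand-in class, `PT2Options`, that carries only the six new fields; the real `HINQSSQDConfig` has additional fields not reproduced here.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class PT2Options:
    """Hypothetical stand-in holding only the PT2 fields listed in D5."""
    use_pt2_selection: bool = False
    pt2_top_k: int = 2000
    max_basis_size: int = 10_000
    convergence_window: int = 3
    initial_temperature: float = 1.0
    final_temperature: float = 0.3


default = PT2Options()
# replace() returns a modified copy and leaves the original untouched.
enabled = replace(default, use_pt2_selection=True, pt2_top_k=500)

try:
    default.pt2_top_k = 1  # frozen dataclasses reject in-place mutation
    mutated = True
except FrozenInstanceError:
    mutated = False
```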

---

## Implementation Plan (TDD)

### P0: Config fields (backward compatible)

Add new fields to frozen `HINQSSQDConfig` with defaults that preserve
existing behavior (`use_pt2_selection=False`). Also add 3-term loss
weights (`teacher_weight`, `energy_weight`, `entropy_weight`).

### P1: Standalone helpers in `methods/nqs/_pt2_helpers.py`

Three pure functions (no NQS dependency, independently testable):

1. `compute_pt2_scores(candidates, basis, coeffs, hamiltonian, e0)`:
EN-PT2 scoring via `get_connections` (NOT `get_connections_vectorized_batch`).
Uses existing `bitstring_format` utilities (NOT local reimplementation).
2. `evict_by_coefficient(basis, coeffs, max_size)`: keep highest |c_i|²
(ASCI pattern).
3. `compute_temperature(iteration, max_iterations, t_init, t_final)`: linear
interpolation.

### P2: Enhance `_train_nqs_teacher` with 3-term loss

Add energy (REINFORCE with diagonal advantage) and entropy terms.
Use full `|c_x|²` teacher (NOT α/β marginal product — loses correlation).
Correctly call `nqs.log_prob(alpha, beta)` with 2 args (split at n_orb).
Keep original behavior when `energy_weight=0` and `entropy_weight=0`.
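
The shape of such an objective can be sketched as a forward-pass scalar in NumPy. Everything about the weighting below is an assumption made for illustration: the function name `three_term_loss`, the mean-baseline form of the advantage, the batch entropy estimate, and the signs of the terms. The actual `_train_nqs_teacher` may differ, and in an autograd framework the advantage would be treated as a constant.

```python
import numpy as np


def three_term_loss(log_probs, teacher_weights, h_diag,
                    teacher_weight=1.0, energy_weight=0.0, entropy_weight=0.0):
    """Scalar value of a teacher + energy + entropy objective (forward pass only)."""
    target = teacher_weights / teacher_weights.sum()
    # Teacher term: cross-entropy against |c_x|^2; equals KL(target || model)
    # up to a constant that does not depend on the model.
    teacher = -np.sum(target * log_probs)
    # Energy term: REINFORCE-style surrogate, advantage = H_xx minus the batch mean.
    advantage = h_diag - h_diag.mean()
    energy = np.mean(advantage * log_probs)
    # Entropy term: crude batch estimate -mean(log p); subtracting it rewards spread.
    entropy = -np.mean(log_probs)
    return teacher_weight * teacher + energy_weight * energy - entropy_weight * entropy


target = np.array([0.5, 0.3, 0.2])        # placeholder |c_x|^2 weights
h_diag = np.array([-1.0, -0.5, 0.2])      # placeholder diagonal energies
matched = three_term_loss(np.log(target), target, h_diag)
mismatched = three_term_loss(np.log(target[::-1].copy()), target, h_diag)
```

With both extra weights at zero the value reduces to the teacher term alone, which is smallest when the model distribution matches the target.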

### P3: Integration into `run_hi_nqs_sqd`

Gate on `use_pt2_selection`:
- `True`: PT2 filter → eviction → temperature anneal → convergence window
- `False`: zero change to existing behavior

Preserve: `initial_basis`, CAS compat, logging, `__all__`, docstrings.
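
The convergence-window step of the `True` branch can be sketched as a small pure function. The name `converged` and the tolerance argument `tol` are hypothetical; the ADR only specifies that convergence must hold for `convergence_window` consecutive iterations.

```python
def converged(energies: list[float], window: int, tol: float) -> bool:
    """True when the last `window` consecutive energy changes are all below `tol`."""
    if len(energies) < window + 1:
        return False  # not enough history to fill the window
    recent = energies[-(window + 1):]
    return all(abs(b - a) < tol for a, b in zip(recent, recent[1:]))


# Placeholder energy history: a large drop followed by three tiny changes.
history = [-1.0, -1.1, -1.1000001, -1.1000002, -1.1000002]
done = converged(history, window=3, tol=1e-5)
too_early = converged(history[:3], window=3, tol=1e-5)
```

Requiring several consecutive small changes avoids stopping on a single iteration where the basis happened not to improve.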

### P4: CIPSI sparse fallback (independent)

When `n_basis > 10K`, use `hamiltonian.build_sparse_hamiltonian(basis)` +
`scipy.sparse.linalg.eigsh`. Uses OUR API (NOT PR #30's nonexistent
`get_sparse_matrix_elements`).
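
The size-based dispatch can be sketched with SciPy as follows. The function `lowest_eigenpair` is a hypothetical name, and it converts a dense array to CSR for the sparse branch purely so the example is self-contained; in the real code the sparse matrix would come from `build_sparse_hamiltonian`, whose return type is not shown in this PR.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import eigsh

SPARSE_THRESHOLD = 10_000  # mirrors the "n_basis > 10K" rule in the text


def lowest_eigenpair(h: np.ndarray, threshold: int = SPARSE_THRESHOLD):
    """Return the lowest eigenvalue and eigenvector of a symmetric matrix."""
    n_basis = h.shape[0]
    if n_basis > threshold:
        # Sparse path: Lanczos iteration for the smallest algebraic eigenvalue.
        vals, vecs = eigsh(csr_matrix(h), k=1, which="SA")
    else:
        # Dense path: full symmetric eigendecomposition, eigenvalues ascending.
        vals, vecs = np.linalg.eigh(h)
    return float(vals[0]), vecs[:, 0]


# Small symmetric test matrix; lower the threshold to force the sparse branch.
n = 50
h = np.diag(np.arange(n, dtype=float)) + 0.1 * (np.eye(n, k=1) + np.eye(n, k=-1))
e_dense, _ = lowest_eigenpair(h)
e_sparse, _ = lowest_eigenpair(h, threshold=10)
```

Both branches should agree on the ground-state energy to solver tolerance; the sparse branch only pays off once the dense matrix no longer fits comfortably in memory.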

### Dependency graph

```
P0 (config) ──┬── P1 (helpers) ──┐
├── P2 (3-term loss) ├── P3 (integration) ── P5 (review) ── P6 (PR)
└── P4 (CIPSI sparse, independent) ──────────┘
```

P1, P2, P4 are independent after P0. P3 depends on P1 + P2.

### Scope: SQD only, SKQD deferred

PT2 selection is added to `run_hi_nqs_sqd` only. `run_hi_nqs_skqd`
already has Krylov expansion which serves a similar basis-enrichment
role. Extending PT2 to SKQD is a future enhancement.

---

## Consequences

### Positive

- More accurate basis selection at scale (PT2-ranked candidates instead of every random sample)
- Basis eviction prevents unbounded memory growth
- Temperature annealing improves exploration→exploitation transition
- Fully backward compatible (opt-in via `use_pt2_selection`)

### Negative

- PT2 scoring adds O(n_candidates × n_connections) work per iteration
- Eviction adds one eigenvector sort per iteration (negligible)
- More config parameters to tune

### Risks

- PT2 scoring is CPU-bound Python (a `get_connections` loop) and may be slow at 40Q
- Coefficient-based eviction may discard configs that become important later
- Temperature annealing schedule may need per-system tuning

### Validation Results (2026-04-02)

HI-NQS IBM (5K samples/iter) vs SCI (CIPSI, natural convergence, Numba):

| System | HI-NQS Energy (Ha) | SCI Energy (Ha) | Energy gap | HI-NQS Time | SCI Time |
|--------|--------------------|-----------------|------------|-------------|----------|
| C2H2 24Q | **-76.02457** | -76.02453 | HI-NQS wins 0.46 mHa | **456s** | 1,088s |
| N2 40Q | -109.1844 | **-109.2132** | SCI wins 28.8 mHa | **20 min** | 3h45m |

Conclusion: HI-NQS reaches a lower energy than SCI at 24Q in less wall time; at 40Q it is
faster but 28.8 mHa higher in energy, and systematic H-connection expansion
(Issue #35 Tier 1) is needed to close that gap.

---

## References

- PR #30 (leo07010): original HI-NQS v3 proposal
- Quantum Package CIPSI: EN-PT2 standard
- Holmes et al. JCTC 2016: Heat-Bath CI comparison
- Tubman et al. JCTC 2020: ASCI coefficient-based selection
- arXiv:2503.06292: HI-VQE iteration strategy
- arXiv:2603.24728: Auto-regressive NQS for Selected CI
- arXiv:2502.01264: Lanczos-NQS (KL vs MSE for teacher)