bug: 07_iterative_nqs_dci/iter_nqs_dci_{sqd,krylov_classical}.py fail with initial_basis dtype ValueError

## Summary

Two out of three pipeline scripts in `experiments/pipelines/07_iterative_nqs_dci/` fail end-to-end with a `ValueError` at the `run_hi_nqs_sqd` / `run_hi_nqs_skqd` call site, on any molecule (including the smallest, H2). The bug has existed on `main` for a while; nothing appears to exercise these scripts in CI.

## Affected scripts

- `experiments/pipelines/07_iterative_nqs_dci/iter_nqs_dci_sqd.py`
- `experiments/pipelines/07_iterative_nqs_dci/iter_nqs_dci_krylov_classical.py`

The third script in the group (`iter_nqs_dci_krylov_quantum.py`) does **not** use `initial_basis` and is unaffected.

## Reproduction

On current `main`:

```bash
python experiments/pipelines/07_iterative_nqs_dci/iter_nqs_dci_sqd.py h2 --device cpu
```

Expected: runs to completion, prints final energy.

Actual:

```
ValueError: initial_basis must be integer or bool dtype (binary occupations), got torch.float32
  at src/qvartools/methods/nqs/hi_nqs_sqd.py:377  (validation in run_hi_nqs_sqd)
```

The traceback ends inside the validation block that rejects float-dtype `initial_basis` tensors.

## Root cause

`FlowGuidedKrylovPipeline.extract_and_select_basis()` (`src/qvartools/pipeline.py:441`) returns a `float32` tensor because it clones `self.trainer.accumulated_basis`, which is stored as float (for gradient tracking during NF training). The values happen to be in `{0.0, 1.0}`, but the dtype violates the `initial_basis` contract.

`run_hi_nqs_sqd(initial_basis=...)` and `run_hi_nqs_skqd(initial_basis=...)` strictly validate `initial_basis.dtype` to be integer or bool — float32 is rejected with the ValueError above.
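The rejection can be illustrated outside the library with a minimal sketch. `check_initial_basis` below is a hypothetical stand-in, not the actual code at `hi_nqs_sqd.py:377`; it only assumes that the real check keys on the tensor dtype and emits the message shown in the traceback above.

```python
import torch


def check_initial_basis(initial_basis: torch.Tensor) -> None:
    """Hypothetical stand-in for the dtype check inside run_hi_nqs_sqd."""
    dtype = initial_basis.dtype
    # Integer and bool dtypes are neither floating point nor complex.
    if dtype.is_floating_point or dtype.is_complex:
        raise ValueError(
            "initial_basis must be integer or bool dtype (binary occupations), "
            f"got {dtype}"
        )


# Values are binary, but torch.tensor() infers float32 from the float literals.
float_basis = torch.tensor([[1.0, 0.0, 1.0, 0.0]])

try:
    check_initial_basis(float_basis)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# The same values as int64 pass the check without raising.
check_initial_basis(float_basis.long())
```

The point of the sketch is that the check looks only at the dtype, never at the values, so a float tensor holding exact zeros and ones is still rejected.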

The 007 scripts (those in `07_iterative_nqs_dci/`) wire `pipeline.extract_and_select_basis()` directly into `run_hi_nqs_sqd(initial_basis=...)` with no cast:

```python
# 07_iterative_nqs_dci/iter_nqs_dci_sqd.py:162, 182
basis = pipeline.extract_and_select_basis()
...
nqs_result = run_hi_nqs_sqd(
    hamiltonian, mol_info, config=sqd_config, initial_basis=basis  # <-- float32
)
```

## Why 008 works (parallel pattern, no bug)

The sister scripts in `08_iterative_nqs_dci_pt2/` do the same `extract_and_select_basis` but pipe the result through `expand_basis_via_connections()` before `run_hi_nqs_sqd`, and that function happens to cast the tensor to long internally. 007 doesn't have that intermediate step, so it hits the raw dtype mismatch.

## Fix

Add a script-level `.long()` cast on the `basis` tensor before passing it as `initial_basis`. See PR (to be linked) for the patch: 2 files, 2-line changes each plus explanatory comments.
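A minimal sketch of the cast, using a small hand-written tensor as a stand-in for the output of `extract_and_select_basis()`. The binary-value guard is an illustrative extra and is not claimed to be part of the patch; it is there because `.long()` truncates toward zero and would silently map a stray `0.999` to `0`.

```python
import torch

# Stand-in for pipeline.extract_and_select_basis(): float32, values in {0.0, 1.0}.
basis = torch.tensor(
    [
        [1.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 1.0, 0.0],
    ]
)

# Optional guard (assumption, not from the patch): confirm the values really
# are binary before casting, since the cast itself never raises.
is_binary = bool(((basis == 0) | (basis == 1)).all())
if not is_binary:
    raise ValueError("basis contains values other than 0 and 1")

# float32 -> int64, a dtype the initial_basis validation accepts.
initial_basis = basis.long()
print(initial_basis.dtype)
```

In the scripts, the resulting `initial_basis` would then be passed to `run_hi_nqs_sqd(..., initial_basis=initial_basis)` in place of the raw `basis`.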

## Verification after fix

Both scripts on H2/CPU:

| Script | Wall time | Energy error | Chemical accuracy |
|---|---|---|---|
| `iter_nqs_dci_sqd.py` | 11.88 s | 0.0 mHa | YES |
| `iter_nqs_dci_krylov_classical.py` | 9.32 s | 0.0 mHa | YES |

## Related architectural concern (out of scope)

The deeper issue is that `extract_and_select_basis()` returns `float32` at a public API boundary, while every downstream runner that accepts `initial_basis` expects integer/bool. Two cleaner long-term fixes, either of which would eliminate this class of bug permanently:

1. **Library fix**: cast to `long` at the end of `extract_and_select_basis()`. This is semantically correct because the values are binary occupations, and low risk because 008 and similar pipelines only use the tensor for indexing.
2. **Lenient validator**: have `validate_initial_basis` accept float dtype when all values are exactly `{0,1}` and auto-cast. More forgiving but may hide future misuse.
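As a rough sketch of option 2, the function below accepts float input only when every value is exactly 0 or 1 and returns an `int64` copy. The name `coerce_initial_basis` and its error messages are hypothetical; the real `validate_initial_basis` signature is not shown in this issue, so treat this as one possible shape rather than a proposed patch.

```python
import torch


def coerce_initial_basis(initial_basis: torch.Tensor) -> torch.Tensor:
    """Hypothetical lenient validator: auto-cast exactly-binary float tensors."""
    dtype = initial_basis.dtype
    if dtype.is_complex:
        raise ValueError(f"initial_basis must not be complex, got {dtype}")
    if dtype.is_floating_point:
        # Exact comparison on purpose: 0.999 or NaN must not be cast silently.
        is_binary = bool(((initial_basis == 0) | (initial_basis == 1)).all())
        if not is_binary:
            raise ValueError(
                "float initial_basis contains values other than 0 and 1"
            )
        return initial_basis.long()
    # Integer and bool tensors are returned unchanged.
    return initial_basis


coerced = coerce_initial_basis(torch.tensor([[1.0, 0.0, 1.0, 0.0]]))
print(coerced.dtype)
```

The exact-equality test is what keeps this variant from hiding misuse entirely: a float tensor of probabilities or soft occupations still fails loudly instead of being truncated.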

This issue tracks the script-level quick fix. A follow-up issue could track the architectural cleanup.

## Discovery context

Surfaced during end-to-end smoke testing of PR #39 (`refactor/pipeline-catalog`, 3-digit catalog rename + 010-013 method-as-pipeline entries). Verified via stash-checkout-unstash test that the bug exists on `main` at the same point — not caused by that PR.
