research-direction: empirically test whether NQS feedback escapes the Mazzola coupon-collector flaw (thin niche) #53

@thc1006

TL;DR

Mazzola et al. (JCTC 2025) proved that QSCI/SQD has a fatal "coupon collector" sampling inefficiency: it repeatedly samples the same configs and fails to find rare-but-important ones. They tested flat quantum sampling. Our HI-NQS-SQD has iterative NQS feedback. Question: does iterative feedback shift the coupon-collector regime? This is empirically untested in the literature.

TL;DR-2

We already have data that can answer this. Needed: 1 small analysis paper + 2-3 supplementary jobs.

Background

Mazzola critique:

  • QSCI sample distribution = ground state amplitude squared
  • Coupon collector with skewed distribution → coverage cost grows superpolynomially
  • Sample wastage on already-seen configs
  • QSCI expansions less compact than classical SCI heuristics
  • Tested on N2 and [2Fe-2S]
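To make the coupon-collector point concrete, here is a minimal sketch (not taken from the Mazzola paper) of how skew inflates the number of samples needed for a fixed coverage. It uses the standard identity that after n i.i.d. draws the expected number of distinct outcomes is the sum over i of 1 - (1 - p_i)^n. The power-law distribution and the names `expected_unique`, `power_law_probs`, and `samples_to_cover` are illustrative assumptions, not a model of any real wavefunction.

```python
def expected_unique(probs, n_samples):
    """Expected number of distinct outcomes after n_samples i.i.d. draws."""
    return sum(1.0 - (1.0 - p) ** n_samples for p in probs)


def power_law_probs(n_items, alpha):
    """Toy skewed distribution: p_k proportional to (k + 1) ** -alpha.

    alpha = 0 gives the uniform case; larger alpha means a heavier skew.
    This is an assumed stand-in for |c|^2, chosen only for illustration.
    """
    weights = [(k + 1) ** (-alpha) for k in range(n_items)]
    total = sum(weights)
    return [w / total for w in weights]


def samples_to_cover(probs, target_fraction, max_doublings=60):
    """Smallest power-of-two sample count whose expected coverage
    reaches target_fraction of all items, or None if never reached."""
    target = target_fraction * len(probs)
    n = 1
    for _ in range(max_doublings):
        if expected_unique(probs, n) >= target:
            return n
        n *= 2
    return None


if __name__ == "__main__":
    for alpha in (0.0, 1.0, 2.0):
        probs = power_law_probs(200, alpha)
        print(alpha, samples_to_cover(probs, 0.9))
```

Comparing the printed sample counts across `alpha` values shows the qualitative effect: the more skewed the distribution, the more draws are spent on already-seen outcomes before the tail is covered.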

Hypothesis (untested in literature)

HI-NQS-SQD is iterative: at each iter the NQS is resampled from updated eigenvector marginals, so the underlying sampling distribution changes from one iter to the next.

Three possible outcomes:

  • (A) Escapes Mazzola: NQS focus → polynomial (or sub-Mazzola-power) scaling of unique determinants per iter
  • (B) Worse than Mazzola: NQS over-focuses → misses more rare configs than flat sampling does
  • (C) Equivalent: Iterative feedback adds nothing fundamental

Available data (already in our hands)

  • 52Q-AUGMENT: per-iter n_unique_new trajectory for NQS feedback case
  • 52Q-N5K-Random: per-iter n_unique_new for flat random sampling (baseline)
  • Per-iter coeff_histogram showing |c|² tail dynamics

These let us plot:

  • log(n_unique_per_iter) vs iter for NQS vs random
  • Coupon-collector signature comparison
  • Compactness comparison (n_configs to reach given accuracy)
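One possible way to turn the first plot into a single comparable number is a least-squares slope of log(n_unique_new) against the iteration index, computed separately for the NQS and random runs. The sketch below assumes the per-iter counts are available as a plain list of integers; the function name `log_linear_slope` and the input layout are hypothetical, and a straight-line fit is only one of several reasonable summaries.

```python
import math


def log_linear_slope(n_unique_new):
    """Least-squares slope of log(n_unique_new) versus iteration index.

    Iterations with zero new configs are skipped (log is undefined there),
    but the remaining points keep their original iteration index.
    """
    points = [(i, math.log(n)) for i, n in enumerate(n_unique_new) if n > 0]
    if len(points) < 2:
        raise ValueError("need at least two iterations with n_unique_new > 0")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


if __name__ == "__main__":
    # Made-up counts that halve every iteration, purely to exercise the fit.
    toy_counts = [1000, 500, 250, 125]
    print(log_linear_slope(toy_counts))
```

A more negative slope means the supply of new configs dries up faster. Whether that indicates useful focusing or premature saturation still has to be judged against the compactness comparison.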

What's missing

  • N2-CAS-12 / 15 / 17 jobs to fill scaling axis (small CAS faster, ~30 min each on H200)
  • Cr2 datapoint (multireference contrast)
  • Standardized analysis script
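As a starting point for the standardized analysis script, here is a minimal sketch of the core bookkeeping, assuming each iteration's raw samples can be loaded as a list of hashable config labels (for example bitstrings). That input format and the name `discovery_trajectory` are assumptions; if the job logs only store `n_unique_new` directly, this step is unnecessary.

```python
def discovery_trajectory(batches):
    """Per-iteration discovery statistics from raw sample batches.

    batches: iterable of iterables of hashable config labels, one per iter.
    Returns one dict per iteration with the number of samples drawn, the
    number of configs never seen in any earlier iteration, and the fraction
    of samples that were repeats (within the batch or of earlier iters).
    """
    seen = set()
    rows = []
    for it, batch in enumerate(batches):
        batch = list(batch)
        new = set(batch) - seen
        n_wasted = len(batch) - len(new)
        rows.append({
            "iter": it,
            "n_samples": len(batch),
            "n_unique_new": len(new),
            "wasted_fraction": n_wasted / len(batch) if batch else 0.0,
        })
        seen |= new
    return rows


if __name__ == "__main__":
    # Toy bitstrings, not real data.
    toy = [["0011", "0011", "0101"], ["0101", "1001"]]
    for row in discovery_trajectory(toy):
        print(row)
```

Running the same function on the NQS-feedback and flat-random runs would give both trajectories under identical definitions, which is the main point of standardizing.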

Output: short workshop paper

Title: "Iterative neural-network feedback in NQS-SQD pipelines: does it escape the QSCI sampling bottleneck?"

  • ~6 pages
  • Clear hypothesis + falsification framework
  • Empirical analysis of our existing + small new data
  • Either confirms Mazzola applies + provides empirical scaling exponents, or shows escape
  • Workshop venue (NeurIPS ML4Sci, AAAI QSciML, etc.)

What this is NOT

Not a new method paper. Not a chemical-accuracy claim. A specific case study within the Mazzola framework: a modest contribution, but an original empirical observation.

Risk

  • Result could be uninteresting (exact equivalence to Mazzola for our pipeline)
  • Other groups may publish similar analysis first — monitor arXiv
  • Reviewers may say "case study insufficient" — counter with "Mazzola critique demanded empirical follow-up"

Effort: 6-8 weeks

  • 1 week: standardize analysis on existing data
  • 2 weeks: run supplementary jobs (CAS-12/15/17/Cr2)
  • 2-3 weeks: writing + figures

Dependency
