
Add Stochastic Lanczos Quadrature example for von Neumann entropy#1422

Open
akaiHuang wants to merge 1 commit into ml-explore:main from akaiHuang:add-quantum-relative-entropy

Conversation

@akaiHuang

Summary

Adds a self-contained example demonstrating Stochastic Lanczos Quadrature (Ubaru, Chen & Saad 2017) for estimating Tr[f(A)] on the Metal GPU. The application is the von Neumann entropy S(rho) = -Tr[rho ln rho] of an N x N density matrix — a workhorse quantity in quantum information, lattice field theory, and quantum machine-learning regularisers.

The standard eigh-based path costs O(N^3); SLQ replaces it with m independent k-step Lanczos recurrences, i.e. k * m matrix-vector products at O(N^2) each. All the O(N^2) work runs on the Metal GPU; the inner k x k tridiagonal eigh runs on NumPy (for k <= 30, GPU dispatch overhead outweighs any gain at that size).
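For readers who want the estimator spelled out, here is a plain-NumPy sketch of the technique; the PR's `slq.py` runs the O(N^2) matvecs through MLX instead, and the function name and defaults here are illustrative, not the PR's API:

```python
import numpy as np

def slq_entropy(rho, k=25, m=20, seed=0):
    """Estimate S(rho) = -Tr[rho ln rho] by stochastic Lanczos quadrature.

    Plain-NumPy sketch of the technique (illustrative, not the PR's code).
    """
    rng = np.random.default_rng(seed)
    n = rho.shape[0]
    total = 0.0
    for _ in range(m):
        # Normalised Rademacher probe: E[n * q^T f(rho) q] = Tr f(rho).
        q = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)
        q_prev, beta = np.zeros(n), 0.0
        alphas, betas = [], []
        for _ in range(k):  # k-step Lanczos recurrence -> tridiagonal T
            w = rho @ q - beta * q_prev
            alpha = q @ w
            w -= alpha * q
            beta = float(np.linalg.norm(w))
            alphas.append(alpha)
            betas.append(beta)
            if beta < 1e-12:  # invariant subspace reached; stop early
                break
            q_prev, q = q, w / beta
        T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
        # Gauss quadrature: eigenvalues of T are the nodes, squared first
        # eigenvector components are the weights (Ubaru, Chen & Saad 2017).
        theta, U = np.linalg.eigh(T)
        lam = np.clip(theta, 1e-12, None)
        f = np.where(theta > 1e-12, lam * np.log(lam), 0.0)  # f(x) = x ln x, 0 ln 0 := 0
        total += n * np.sum(U[0] ** 2 * f)
    return -total / m  # average over probes; S = -Tr[rho ln rho]
```

Accuracy can be checked against the exact eigendecomposition, `-sum(ev * log(ev))` over the positive eigenvalues of `rho`.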

Why this fits mlx-examples

  • Showcases an idiom not present anywhere else in the repo: trace estimators for matrix functions on MLX.
  • The same pattern (Lanczos + Gaussian quadrature on the small tridiagonal) is reusable for log det A, Tr[exp(A)], etc.
  • Self-contained: only depends on mlx and numpy (no extra ML framework, no pretrained weights).
  • ~150 lines of pedagogical Lanczos in slq.py; main.py adds an eigh reference for accuracy comparison and a small CLI.
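To illustrate the reusability claim above, a self-contained NumPy sketch where only `f` changes between applications (the function name and defaults are hypothetical, not the PR's API):

```python
import numpy as np

def slq_trace(A, f, k=20, m=30, seed=0):
    """Hypothetical generic form: estimate Tr[f(A)] for symmetric A.
    Only f changes between applications: f=np.log gives log det A,
    f=np.exp gives Tr[exp(A)], f(x)=x ln x gives minus the entropy."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    est = 0.0
    for _ in range(m):
        q = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)  # normalised Rademacher probe
        q_prev, beta, alphas, betas = np.zeros(n), 0.0, [], []
        for _ in range(k):  # Lanczos recurrence, identical for every f
            w = A @ q - beta * q_prev
            alpha = q @ w
            w -= alpha * q
            beta = float(np.linalg.norm(w))
            alphas.append(alpha)
            betas.append(beta)
            if beta < 1e-12:
                break
            q_prev, q = q, w / beta
        T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
        theta, U = np.linalg.eigh(T)  # quadrature nodes and weights
        est += n * np.sum(U[0] ** 2 * f(theta))
    return est / m

# Example: log det of a well-conditioned SPD matrix, no dense factorisation:
#   A = np.eye(500) + B @ B.T / 500
#   slq_trace(A, np.log)   # compare with np.linalg.slogdet(A)[1]
```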

Numbers (M1 Max, k=25, m=20)

| N    | S_exact | S_slq | rel_err | t_exact (s) | t_slq (s) |
|------|---------|-------|---------|-------------|-----------|
| 100  | 4.10    | 4.07  | 0.7%    | 0.00        | 0.28      |
| 500  | 5.71    | 5.74  | 0.4%    | 0.01        | 0.28      |
| 1000 | 6.41    | 6.43  | 0.3%    | 0.10        | 0.29      |
| 2000 | 7.10    | 7.02  | 1.2%    | 0.88        | 0.31      |
| 4000 | 7.79    | 7.76  | 0.4%    | 8.60        | 0.38      |

t_exact is numpy.linalg.eigvalsh in float64; t_slq is the MLX path on the Metal GPU. At N = 4000 the example is ~22x faster than the CPU eigh reference. This is the pedagogical version: it loops over probes sequentially and uses no mx.compile fusion. A production version with batched probes lives in mlx-qre on PyPI and pushes the speedup roughly another order of magnitude beyond the version shown here.
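The batching idea mentioned above can be sketched in NumPy (a sketch of the idea only, NOT mlx-qre's implementation; names are hypothetical): stacking all m probes as columns turns each Lanczos step into one large matmul, which is what keeps the GPU busy.

```python
import numpy as np

def slq_entropy_batched(rho, k=25, m=20, seed=0):
    """Batched-probe sketch (illustrative only, not mlx-qre's code):
    all m Lanczos recurrences advance together, so each step is a single
    (N, N) @ (N, m) matmul instead of m separate matvecs."""
    rng = np.random.default_rng(seed)
    n = rho.shape[0]
    Q = rng.choice([-1.0, 1.0], size=(n, m)) / np.sqrt(n)  # m normalised probes at once
    Q_prev = np.zeros((n, m))
    beta = np.zeros(m)
    alphas = np.zeros((k, m))
    betas = np.zeros((k, m))
    for j in range(k):
        W = rho @ Q - beta * Q_prev           # one matmul drives all m recurrences
        alpha = np.einsum("ij,ij->j", Q, W)   # per-probe diagonal entries of T
        W -= alpha * Q
        beta = np.linalg.norm(W, axis=0)
        alphas[j], betas[j] = alpha, beta
        Q_prev, Q = Q, W / np.maximum(beta, 1e-30)  # guard against breakdown
    est = 0.0
    for p in range(m):  # the k x k eigenproblems are tiny; CPU is fine
        T = np.diag(alphas[:, p]) + np.diag(betas[:-1, p], 1) + np.diag(betas[:-1, p], -1)
        theta, U = np.linalg.eigh(T)
        lam = np.clip(theta, 1e-12, None)
        f = np.where(theta > 1e-12, lam * np.log(lam), 0.0)  # f(x) = x ln x
        est += n * np.sum(U[0] ** 2 * f)
    return -est / m
```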

Test plan

  • `python main.py --sizes 100 500 1000` — completes in seconds, all rel_err < 1.5%
  • `python main.py --sizes 4000` — completes in <10s and shows clear win over `eigh`
  • `slq.py` requires only `mlx >= 0.30`; it uses no `mx.fast.metal_kernel` or other recent API surface
  • No external assets, no network access, no pretrained weights

References

  • S. Ubaru, J. Chen & Y. Saad, Fast estimation of tr(f(A)) via stochastic Lanczos quadrature, SIAM J. Matrix Anal. Appl. 38(4), 1075-1099 (2017).

This example demonstrates SLQ (Ubaru, Chen & Saad 2017) for estimating
Tr[f(A)] with f(x) = x ln x on the Metal GPU. The application is the
von Neumann entropy S(rho) = -Tr[rho ln rho] of an N x N density matrix.

The standard eigh-based path costs O(N^3); SLQ replaces it with m
independent k-step Lanczos recurrences for O(k * m * N^2) matvecs.
At N = 4000 the example achieves ~22x speedup over NumPy eigh with
0.4% relative error vs the float64 reference.

Files:
  - slq.py           pedagogical Lanczos + Gaussian-quadrature core
  - main.py          benchmark harness (eigh vs SLQ across N)
  - README.md        math, expected output, when to reach for SLQ
  - requirements.txt mlx + numpy
