
Add Stochastic Lanczos Quadrature example for von Neumann entropy#1422

Open
akaiHuang wants to merge 1 commit into ml-explore:main from akaiHuang:add-quantum-relative-entropy

Conversation

@akaiHuang

Summary

Adds a self-contained example demonstrating Stochastic Lanczos Quadrature (Ubaru, Chen & Saad 2017) for estimating Tr[f(A)] on the Metal GPU. The application is the von Neumann entropy S(rho) = -Tr[rho ln rho] of an N x N density matrix — a workhorse quantity in quantum information, lattice field theory, and quantum machine-learning regularisers.

The standard eigh-based path costs O(N^3); SLQ replaces it with m independent k-step Lanczos recurrences, i.e. k * m matrix-vector products at O(N^2) each. All the O(N^2) work runs on the Metal GPU; the inner k x k tridiagonal eigh runs on NumPy (for k <= 30, GPU dispatch overhead outweighs any gain at that size).
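For readers who want the estimator spelled out, here is a plain-NumPy sketch of the technique; the PR's `slq.py` runs the O(N^2) matvecs through MLX instead, and the function name and defaults here are illustrative, not the PR's API:

```python
import numpy as np

def slq_entropy(rho, k=25, m=20, seed=0):
    """Estimate S(rho) = -Tr[rho ln rho] by stochastic Lanczos quadrature.

    Plain-NumPy sketch of the technique (illustrative, not the PR's code).
    """
    rng = np.random.default_rng(seed)
    n = rho.shape[0]
    total = 0.0
    for _ in range(m):
        # Normalised Rademacher probe: E[n * q^T f(rho) q] = Tr f(rho).
        q = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)
        q_prev, beta = np.zeros(n), 0.0
        alphas, betas = [], []
        for _ in range(k):  # k-step Lanczos recurrence -> tridiagonal T
            w = rho @ q - beta * q_prev
            alpha = q @ w
            w -= alpha * q
            beta = float(np.linalg.norm(w))
            alphas.append(alpha)
            betas.append(beta)
            if beta < 1e-12:  # invariant subspace reached; stop early
                break
            q_prev, q = q, w / beta
        T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
        # Gauss quadrature: eigenvalues of T are the nodes, squared first
        # eigenvector components are the weights (Ubaru, Chen & Saad 2017).
        theta, U = np.linalg.eigh(T)
        lam = np.clip(theta, 1e-12, None)
        f = np.where(theta > 1e-12, lam * np.log(lam), 0.0)  # f(x) = x ln x, 0 ln 0 := 0
        total += n * np.sum(U[0] ** 2 * f)
    return -total / m  # average over probes; S = -Tr[rho ln rho]
```

Accuracy can be checked against the exact eigendecomposition, `-sum(ev * log(ev))` over the positive eigenvalues of `rho`.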

Why this fits mlx-examples

  • Showcases an idiom not present anywhere else in the repo: trace estimators for matrix functions on MLX.
  • The same pattern (Lanczos + Gaussian quadrature on the small tridiagonal) is reusable for log det A, Tr[exp(A)], etc.
  • Self-contained: only depends on mlx and numpy (no extra ML framework, no pretrained weights).
  • ~150 lines of pedagogical Lanczos in slq.py; main.py adds an eigh reference for accuracy comparison and a small CLI.
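To illustrate the reusability claim above, a self-contained NumPy sketch where only `f` changes between applications (the function name and defaults are hypothetical, not the PR's API):

```python
import numpy as np

def slq_trace(A, f, k=20, m=30, seed=0):
    """Hypothetical generic form: estimate Tr[f(A)] for symmetric A.
    Only f changes between applications: f=np.log gives log det A,
    f=np.exp gives Tr[exp(A)], f(x)=x ln x gives minus the entropy."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    est = 0.0
    for _ in range(m):
        q = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)  # normalised Rademacher probe
        q_prev, beta, alphas, betas = np.zeros(n), 0.0, [], []
        for _ in range(k):  # Lanczos recurrence, identical for every f
            w = A @ q - beta * q_prev
            alpha = q @ w
            w -= alpha * q
            beta = float(np.linalg.norm(w))
            alphas.append(alpha)
            betas.append(beta)
            if beta < 1e-12:
                break
            q_prev, q = q, w / beta
        T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
        theta, U = np.linalg.eigh(T)  # quadrature nodes and weights
        est += n * np.sum(U[0] ** 2 * f(theta))
    return est / m

# Example: log det of a well-conditioned SPD matrix, no dense factorisation:
#   A = np.eye(500) + B @ B.T / 500
#   slq_trace(A, np.log)   # compare with np.linalg.slogdet(A)[1]
```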

Numbers (M1 Max, k=25, m=20)

| N    | S_exact | S_slq | rel_err | t_exact (s) | t_slq (s) |
|------|---------|-------|---------|-------------|-----------|
| 100  | 4.10    | 4.07  | 0.7%    | 0.00        | 0.28      |
| 500  | 5.71    | 5.74  | 0.4%    | 0.01        | 0.28      |
| 1000 | 6.41    | 6.43  | 0.3%    | 0.10        | 0.29      |
| 2000 | 7.10    | 7.02  | 1.2%    | 0.88        | 0.31      |
| 4000 | 7.79    | 7.76  | 0.4%    | 8.60        | 0.38      |

t_exact is numpy.linalg.eigvalsh in float64; t_slq is the MLX path on the Metal GPU. At N = 4000 the example is ~22x faster than the CPU eigh reference. This is the pedagogical version: it loops over probes sequentially and uses no mx.compile fusion. A production version with batched probes lives in mlx-qre on PyPI and pushes the speedup roughly another order of magnitude beyond the version shown here.
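The batching idea mentioned above can be sketched in NumPy (a sketch of the idea only, NOT mlx-qre's implementation; names are hypothetical): stacking all m probes as columns turns each Lanczos step into one large matmul, which is what keeps the GPU busy.

```python
import numpy as np

def slq_entropy_batched(rho, k=25, m=20, seed=0):
    """Batched-probe sketch (illustrative only, not mlx-qre's code):
    all m Lanczos recurrences advance together, so each step is a single
    (N, N) @ (N, m) matmul instead of m separate matvecs."""
    rng = np.random.default_rng(seed)
    n = rho.shape[0]
    Q = rng.choice([-1.0, 1.0], size=(n, m)) / np.sqrt(n)  # m normalised probes at once
    Q_prev = np.zeros((n, m))
    beta = np.zeros(m)
    alphas = np.zeros((k, m))
    betas = np.zeros((k, m))
    for j in range(k):
        W = rho @ Q - beta * Q_prev           # one matmul drives all m recurrences
        alpha = np.einsum("ij,ij->j", Q, W)   # per-probe diagonal entries of T
        W -= alpha * Q
        beta = np.linalg.norm(W, axis=0)
        alphas[j], betas[j] = alpha, beta
        Q_prev, Q = Q, W / np.maximum(beta, 1e-30)  # guard against breakdown
    est = 0.0
    for p in range(m):  # the k x k eigenproblems are tiny; CPU is fine
        T = np.diag(alphas[:, p]) + np.diag(betas[:-1, p], 1) + np.diag(betas[:-1, p], -1)
        theta, U = np.linalg.eigh(T)
        lam = np.clip(theta, 1e-12, None)
        f = np.where(theta > 1e-12, lam * np.log(lam), 0.0)  # f(x) = x ln x
        est += n * np.sum(U[0] ** 2 * f)
    return -est / m
```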

Test plan

  • `python main.py --sizes 100 500 1000` — completes in seconds, all rel_err < 1.5%
  • `python main.py --sizes 4000` — completes in <10s and shows clear win over `eigh`
  • `slq.py` requires only `mlx >= 0.30`; it uses no `mx.fast.metal_kernel` or other recent API surface
  • No external assets, no network access, no pretrained weights

References

  • S. Ubaru, J. Chen & Y. Saad, Fast estimation of tr(f(A)) via stochastic Lanczos quadrature, SIAM J. Matrix Anal. Appl. 38(4), 1075-1099 (2017).

This example demonstrates SLQ (Ubaru, Chen & Saad 2017) for estimating
Tr[f(A)] with f(x) = x ln x on the Metal GPU. The application is the
von Neumann entropy S(rho) = -Tr[rho ln rho] of an N x N density matrix.

The standard eigh-based path costs O(N^3); SLQ replaces it with m
independent k-step Lanczos recurrences for O(k * m * N^2) matvecs.
At N = 4000 the example achieves ~22x speedup over NumPy eigh with
0.4% relative error vs the float64 reference.

Files:
  - slq.py           pedagogical Lanczos + Gaussian-quadrature core
  - main.py          benchmark harness (eigh vs SLQ across N)
  - README.md        math, expected output, when to reach for SLQ
  - requirements.txt mlx + numpy
