Reduce cache footprint by decoupling degeneracy-dependent data by lkdvos · Pull Request #387 · QuantumKitHub/TensorKit.jl

lkdvos · 2026-03-23T19:15:25Z

This PR refactors the internal representation of the fusion tree structure (block layout and sub-block indexing) for TensorMap.
The two main changes are:

Replace the parallel fusiontreelist + fusiontreestructure arrays with a single Dictionaries.Dictionary, which supports both efficient sequential access through the token system and keyed access through hashing.
Share the (separately cached) Indices of that dictionary across spaces that have the same sectors and differ only in their degeneracies.

Motivation and context

Previously, FusionBlockStructure stored block layout information using three parallel data structures:

fusiontreelist: a Vector of (f₁, f₂) fusion tree pairs (the canonical order)
fusiontreeindices: a Dict{(f₁, f₂), Int} for O(1) lookup by key
fusiontreestructure: a Vector{StridedStructure} indexed positionally

This split was necessary to support both sequential access (iteration in canonical order) and keyed access (looking up a sub-block by fusion tree pair).
The drawback is redundancy: the tree pairs are stored twice.
Additionally, many operations need multiple indirections (fusiontreeindices → index → fusiontreestructure) that are handled manually throughout the package.
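To make the redundancy concrete, here is a minimal Base-only sketch of the previous layout. The `TreePair` and `Layout` aliases and all values are hypothetical stand-ins (symbols instead of real fusion trees, a plain tuple instead of the strided structure); only the three field names come from the description above.

```julia
# Stand-in key type: a pair of labels instead of real fusion trees (f₁, f₂)
const TreePair = Tuple{Symbol,Symbol}
# Stand-in for the strided structure: (size, strides, offset)
const Layout = Tuple{Tuple{Int,Int},Tuple{Int,Int},Int}

# Canonical order of the tree pairs
fusiontreelist = TreePair[(:a, :a), (:a, :b), (:b, :b)]
# Reverse lookup: stores every tree pair a second time, as a Dict key
fusiontreeindices = Dict(f => i for (i, f) in enumerate(fusiontreelist))
# Layout data, indexed by position in fusiontreelist
fusiontreestructure = Layout[((2, 2), (1, 2), 0), ((2, 3), (1, 2), 4), ((3, 3), (1, 3), 10)]

# Keyed access takes two hops: key -> position -> layout
i = fusiontreeindices[(:a, :b)]
layout = fusiontreestructure[i]

# Sequential access walks the two vectors in lockstep
ordered = collect(zip(fusiontreelist, fusiontreestructure))
```

Every caller that wants keyed access has to repeat the two-hop lookup, and the three containers have to be kept in sync by hand.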

This is precisely the problem Dictionaries.jl solves, since this layout maps almost exactly onto the internal structure of its Dictionary type.
The gettoken function maps a key to a token that locates the entry, and gettokenvalue then uses that token to index directly into the vector of values.
This effectively replaces all three structures with a single Dictionary{typeof((f₁, f₂)), StridedStructure}.
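As a rough illustration of the token interface, the sketch below uses Dictionaries.jl with toy data. The string keys and named-tuple values are hypothetical stand-ins for fusion tree pairs and StridedStructure; this is not the code from the PR.

```julia
using Dictionaries  # third-party package: Dictionaries.jl

# Stand-ins: string labels for fusion tree pairs, named tuples for sub-block layouts
trees = ["(a,a)", "(a,b)", "(b,b)"]
layouts = [(size = (2, 2), offset = 0), (size = (2, 3), offset = 4), (size = (3, 3), offset = 10)]

# One container holds the ordered keys and their values
structure = Dictionary(trees, layouts)

# Keyed access: look the key up once to get a token, then index with the token
hastoken, token = gettoken(structure, "(a,b)")
sub = gettokenvalue(structure, token)

# Sequential access: iteration follows insertion order
ordered_keys = collect(keys(structure))
```

Note that gettoken returns a `(hastoken, token)` pair, so a missing key can be detected without throwing.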

Additionally, fusiontrees(t) / fusiontrees(W), which enumerate the valid fusion tree pairs, benefit from caching, and the appropriate cache key is the sector structure of the space (not its degeneracy dimensions, which affect sub-block sizes but not the set of valid trees).
The Indices type from Dictionaries.jl serves as the keyed, ordered set of fusion tree pairs (the combination of fusiontreelist and fusiontreeindices from before) and can be shared across HomSpaces with identical sector structure.

Design Decisions

fusiontreelist is cached by sector structure:

fusiontreelist(W) uses a custom Hashed wrapper to hash/compare HomSpaces only by their sector structure (ignoring degeneracy dimensions).
This works because the set of valid fusion tree pairs (f₁, f₂) depends only on which sectors appear in each index space and on their dualities, not on how many states each sector has.
By caching the fusiontreelist at this coarser level, HomSpaces that share the same sectors but differ in multiplicities can share the same Indices.
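The idea of hashing by sector structure only can be sketched in plain Julia as follows. `ToySpace`, `SectorKey`, `TREE_CACHE`, `BUILD_COUNT` and `cached_trees` are all hypothetical names invented for this illustration; the actual Hashed wrapper and cache in the PR may look quite different.

```julia
# Hypothetical stand-in for a space: sector label => degeneracy dimension
struct ToySpace
    dims::Vector{Pair{Symbol,Int}}
end

# Wrapper that hashes and compares by sector labels only, ignoring degeneracies
struct SectorKey
    space::ToySpace
end
sectors(k::SectorKey) = map(first, k.space.dims)
Base.hash(k::SectorKey, h::UInt) = hash(sectors(k), hash(:SectorKey, h))
Base.:(==)(a::SectorKey, b::SectorKey) = sectors(a) == sectors(b)

const TREE_CACHE = Dict{SectorKey,Vector{Symbol}}()
const BUILD_COUNT = Ref(0)  # counts how often the expensive step runs

function cached_trees(V::ToySpace)
    return get!(TREE_CACHE, SectorKey(V)) do
        BUILD_COUNT[] += 1
        map(first, V.dims)  # stand-in for enumerating fusion tree pairs
    end
end

V1 = ToySpace([:a => 2, :b => 3])
V2 = ToySpace([:a => 5, :b => 1])  # same sectors, different degeneracies
t1 = cached_trees(V1)
t2 = cached_trees(V2)
```

Both spaces map to the same cache entry, so the enumeration runs once and `t1` and `t2` are the same object.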

File reorganization: tensorstructure.jl

As the HomSpace file was getting somewhat large, I also refactored and split off the functions that construct tensor structure into their own file, included just before abstracttensor.jl.
This code is really about tensor data layout, not about the abstract space itself.

Questions

Should we just switch out all dictionaries for Dictionaries.jl-based options, to avoid the mental load of having both and the maintenance of supporting different dictionaries defined within TensorKit?
Are there other abstraction points that should get this kind of treatment?

lkdvos added 7 commits March 23, 2026 09:17

add utility to customize hashing/equality

48427d0

split out fusionblockstructure and fusiontreelist

d038f96

use new split throughout the code

42d6a0c

reorganize tensor structure computations

180ae2b

update docstrings

406db6e

switch to Dictionaries.Indices

f4b2a56

switch even more to Dictionaries.Dictionary

80506df

lkdvos linked an issue Mar 23, 2026 that may be closed by this pull request

fusionblockstructure should reuse data that is degeneracy-independent #384

Open

lkdvos added 2 commits March 23, 2026 17:02

remove fusiontreelist

dfbcb87

avoid storing fusiontrees in fusionblockstructure

0fac8ac

lkdvos wants to merge 9 commits into main from ld-caching
