Skip to content

Switch internal hashmap layout to be more efficient#2709

Open
t-b wants to merge 7 commits into
mainfrom
bugfix/2709-fewer-memory-for-hashmap
Open

Switch internal hashmap layout to be more efficient#2709
t-b wants to merge 7 commits into
mainfrom
bugfix/2709-fewer-memory-for-hashmap

Conversation

@t-b
Copy link
Copy Markdown
Collaborator

@t-b t-b commented May 12, 2026

  • Add upgrade paths for cache keys and labnotebook keys
  • Look into coverage for MIES_Hashmap.ipf
  • Polish commits

Close #2695

t-b added 2 commits May 12, 2026 18:01
Created with

diff --git a/Packages/tests/Basic/UTF_Hashmap.ipf b/Packages/tests/Basic/UTF_Hashmap.ipf
index b4be5bc..fbe056292c 100644
--- a/Packages/tests/Basic/UTF_Hashmap.ipf
+++ b/Packages/tests/Basic/UTF_Hashmap.ipf
@@ -340,6 +340,9 @@ static Function DeleteEntryWorks()
     value = "mnop"
     HM_AddEntry(hashmap, key, str = value)

+    WAVE hashmap_text_version0 = DeepCopyWaveRefWave(hashmap)
+    Duplicate/O hashmap_text_version0, root:hashmap_text_version0
+
     WAVE/Z results = GetFilledEntries(hashmap)
     CHECK_WAVE(results, NUMERIC_WAVE)
     CHECK_EQUAL_WAVES(results, {6100}, mode = WAVE_DATA)
@@ -577,6 +580,9 @@ static Function WorksWithNumericValues([variable var])
     CHECK(found)
     CHECK_EQUAL_VAR(value, 30)

+    WAVE hashmap_numeric_version0 = DeepCopyWaveRefWave(hashmap)
+    Duplicate/O hashmap_numeric_version0, root:hashmap_numeric_version0
+
     ret = HM_DeleteEntry(hashmap, key)
     CHECK_EQUAL_VAR(ret, 0)

and moved into root:Hashmap.
@t-b t-b self-assigned this May 12, 2026
Copilot AI review requested due to automatic review settings May 12, 2026 16:17
@t-b t-b changed the title Switch internal hashmap layout Switch internal hashmap layout to be more efficient May 12, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates MIES’ internal hashmap representation to reduce the number of per-bucket waves (a key factor in slow experiment loading with very large hashmaps) and introduces wave-layout versioning plus an explicit upgrade function to migrate existing stored hashmaps.

Changes:

  • Switch hashmap storage to a single keys wave and a single values wave (2D, buckets as rows and entries as columns).
  • Add wave layout versioning and an HM_Upgrade() migration path from the old per-bucket-wave layout.
  • Update and extend hashmap unit tests to reflect the new internal layout and exercise upgrades.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 7 comments.

File Description
Packages/tests/Basic/UTF_Utils_Mies_Algorithm.ipf Adds additional INFO output in an algorithm test.
Packages/tests/Basic/UTF_Hashmap.ipf Updates hashmap tests for the new internal structure and adds upgrade tests.
Packages/MIES/MIES_WaveDataFolderGetters.ipf Exposes SetWaveVersion and adds placeholders for upgrading persisted cache/logbook hashmaps.
Packages/MIES/MIES_Hashmap.ipf Implements the new internal hashmap layout, adds HM_Upgrade, and adjusts core operations accordingly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Packages/MIES/MIES_Hashmap.ipf Outdated
Comment thread Packages/MIES/MIES_WaveDataFolderGetters.ipf
Comment thread Packages/tests/Basic/UTF_Hashmap.ipf Outdated
Comment thread Packages/MIES/MIES_Hashmap.ipf
Comment thread Packages/MIES/MIES_WaveDataFolderGetters.ipf Outdated
Comment thread Packages/tests/Basic/UTF_Hashmap.ipf
Comment thread Packages/tests/Basic/UTF_Hashmap.ipf
@t-b t-b force-pushed the bugfix/2709-fewer-memory-for-hashmap branch from 7afc66e to d9e65e2 Compare May 12, 2026 20:15
Copilot AI review requested due to automatic review settings May 12, 2026 20:28
@t-b t-b force-pushed the bugfix/2709-fewer-memory-for-hashmap branch from d9e65e2 to 0512d66 Compare May 12, 2026 20:28
@t-b t-b marked this pull request as ready for review May 12, 2026 20:28
@t-b t-b requested review from MichaelHuth and timjarsky as code owners May 12, 2026 20:28
@t-b t-b force-pushed the bugfix/2709-fewer-memory-for-hashmap branch from 0512d66 to 28e2231 Compare May 12, 2026 20:32
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

Packages/MIES/MIES_WaveDataFolderGetters.ipf:9037

  • After upgrading the stored logbook key hashmap, MoveWaveWithOverwrite returns a new wave reference for the destination path. The current code ignores that return value and then returns the pre-upgrade hashmap reference, which may be invalid because the destination wave was killed and recreated inside MoveWaveWithOverwrite. Assign the return value (or re-resolve the wave by name) and return the updated reference.
	if(ExistsWithCorrectLayoutVersion(hashmap, HM_HASHMAP_WAVE_VERSION))
		return hashmap
	elseif(WaveExists(hashmap))
		WAVE hashmap_new = HM_Upgrade(hashmap)

		MoveWaveWithOverwrite(hashmap, hashmap_new)

		return hashmap

Comment thread Packages/MIES/MIES_WaveDataFolderGetters.ipf Outdated
Comment thread Packages/MIES/MIES_Hashmap.ipf Outdated
@t-b t-b force-pushed the bugfix/2709-fewer-memory-for-hashmap branch from 28e2231 to d198b28 Compare May 12, 2026 20:36
@t-b t-b requested a review from Copilot May 12, 2026 20:37
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

Comment thread Packages/MIES/MIES_Hashmap.ipf
@t-b t-b force-pushed the bugfix/2709-fewer-memory-for-hashmap branch from d198b28 to b66f9c0 Compare May 12, 2026 20:57
@t-b t-b requested a review from Copilot May 12, 2026 20:58
@t-b t-b force-pushed the bugfix/2709-fewer-memory-for-hashmap branch from b66f9c0 to 25427bb Compare May 12, 2026 21:00
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated no new comments.

t-b added 3 commits May 12, 2026 23:31
Production data revealed that caches with ~130k entries result in pxp
files which take a large amount of time to save and load.

Digging revealed that this is due the sheer number of free waves we are
using for a hashmap. We have 2 free waves per bucket, so for 130k entries
we have 260k free waves.

In addition the overhead for small waves (typically with single digit
sizes) is quite large:

•make/n=2/t txt
•print waveinfo(txt, 0)
  NUMTYPE:0;DUNITS:;XUNITS:;MODIFIED:1;FULLSCALE:0,0,0;MODTIME:3860315657;PATH:;LOCK:0;HGETSTATE:0;SIZEINBYTES:664;
•make/n=0/t/O txt
•print waveinfo(txt, 0)
  NUMTYPE:0;DUNITS:;XUNITS:;MODIFIED:1;FULLSCALE:0,0,0;MODTIME:3860315683;PATH:;LOCK:0;HGETSTATE:0;SIZEINBYTES:648;
•make/n=(2, 150e3)/t/O txt
•print waveinfo(txt, 0)
  NUMTYPE:0;DUNITS:;XUNITS:;MODIFIED:1;FULLSCALE:0,0,0;MODTIME:3860315727;PATH:;LOCK:0;HGETSTATE:0;SIZEINBYTES:2400648;

We therefore change the internal hashmap layout to consist of one
keys wave and one values wave. Both are 2D and the rows are indexed by
bucket index and the columns by key/value pairs.

As there is no gain in using FindValue on a 2D wave when searching the
columns with constant row index for a small number of entries we now
always use the fast path in HM_GetKeyIndex.
@t-b t-b force-pushed the bugfix/2709-fewer-memory-for-hashmap branch from 25427bb to 8606fbc Compare May 12, 2026 21:32
@t-b t-b requested a review from Copilot May 12, 2026 21:47
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

Comment thread Packages/MIES/MIES_Hashmap.ipf
@t-b t-b assigned Garados007 and unassigned t-b May 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Opening files with large hashmaps is really slow

3 participants