Skip to content

Skip wasted cs-level table copy in render_shapes/render_labels#680

Merged
timtreis merged 2 commits into
mainfrom
fix/issue-415-skip-wasted-table-filter
May 21, 2026
Merged

Skip wasted cs-level table copy in render_shapes/render_labels#680
timtreis merged 2 commits into
mainfrom
fix/issue-415-skip-wasted-table-filter

Conversation

@timtreis
Copy link
Copy Markdown
Member

@timtreis timtreis commented May 21, 2026

Summary

Closes #415. _render_shapes and _render_labels were calling filter_by_coordinate_system(filter_tables=True), which copies the table's sparse .X via _filter_table_by_element_names. The filtered table was then immediately discarded — overwritten by join_spatialelement_table (shapes) or by per-element subset + match_table_to_element (labels via _set_color_source_vec).

For the multi-sample Visium use case in #415 (49 elements sharing one ~200k-row sparse table) this copy dominated the render call. The micro-benchmark in the issue investigation shows ~5.2× speedup on a synthetic 98k×3k sparse table; the gap scales with table size.

Switching both call sites to filter_tables=False is safe because:

  • sdata_filt[table_name] is never read between the filter call and the point where it is overwritten (verified by grep + walk of render.py).
  • sdata_filt.tables is never iterated by any code on the affected paths.
  • Auto-resolution helpers in utils.py that iterate sdata.tables are only triggered when table_name is None, which already used filter_tables=False.

Unintended side effect, caught & fixed

_filter_table_by_element_names mutates sdata's table obs in place via table.obs = pd.DataFrame(table.obs), which triggers AnnData's implicit int64→str index coercion. Removing the cs filter exposes AnnData's strict _normalize_index rejection of non-RangeIndex int64 obs indices in the EntityID-collision workaround path (test_render_shapes_color_with_conflicting_index_name).

The fix extends the existing workaround in _render_shapes (originally added for scverse/spatialdata#1099): when a name collision is detected, also swap the obs index to a fresh RangeIndex, saved and restored under try/finally.

`_render_shapes` and `_render_labels` were calling
`filter_by_coordinate_system(filter_tables=True)`, which copied the
table's sparse `.X` via `_filter_table_by_element_names`. The filtered
table was then immediately discarded:

- Shapes: overwritten by `join_spatialelement_table` (per-element).
- Labels: re-sliced by region, then `_set_color_source_vec` calls
  `match_table_to_element` (per-element).

So the cs-level copy was pure waste — multi-second for the consolidated
multi-sample Visium tables reported in the issue. Switch both call sites
to `filter_tables=False`.

Side effect that surfaced: `_filter_table_by_element_names` also
mutated `sdata`'s table obs in place via `table.obs = pd.DataFrame(...)`,
triggering AnnData's implicit int64→str index coercion. Dropping the
cs filter exposed AnnData's strict `_normalize_index` rejection of
non-RangeIndex int64 obs indices in the EntityID-collision workaround
path (test_render_shapes_color_with_conflicting_index_name). Extended
the existing workaround to also swap to a RangeIndex (saved/restored
under try/finally).
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 21, 2026

Codecov Report

❌ Patch coverage is 87.50000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 77.81%. Comparing base (8a6a33f) to head (bb4c2e2).

Files with missing lines Patch % Lines
src/spatialdata_plot/pl/render.py 87.50% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #680      +/-   ##
==========================================
+ Coverage   77.79%   77.81%   +0.01%     
==========================================
  Files          11       11              
  Lines        3693     3700       +7     
  Branches      877      879       +2     
==========================================
+ Hits         2873     2879       +6     
- Misses        490      491       +1     
  Partials      330      330              
Files with missing lines Coverage Δ
src/spatialdata_plot/pl/render.py 86.98% <87.50%> (-0.13%) ⬇️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Address simplify-review nits: cs-filter comments were over-explaining
what the code did; collapse to single-line WHY. Flatten the nested
collision/Range-index check into an elif cascade.
@timtreis timtreis merged commit 8b1baae into main May 21, 2026
7 of 8 checks passed
@timtreis timtreis deleted the fix/issue-415-skip-wasted-table-filter branch May 21, 2026 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Performance issue when plotting spatialdata object with many samples

2 participants