GEOPY-2781: Inversion stalls on tiling for large problems during redistribution of clusters #365

Open
domfournier wants to merge 10 commits into develop from GEOPY-2781

@domfournier domfournier commented Mar 24, 2026

GEOPY-2781 - Inversion stalls on tiling for large problems during redistribution of clusters

Copilot AI review requested due to automatic review settings March 24, 2026 22:25
@github-actions github-actions bot changed the title GEOPY-2781 GEOPY-2781: Inversion stalls on tiling for large problems during redistribution of clusters Mar 24, 2026

Copilot AI left a comment

Pull request overview

This PR changes how survey locations are partitioned into tiles. It appears to remove the previous “even population” rebalancing step (which relied on linear_sum_assignment) and adjusts the tests accordingly.

Changes:

  • Simplifies tile_locations() to always use raw KMeans cluster labels (removing the redistribution/balancing step).
  • Disables the tile-population balancing test by commenting it out and adding TODO notes.
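As a rough illustration of what returning raw cluster labels implies, the sketch below groups point indices by label and measures how uneven the resulting tile populations are. The helper names `labels_to_tiles` and `population_spread` and the sample labels are hypothetical; they are not taken from the PR, and the labels simply stand in for the output of a clustering step such as KMeans.

```python
import numpy as np


def labels_to_tiles(labels: np.ndarray) -> list[np.ndarray]:
    """Group point indices by cluster label, one index array per tile."""
    return [np.flatnonzero(labels == tid) for tid in np.unique(labels)]


def population_spread(tiles: list[np.ndarray]) -> int:
    """Difference between the largest and smallest tile populations."""
    sizes = [len(tile) for tile in tiles]
    return max(sizes) - min(sizes)


# Hypothetical labels standing in for the output of a clustering step.
labels = np.array([0, 0, 0, 0, 1, 2, 2, 1, 0, 2])
tiles = labels_to_tiles(labels)
spread = population_spread(tiles)
```

Without a rebalancing step nothing bounds `spread`, which is presumably what the disabled population-balancing test used to check.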

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/locations_test.py | Comments out the population-balancing test for tile_locations() and adds TODO notes about a future scalable balancing approach. |
| simpeg_drivers/utils/nested.py | Removes the Hungarian-assignment-based rebalancing logic; tile_locations() now returns KMeans labels directly. |
Comments suppressed due to low confidence (1)

simpeg_drivers/utils/nested.py:545

  • When sorting is provided, grid_locs is permuted before fitting KMeans, but the returned tile indices are positions in that permuted array. Downstream slicing (e.g., create_survey() filters survey.ordering[:, 2] against the provided indices) expects receiver IDs in the original indexing used by ordering (often the geoh5/receiver index, not the permuted position). Please map the clustered indices back through sorting before returning (or avoid permuting grid_locs and instead pass weights/ordering differently), so tiles reference the same index space as survey.ordering.
    cluster_id = kmeans.labels_

    tiles = []
    for tid in set(cluster_id):
        tiles += [np.where(cluster_id == tid)[0]]
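One way to read the suggestion above is sketched here, under the assumption that the locations are permuted as `grid_locs[sorting]` before clustering, so that permuted row `i` corresponds to original row `sorting[i]`. The function name `tiles_in_original_index` and the sample arrays are hypothetical and do not come from the PR.

```python
import numpy as np


def tiles_in_original_index(labels, sorting=None):
    """
    Build tiles whose indices refer to the original (unpermuted) rows.

    labels[i] is the cluster of the i-th row of the permuted array,
    assumed to have been built as locations[sorting].
    """
    tiles = []
    for tid in np.unique(labels):
        positions = np.flatnonzero(labels == tid)
        if sorting is None:
            tiles.append(positions)
        else:
            # Map positions in the permuted array back to original indices.
            tiles.append(np.sort(np.asarray(sorting)[positions]))
    return tiles


sorting = np.array([3, 1, 0, 2])  # permuted row i came from original row sorting[i]
labels = np.array([0, 0, 1, 1])   # cluster labels for the permuted rows
tiles = tiles_in_original_index(labels, sorting)
```

Here the first tile holds permuted positions 0 and 1, which map back to original rows 3 and 1. Whether that index space matches what `survey.ordering` expects would still need to be checked against the actual code.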



codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.83%. Comparing base (b1ad7f4) to head (61854bb).
⚠️ Report is 7 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #365      +/-   ##
===========================================
- Coverage    89.85%   89.83%   -0.02%     
===========================================
  Files          125      125              
  Lines         6437     6425      -12     
  Branches       794      793       -1     
===========================================
- Hits          5784     5772      -12     
  Misses         449      449              
  Partials       204      204              
| Files with missing lines | Coverage Δ |
| --- | --- |
| simpeg_drivers/utils/nested.py | 95.16% <100.00%> (-0.27%) ⬇️ |


3 participants