API for CPU side output indices and distances of all_neighbors#1905

Open
jinsolp wants to merge 4 commits into rapidsai:main from jinsolp:all-neigh-cpu-output-api

jinsolp (Contributor) commented Mar 10, 2026

Closes #1902

This PR allows the indices/distances results of the all_neighbors function to reside in host memory. It focuses on exposing the API; internally, it just copies the results to the host.

Internally, it still allocates GPU memory for the full indices/distances:

  • For n_clusters=1, this is not a problem. Users who run into out-of-memory (OOM) issues should use n_clusters>1.
  • For n_clusters>1, we still use managed memory regardless of where the user-provided indices/distances reside. I will open a follow-up PR with optimizations (related issue: [FEA] Improve batched all-neighbors given CPU indices/distances #1903).
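
The "expose the API, copy internally" pattern described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the cuVS implementation: `staging_buffer`, `staging_bytes`, `all_neighbors_staged`, and `all_neighbors_host_out` are hypothetical names, the staging buffer is ordinary host memory standing in for a GPU allocation, the neighbor search is a CPU brute-force placeholder, and the `int64_t`/`float` output types are assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device allocation. In a real GPU library this
// would be device memory; here it is host memory so the sketch runs anywhere.
template <typename T>
using staging_buffer = std::vector<T>;

// Bytes the full indices + distances staging buffers occupy (assumed types:
// int64_t indices, float distances). This is the allocation that remains
// even when the caller passes host outputs.
inline std::size_t staging_bytes(std::size_t num_rows, std::size_t k)
{
  return num_rows * k * (sizeof(std::int64_t) + sizeof(float));
}

// Placeholder for the existing device-side build: brute-force k nearest
// neighbors (squared L2, self included) for every row of row-major `data`.
// Requires k <= num_rows.
inline void all_neighbors_staged(const std::vector<float>& data,
                                 std::size_t num_rows,
                                 std::size_t dim,
                                 std::size_t k,
                                 staging_buffer<std::int64_t>& idx,
                                 staging_buffer<float>& dist)
{
  std::vector<std::pair<float, std::int64_t>> cand(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) {
    for (std::size_t j = 0; j < num_rows; ++j) {
      float d = 0.0f;
      for (std::size_t c = 0; c < dim; ++c) {
        const float diff = data[i * dim + c] - data[j * dim + c];
        d += diff * diff;
      }
      cand[j] = {d, static_cast<std::int64_t>(j)};
    }
    // Keep the k smallest (distance, index) pairs in ascending order.
    std::partial_sort(cand.begin(),
                      cand.begin() + static_cast<std::ptrdiff_t>(k),
                      cand.end());
    for (std::size_t n = 0; n < k; ++n) {
      dist[i * k + n] = cand[n].first;
      idx[i * k + n]  = cand[n].second;
    }
  }
}

// Host-output entry point: allocate full-size staging buffers, run the
// existing path, then copy the results into the caller's host arrays.
inline void all_neighbors_host_out(const std::vector<float>& data,
                                   std::size_t num_rows,
                                   std::size_t dim,
                                   std::size_t k,
                                   std::int64_t* host_idx,
                                   float* host_dist)
{
  staging_buffer<std::int64_t> idx(num_rows * k);
  staging_buffer<float> dist(num_rows * k);
  all_neighbors_staged(data, num_rows, dim, k, idx, dist);
  std::copy(idx.begin(), idx.end(), host_idx);
  std::copy(dist.begin(), dist.end(), host_dist);
}
```

In this sketch the caller only ever sees host arrays, but the staging cost still grows as num_rows * k, which is why the host-output API by itself does not reduce the GPU memory footprint.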

jinsolp self-assigned this Mar 10, 2026
jinsolp requested review from a team as code owners March 10, 2026 20:48
jinsolp added the "feature request" (New feature or request) label Mar 10, 2026
jinsolp added the "non-breaking" (Introduces a non-breaking change) label Mar 10, 2026
aamijar (Member) left a comment

Thanks for the PR, Jinsol!
So currently, with the old code, the API only returns distances and indices on device, and the user would have to copy them to host manually if needed?
And with this change, the copy step is hidden from the user?

raft::make_device_vector_view<const T>(global_distances.data_handle(), num_rows * k));
}

raft::resource::sync_stream(handle);
Member

Do we need this sync because we have to wait for the GPU to finish some task?

Contributor Author

This is removed!

jinsolp (Contributor, Author) commented Mar 21, 2026

@aamijar Yes. Eventually, all-neighbors will be further optimized for host-side distances/indices.
I didn't want to make the PR too big, so this PR only contains the API exposure.

Development

Successfully merging this pull request may close these issues.

[FEA] API for CPU side indices/distances in all-neighbors