feat: add generic search client traits and adapters #994

Open
bitner wants to merge 4 commits into main from searchstream

bitner (Collaborator) commented Mar 18, 2026

Description

Introduce a family of search client traits with blanket implementations and adapter utilities, replacing ad-hoc per-backend boilerplate with a consistent, extensible design.

New traits (stac::api)

| Trait | Purpose | Required method |
| --- | --- | --- |
| ItemsClient | Single-page item search | search |
| StreamItemsClient | Stream items across all pages | search_stream |
| CollectionsClient | Fetch all collections at once | collections |
| PagedCollectionsClient | Cursor-paginated collections (future-proofing) | collections_page |
| StreamCollectionsClient | Stream all collections | collections_stream |
| ArrowItemsClient (geoarrow) | Arrow record batch output | search_to_arrow |
| TransactionClient | Write items and collections | add_item, add_collection |
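
To illustrate the "one required method, defaults built on top" pattern these traits follow, here is a minimal sketch. It is synchronous and uses simplified stand-in types (`Search`, `Item`, `VecBackend`), so it is an assumption about the shape of the design, not the actual `stac::api` signatures.

```rust
// Simplified, synchronous stand-ins. The real stac types and trait
// signatures in this PR may differ (for example, they may be async).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Search {
    pub ids: Vec<String>,
    pub collections: Vec<String>,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Item {
    pub id: String,
    pub collection: String,
}

pub trait ItemsClient {
    type Error;

    /// Required: run one search and return a single page of items.
    fn search(&self, search: Search) -> Result<Vec<Item>, Self::Error>;

    /// Default: look up one item by issuing a narrow search.
    fn item(&self, collection: &str, id: &str) -> Result<Option<Item>, Self::Error> {
        let search = Search {
            ids: vec![id.to_string()],
            collections: vec![collection.to_string()],
        };
        Ok(self.search(search)?.into_iter().next())
    }
}

/// A toy in-memory backend (hypothetical) that implements only `search`.
pub struct VecBackend(pub Vec<Item>);

impl ItemsClient for VecBackend {
    type Error = std::convert::Infallible;

    fn search(&self, search: Search) -> Result<Vec<Item>, Self::Error> {
        Ok(self
            .0
            .iter()
            // An empty filter list matches everything.
            .filter(|item| search.ids.is_empty() || search.ids.contains(&item.id))
            .filter(|item| {
                search.collections.is_empty() || search.collections.contains(&item.collection)
            })
            .cloned()
            .collect())
    }
}
```

A backend written this way gets `item` for free, which is the boilerplate reduction the description refers to.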

Blanket implementations

  • CollectionsClient + Clone + Sync → StreamCollectionsClient — eagerly fetches all collections and yields as a stream; no wrapper struct needed
  • ArrowItemsClient + Sync → ItemsClient + StreamItemsClient (geoarrow feature) — collects record batches synchronously and returns owned items
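
The first blanket implementation can be sketched roughly as follows. This is a synchronous approximation that uses a boxed iterator in place of an async stream, and `Collection`, `FixedClient`, and the `String` error type are placeholders rather than the PR's real types.

```rust
// Placeholder type; the real stac Collection carries far more data.
#[derive(Clone, Debug, PartialEq)]
pub struct Collection {
    pub id: String,
}

pub trait CollectionsClient {
    /// Fetch every collection in one call.
    fn collections(&self) -> Result<Vec<Collection>, String>;
}

pub trait StreamCollectionsClient {
    /// Yield collections one at a time.
    fn collections_stream(&self) -> Box<dyn Iterator<Item = Result<Collection, String>>>;
}

// Blanket impl: any CollectionsClient gains the streaming trait without a
// wrapper struct. The bounds mirror those listed above; this synchronous
// sketch does not itself need Clone or Sync.
impl<T: CollectionsClient + Clone + Sync> StreamCollectionsClient for T {
    fn collections_stream(&self) -> Box<dyn Iterator<Item = Result<Collection, String>>> {
        // Eager fetch: everything is loaded before the first element is yielded.
        match self.collections() {
            Ok(all) => Box::new(all.into_iter().map(Ok::<Collection, String>)),
            Err(err) => Box::new(std::iter::once(Err(err))),
        }
    }
}

/// Hypothetical client used only to exercise the blanket impl.
#[derive(Clone)]
pub struct FixedClient(pub Vec<Collection>);

impl CollectionsClient for FixedClient {
    fn collections(&self) -> Result<Vec<Collection>, String> {
        Ok(self.0.clone())
    }
}
```

Because the fetch is eager, this form saves code rather than memory: the full collection list is held in memory before streaming begins.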

Adapter utilities

  • PagedItemsStream<T> — wraps any ItemsClient to provide StreamItemsClient via token/skip pagination (ItemCollection::next)
  • stream_pages_generic — free function driving the items pagination loop; used by PagedItemsStream and all server backends
  • stream_pages_collections_generic — collections equivalent for PagedCollectionsClient backends (ready for future paginated /collections support)
  • RecordBatchReaderAdapter<I> (geoarrow) — bridges Iterator<Item = Result<RecordBatch, E>> to arrow_array::RecordBatchReader
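
The pagination loop behind PagedItemsStream and stream_pages_generic can be approximated as below. This is a synchronous sketch under assumptions: `Page`, its `next` token field, and the closure-based `stream_pages` signature are illustrative and are not taken from the PR's code.

```rust
use std::collections::VecDeque;

/// One page of results plus an optional token for the next page
/// (hypothetical shape, loosely modelled on ItemCollection::next).
pub struct Page<T> {
    pub items: Vec<T>,
    pub next: Option<String>,
}

/// Lazily flattens pages into one iterator of items. `fetch` receives
/// `None` for the first page and the previous page's token afterwards.
pub fn stream_pages<T, E, F>(mut fetch: F) -> impl Iterator<Item = Result<T, E>>
where
    F: FnMut(Option<String>) -> Result<Page<T>, E>,
{
    let mut buffer: VecDeque<T> = VecDeque::new();
    let mut token: Option<String> = None;
    let mut started = false;
    let mut failed = false;

    std::iter::from_fn(move || loop {
        // Drain the current page before requesting another one.
        if let Some(item) = buffer.pop_front() {
            return Some(Ok(item));
        }
        // Stop after an error, or once a page arrived without a next token.
        if failed || (started && token.is_none()) {
            return None;
        }
        started = true;
        match fetch(token.take()) {
            Ok(page) => {
                token = page.next;
                buffer.extend(page.items);
            }
            Err(err) => {
                failed = true;
                return Some(Err(err));
            }
        }
    })
}
```

Pages are requested only when the buffer runs dry, so a consumer that stops early never triggers the remaining requests.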

Backend implementations

All three server backends (memory, duckdb, pgstac) implement the full trait family. stac-duckdb provides HrefClient (ArrowItemsClient) and SyncHrefClient (ItemsClient + CollectionsClient + StreamItemsClient via Mutex). stac-io's Client implements StreamItemsClient with HATEOAS link-following rather than token pagination.

Design notes

  • The ItemsClient + Clone → StreamItemsClient blanket cannot be added because it would overlap with the ArrowItemsClient blanket under Rust's coherence rules. Server backends use stream_pages_generic directly in their explicit StreamItemsClient impls.
  • PagedCollectionsClient has no blanket StreamCollectionsClient for the same reason (would overlap with the CollectionsClient blanket). Paginated backends call stream_pages_collections_generic in their own impl.
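
The coherence constraint can be reproduced in miniature. In this sketch `BatchClient`, `PageClient`, `StreamClient`, and `drain_pages` are hypothetical stand-ins for ArrowItemsClient, ItemsClient, StreamItemsClient, and stream_pages_generic; the real traits are more involved.

```rust
/// Stand-in for ArrowItemsClient: returns whole batches.
pub trait BatchClient {
    fn batches(&self) -> Vec<Vec<u32>>;
}

/// Stand-in for ItemsClient: returns one page at a time.
pub trait PageClient {
    fn page(&self, index: usize) -> Option<Vec<u32>>;
}

/// Stand-in for StreamItemsClient.
pub trait StreamClient {
    fn stream(&self) -> Vec<u32>;
}

// The one blanket impl that is kept.
impl<T: BatchClient> StreamClient for T {
    fn stream(&self) -> Vec<u32> {
        self.batches().into_iter().flatten().collect()
    }
}

// A second blanket impl is rejected with error E0119 (conflicting
// implementations), since a type could implement both BatchClient
// and PageClient:
//
// impl<T: PageClient> StreamClient for T { /* ... */ }

/// Shared helper that explicit impls call instead.
pub fn drain_pages<C: PageClient>(client: &C) -> Vec<u32> {
    (0..).map_while(|index| client.page(index)).flatten().collect()
}

pub struct PagedBackend(pub Vec<Vec<u32>>);

impl PageClient for PagedBackend {
    fn page(&self, index: usize) -> Option<Vec<u32>> {
        self.0.get(index).cloned()
    }
}

// Explicit impl: accepted because PagedBackend does not implement BatchClient.
impl StreamClient for PagedBackend {
    fn stream(&self) -> Vec<u32> {
        drain_pages(self)
    }
}

pub struct ArrowBackend(pub Vec<Vec<u32>>);

impl BatchClient for ArrowBackend {
    fn batches(&self) -> Vec<Vec<u32>> {
        self.0.clone()
    }
}
```

The cost of this arrangement is one short explicit impl per paginated backend, with the loop itself still shared through the free function.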

Related issues

  • Groundwork for federated search / streaming backends

Checklist

  • Unit tests
  • Documentation, including doctests
  • Pull request title follows conventional commits
  • Pre-commit hooks pass (prek run --all-files)

bitner requested a review from gadomski as a code owner March 18, 2026 19:33
Introduce a family of search client traits with blanket implementations
and adapter utilities to reduce boilerplate across backends.

New traits
----------
- ItemsClient: single-page item search (search required); defaults: item, items
- StreamItemsClient: streaming items across pages (search_stream required);
  defaults: collect_items, item_count, items_stream
- CollectionsClient: fetch all collections (collections required);
  default: collection point-lookup
- PagedCollectionsClient: cursor-paginated collections (collections_page required)
  for future backends that support paginated /collections
- StreamCollectionsClient: streaming collections (collections_stream required);
  default: collect_collections
- ArrowItemsClient (geoarrow feature): Arrow record batch output
  (search_to_arrow required); default: items_to_arrow
- TransactionClient: write operations

Blanket implementations
-----------------------
- CollectionsClient + Clone + Sync -> StreamCollectionsClient (eager fetch)
- ArrowItemsClient + Sync -> ItemsClient + StreamItemsClient (geoarrow feature)

Adapter utilities
-----------------
- PagedItemsStream<T>: wraps ItemsClient to provide StreamItemsClient via
  token/skip pagination using ItemCollection::next
- stream_pages_generic: free function driving the pagination loop
- stream_pages_collections_generic: collections equivalent for
  PagedCollectionsClient backends
- RecordBatchReaderAdapter<I> (geoarrow feature): bridges any
  Iterator<Item = Result<RecordBatch, E>> to arrow_array::RecordBatchReader

Documentation
-------------
Add docs/search-clients.md covering the trait family, blanket impls, adapter
conversion chart, pagination mechanics, and performance notes.
bitner and others added 3 commits March 19, 2026 10:57
- gate async stream traits/exports behind the core async feature

- rename pagination helpers to stream_pages and stream_pages_collections

- move search client guidance into api module docs and remove duplicate docs page

- unify arrow adapter errors under crate::Error and stream record batches incrementally

- replace mock-heavy adapter tests with real MemoryBackend streaming tests
gadomski self-requested a review March 23, 2026 01:08