Skip to content

Paul.masurel/search metrics#6416

Draft
fulmicoton wants to merge 12 commits intomainfrom
paul.masurel/search_metrics
Draft

Paul.masurel/search metrics#6416
fulmicoton wants to merge 12 commits intomainfrom
paul.masurel/search_metrics

Conversation

@fulmicoton
Copy link
Copy Markdown
Collaborator

Description

Describe the proposed changes made in this PR.

How was this PR tested?

Describe how you tested this PR.

@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/search_metrics branch 6 times, most recently from 09dced5 to 494c14e Compare May 11, 2026 13:43
@fulmicoton fulmicoton requested a review from Copilot May 11, 2026 13:46
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands Quickwit’s search observability by introducing richer resource statistics across split/leaf/root search execution, including per-split storage download volume, and by refining Lambda invocation metrics.

Changes:

  • Add a CountingStorage wrapper to measure per-request downloaded bytes and GET request counts.
  • Replace/extend search “resource stats” in the search protobuf and propagate aggregated stats to root search responses.
  • Track additional timing/counters in leaf search (permit wait time, wall time, lambda bottleneck) and refine Lambda metrics labeling.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
quickwit/quickwit-storage/src/lib.rs Exposes the new counting storage wrapper types from the storage crate.
quickwit/quickwit-storage/src/counting_storage.rs Adds CountingStorage + DownloadCounters to instrument storage reads.
quickwit/quickwit-serve/src/jaeger_api/rest_handler.rs Updates tests/struct construction for the new SearchResponse.resource_stats field.
quickwit/quickwit-search/src/service.rs Adds resource_stats to scroll responses (currently set to None).
quickwit/quickwit-search/src/scroll_context.rs Adapts scroll batching to the new (LeafSearchResponse, stats) return type.
quickwit/quickwit-search/src/root.rs Computes root-level resource stats and threads them into SearchResponse.
quickwit/quickwit-search/src/lib.rs Reworks resource-stats merging utilities for the new leaf/split stats model.
quickwit/quickwit-search/src/leaf.rs Collects per-split stats (downloads, timings), wall time, and lambda bottleneck signals.
quickwit/quickwit-search/src/leaf_cache.rs Overwrites cached responses’ stats to reflect cache-hit counters.
quickwit/quickwit-search/src/collector.rs Switches resource stat merging to the new leaf/split resource stats types.
quickwit/quickwit-search/src/cluster_client.rs Tracks leaf-search call counts including retries; merges updated stats on retry.
quickwit/quickwit-proto/src/codegen/quickwit/quickwit.search.rs Updates generated Rust types for the modified search protobuf messages.
quickwit/quickwit-proto/protos/quickwit/search.proto Introduces SplitResourceStats, LeafResourceStats, RootResourceStats; adds SearchResponse.resource_stats.
quickwit/quickwit-proto/build.rs Ensures search.proto changes retrigger codegen; adds clippy attribute for large enum variant.
quickwit/quickwit-lambda-client/src/metrics.rs Renames Lambda metrics label key from status to outcome.
quickwit/quickwit-lambda-client/src/invoker.rs Classifies Lambda invocation outcomes (`success
quickwit/quickwit-jaeger/src/lib.rs Updates tests/struct construction for the new SearchResponse.resource_stats field.
quickwit/quickwit-common/src/lib.rs Adds is_running_in_lambda() helper used by leaf search instrumentation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +57 to +61
/// Wrap a base `Storage` with this proxy at the entry point of a request to
/// observe the per-request download volume. Cached layers (split cache, footer
/// cache, hotcache, byte-range cache) live BELOW this wrapper, so reads served
/// from cache do NOT contribute to the counters — that is the desired behavior
/// for a "downloaded from object storage" measurement.
Comment on lines +535 to +538
// Wrap storage at request scope so we observe per-split download volume.
// Cache layers (split cache, footer cache, hotcache, byte-range cache) live
// BELOW this wrapper, so reads served from cache do not contribute to the
// counters — that is the desired "downloaded from object storage" semantics.
Comment on lines 1641 to 1650
let search_permit_futures = searcher_context
.search_permit_provider
.get_permits_with_offload(permit_sizes, offload_threshold)
.await;

let splits_to_run_on_lambda: Vec<(SplitIdAndFooterOffsets, SearchRequest)> =
splits.drain(search_permit_futures.len()..).collect();

let permit_requested_at = Instant::now();
let splits_to_run_locally: Vec<LocalSearchTask> = splits
Comment on lines +788 to +789
// Each entry is (leaf response, number of leaf attempts made to obtain it).
// Metadata-count responses are synthesised locally and contribute 0 attempts.
Comment thread quickwit/quickwit-search/src/root.rs Outdated
Comment on lines +818 to +820
let leaf_num_calls_without_retries: u64 =
leaf_search_call_including_retries.load(Ordering::Relaxed) as u64;

Comment on lines 44 to +56
leaf_search_requests_total: new_counter_vec(
"leaf_search_requests_total",
"Total number of Lambda leaf search invocations.",
"lambda",
&[],
["status"],
["outcome"],
),
leaf_search_duration_seconds: new_histogram_vec(
"leaf_search_duration_seconds",
"Duration of Lambda leaf search invocations in seconds.",
"lambda",
&[],
["status"],
["outcome"],
Comment on lines 109 to +113
self.search_request.search_after = Some(previous_last_hit);
let leaf_search_response: LeafSearchResponse = crate::root::search_partial_hits_phase(
searcher_context,
&self.indexes_metas_for_leaf_search,
&self.search_request,
&self.split_metadatas[..],
cluster_client,
)
.await?;
let (leaf_search_response, _root_resource_stats): (LeafSearchResponse, _) =
crate::root::search_partial_hits_phase(
searcher_context,
&self.indexes_metas_for_leaf_search,
aggregation_postcard: None,
failed_splits: scroll_context.failed_splits,
num_successful_splits: scroll_context.num_successful_splits,
resource_stats: None,
@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/search_metrics branch 2 times, most recently from 58de558 to ec50de2 Compare May 11, 2026 16:13
@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/search_metrics branch from ec50de2 to cf8145c Compare May 11, 2026 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants