Skip to content

Upgrade to upstream quickwit main; point tantivy dep at upgrade/tantivy-0.26#8

Merged
schenksj merged 192 commits into
mainfrom
upgrade/quickwit-upstream-main
Mar 30, 2026
Merged

Upgrade to upstream quickwit main; point tantivy dep at upgrade/tantivy-0.26#8
schenksj merged 192 commits into
mainfrom
upgrade/quickwit-upstream-main

Conversation

@schenksj
Copy link
Copy Markdown
Collaborator

Summary

  • Merges upstream/main (Apache-2.0) into our fork — catches up ~37 custom commits worth of upstream changes
  • Updates tantivy dependency from quickwit-oss/tantivy @ 545169c0d8 to indextables/tantivy @ ffe408f49 (our upgraded tantivy fork branch)
  • Tags tantivy4java-v0.33.2 applied at old baseline before this upgrade

Merge Conflict Resolutions

  • runtimes.rs: kept upstream's active disable_lifo_slot = true default
  • field_presence.rs: preserved our all-fast-fields optimization; fixed schemacontext.schema
  • lib.rs: kept both pub mod leaf_cache and added upstream's mod invoker
  • leaf.rs: adopted upstream's Arc<LeafSearchContext> param but reverted to flat params externally (LeafSearchContext is private)
  • search_permit_provider.rs: preserved our Sync mode; added actor_join_handle to Async variant; updated to upstream's RequestWithOffload message

License

  • Upstream reverted from AGPL back to Apache-2.0 in Jan 2025 (commit 43fc2e414)
  • Our fork is Apache-2.0 throughout — safe for all use

Test plan

  • cargo check -p quickwit-search passes (verified locally)
  • Full workspace cargo check passes (verified locally)
  • tantivy4java builds against this commit

🤖 Generated with Claude Code

fulmicoton and others added 30 commits September 3, 2025 15:20
…quickwit-oss#5875)

This changes the result for 3vcpu and 2vcpus.
In that case, dedicated one vcpus to non-blocking tasks was
judged overkill.

Co-authored-by: fulmicoton <paul.masurel@datadoghq.com>
* Remove `quickwit-lambda` package

* Fix warning
Bumps the github-actions group with 2 updates: [actions/github-script](https://github.com/actions/github-script) and [actions/setup-node](https://github.com/actions/setup-node).


Updates `actions/github-script` from 7 to 8
- [Release notes](https://github.com/actions/github-script/releases)
- [Commits](actions/github-script@v7...v8)

Updates `actions/setup-node` from 4 to 5
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](actions/setup-node@v4...v5)

---
updated-dependencies:
- dependency-name: actions/github-script
  dependency-version: '8'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/setup-node
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…oss#5890)

The following methods are used in order:
- from the `QW_NUM_CPUS` environment variable
- from the `KUBERNETES_LIMITS_CPU` environment variable
- from the operating system
-  default to 2.

Co-authored-by: fulmicoton <paul.masurel@datadoghq.com>
…(and the logging associated) (quickwit-oss#5897)

Co-authored-by: fulmicoton <paul.masurel@datadoghq.com>
* update tantivy

* include columnar compression, fix build

* fix build

---------

Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
In particular, there is no need to parse TLS config
upon the reception of every connections.

Co-authored-by: fulmicoton <paul.masurel@datadoghq.com>
Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
* Fix leaf list fields merging logic

Performs the removal of the `_dynamic.` prefix on field names before
sorting.

* Minor changes following CR

---------

Co-authored-by: fulmicoton <paul.masurel@datadoghq.com>
* handle fast field search

* add term set query tests
* add migration for in-metastore kv store

* store cluster identity

* test get-identity

---------

Co-authored-by: trinity Pointard <trinity.pointard@datadoghq.com>
Blocks were added to BlockList in completion order rather than sequential
order, causing data corruption. Zero pad block IDs and sort before committing
to ensure correct blob reconstruction.

Also improve the existing test that was only checking file, the test
would fail on main if integrity check was turned on.

This PR passes the unit test, also has been runnin in for last two days in
production without data corruption, prior to this PR - split was getting
corrupted every 45 minutes.
osyniakov and others added 25 commits March 5, 2026 11:59
* feat(obs): export obs as otel

* fix: typo

* fix: dd license tool
Navigate directly to `/ui` instead of relying on the root redirect,
making the tests resilient to changes in the root redirect target.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…uickwit-oss#6102)

* feat(es-compat): add index_filter support for field capabilities API

Implements index_filter parameter support for the ES-compatible
_field_caps endpoint, allowing users to filter field capabilities
based on document queries.

Changes:
- Add query_ast field to ListFieldsRequest and LeafListFieldsRequest protos
- Parse index_filter from ES Query DSL and convert to QueryAst
- Pass query_ast through to leaf nodes for future filtering support
- Add unit tests for index_filter parsing
- Add REST API integration tests

Note: This implementation accepts and parses the index_filter parameter
for API compatibility. Full split-level document filtering will be
added as a follow-up enhancement.

Closes quickwit-oss#5693

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: ruo <ruoliu.dev@gmail.com>

* feat(es-compat): implement split-level filtering for field_caps index_filter

Address PR review comments for index_filter support in _field_caps API:

- Extract `parse_index_filter_to_query_ast()` function with clean prototype
- Implement split-level filtering via `split_matches_query()` using
  lightweight `query.count()` execution (no document materialization)
- Add proper async handling with ByteRangeCache, warmup(), and
  run_cpu_intensive() for Quickwit's async-only storage
- Add metastore-level pruning:
  - Tag extraction via `extract_tags_from_query()`
  - Time range extraction via `refine_start_end_timestamp_from_ast()`
- Build DocMapper only when query_ast is provided (no overhead for
  common path without index_filter)
- Fix REST API tests: use `json:` key (not `json_body:`), use lowercase
  term values to match tokenizer behavior
- Update tests to run against both quickwit and elasticsearch engines

Two-level filtering now implemented:
1. Metastore level: tags + time range from query AST
2. Split level: lightweight query execution for accurate filtering

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: ruo <ruoliu.dev@gmail.com>

* refactor(es-compat): use best-effort metadata filtering for index_filter

Remove heavy split-level query execution for field_caps index_filter.
The implementation now aligns with ES's "best-effort" approach that uses
metadata-level filtering only (time range, tags) instead of opening
splits and executing queries.

Changes:
- Remove split_matches_query function (no longer opens splits)
- Remove query_ast and doc_mapper from LeafListFieldsRequest proto
- Keep metadata-level filtering in root_list_fields:
  - Time range extraction from query AST
  - Tag-based split pruning
- Simplify leaf_list_fields to just return fields from all splits

This matches ES semantics: "filtering is done on a best-effort basis...
this API may return an index even if the provided filter matches no
document."

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: ruo <ruoliu.dev@gmail.com>

* fix(es-compat): reject empty index_filter to match ES behavior

- Remove empty object {} handling in parse_index_filter_to_query_ast
- ES rejects empty index_filter with 400, now QW does too
- Add tag_fields config and tag-based index_filter test
- Update unit and integration tests accordingly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: ruo <ruoliu.dev@gmail.com>

* Added ref doc for the new functionality.

* cargo fmt fix

---------

Signed-off-by: ruo <ruoliu.dev@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: fulmicoton <paul.masurel@datadoghq.com>
* Gossip ingester status

* Update ingester pool when status changes

* Rebalance shards when IngesterStatus changes

* Fix timeout_after being 0, causing to not wait for ingester status propagation

* Also refresh the ingester pool when an ingester status has changed

* Add integration test

* Make setup_ingester_pool and setup_indexer_pool a bit more uniform

* make fix

* Instrument rebalance_shards calls

* Unified ClusterNode::for_test_with_ingester_status into ClusterNode::for_test

* Remove duplicated readiness check on ingester status

* Refactor the ClusterSandbox to add the possibility to dynamically add a node to a running cluster

* Ensure the shard is created on the indexer that we shutdown in test_graceful_shutdown_no_data_loss integration test

* Don't trigger a rebalance for Add(ready, IngesterStatus::Initializing); improve comments and tests for it

* Don't refresh the indexer pool when an indexer update its ingester status

* The ingester_pool should contains all ingesters, not only the ready ones

* Debug message when no shards to rebalance

---------

Co-authored-by: Adrien Guillo <adrien.guillo@datadoghq.com>
* Implement IngesterCapacityScore broadcast (quickwit-oss#6152)

* Implement node based routing table (quickwit-oss#6159)

* Use new node based routing table for routing decisions (quickwit-oss#6163)

* Piggyback routing update on persist response (quickwit-oss#6173)

* Remove unused shard_ids in persist protos (quickwit-oss#6169)

* Add availability zone awareness to node based routing (quickwit-oss#6189)

* Remove old routing table; Take both disk and memory WAL readings (quickwit-oss#6193)

* Add az-aware ingest attempts metric (quickwit-oss#6194)
… map text to keyword in _mapping (quickwit-oss#6208)

* feat(es-compat): support regexp shorthand format, expose concatenate fields, and map text to keyword in _mapping

Elasticsearch's `regexp` query accepts two formats:
- Shorthand: `{"regexp": {"field": "pattern"}}`
- Full: `{"regexp": {"field": {"value": "pattern", "case_insensitive": true}}}`

Quickwit only supported the full form, causing queries from ES-compatible
connectors (e.g. Trino ES connector) to fail with a deserialization error.
This adds support for the shorthand format via `#[serde(untagged)]` enum
deserialization.

Additionally, in the `_mapping` endpoint:
- `Text` fields are now reported as `keyword` type. This enables filter
  pushdown (e.g. `LIKE` predicates) from connectors that only push down
  filters for `keyword`-typed fields.
- `Concatenate` fields are now exposed as `keyword` type instead of being
  hidden. This allows connectors to discover and query these fields.

Made-with: Cursor

* fix(es-compat): replace manual Default impl with derive to fix clippy lint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* address PR review: inline custom Deserialize for RegexQueryParams, add keyword comment

- Replace inner enum + serde(from) with a custom Deserialize impl directly
  on RegexQueryParams, as suggested by reviewer
- Add comment explaining why text fields are mapped to keyword in the
  ES-compat _mapping response

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: simplify RegexQueryParams to a plain untagged enum

Replace the custom Deserialize visitor with a simple #[serde(untagged)]
enum that handles both shorthand and full regexp query formats directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Bump lz4_flex from 0.11.5 to 0.11.6 in /quickwit

Bumps [lz4_flex](https://github.com/pseitz/lz4_flex) from 0.11.5 to 0.11.6.
- [Release notes](https://github.com/pseitz/lz4_flex/releases)
- [Changelog](https://github.com/PSeitz/lz4_flex/blob/main/CHANGELOG.md)
- [Commits](PSeitz/lz4_flex@0.11.5...0.11.6)

---
updated-dependencies:
- dependency-name: lz4_flex
  dependency-version: 0.11.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Pascal Seitz <pascal.seitz@datadoghq.com>

* run license tool

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Pascal Seitz <pascal.seitz@datadoghq.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@datadoghq.com>
* Fix index reincarnation bug

* Use chitchat TTL instead of manual deletion

* fix tests

* PR comments

* fix vec issue
…ckwit-oss#6228)

Bumps the github-actions group with 13 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [actions/checkout](https://github.com/actions/checkout) | `6.0.1` | `6.0.2` |
| [actions/setup-python](https://github.com/actions/setup-python) | `6.1.0` | `6.2.0` |
| [dtolnay/rust-toolchain](https://github.com/dtolnay/rust-toolchain) | `f7ccc83f9ed1e5b9c81d8a67d7ad1a747e22a561` | `efa25f7f19611383d5b0ccf2d1c8914531636bf9` |
| [taiki-e/cache-cargo-install-action](https://github.com/taiki-e/cache-cargo-install-action) | `3.0.1` | `3.0.3` |
| [actions/dependency-review-action](https://github.com/actions/dependency-review-action) | `98884d411b0f1c583e5ee579e7e897d4623019c2` | `f5b971718edcbb31275a1db40004592335c0e031` |
| [docker/login-action](https://github.com/docker/login-action) | `3.6.0` | `4.0.0` |
| [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) | `3.7.0` | `4.0.0` |
| [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) | `3.12.0` | `4.0.0` |
| [docker/metadata-action](https://github.com/docker/metadata-action) | `5.10.0` | `6.0.0` |
| [docker/build-push-action](https://github.com/docker/build-push-action) | `6.18.0` | `7.0.0` |
| [actions/upload-artifact](https://github.com/actions/upload-artifact) | `6.0.0` | `7.0.0` |
| [actions/download-artifact](https://github.com/actions/download-artifact) | `7.0.0` | `8.0.1` |
| [actions/setup-node](https://github.com/actions/setup-node) | `6.1.0` | `6.3.0` |



Updates `actions/checkout` from 6.0.1 to 6.0.2
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@8e8c483...de0fac2)

Updates `actions/setup-python` from 6.1.0 to 6.2.0
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@83679a8...a309ff8)

Updates `dtolnay/rust-toolchain` from f7ccc83f9ed1e5b9c81d8a67d7ad1a747e22a561 to efa25f7f19611383d5b0ccf2d1c8914531636bf9
- [Release notes](https://github.com/dtolnay/rust-toolchain/releases)
- [Commits](dtolnay/rust-toolchain@f7ccc83...efa25f7)

Updates `taiki-e/cache-cargo-install-action` from 3.0.1 to 3.0.3
- [Release notes](https://github.com/taiki-e/cache-cargo-install-action/releases)
- [Changelog](https://github.com/taiki-e/cache-cargo-install-action/blob/main/CHANGELOG.md)
- [Commits](taiki-e/cache-cargo-install-action@34ce512...59027eb)

Updates `actions/dependency-review-action` from 98884d411b0f1c583e5ee579e7e897d4623019c2 to f5b971718edcbb31275a1db40004592335c0e031
- [Release notes](https://github.com/actions/dependency-review-action/releases)
- [Commits](actions/dependency-review-action@98884d4...f5b9717)

Updates `docker/login-action` from 3.6.0 to 4.0.0
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](docker/login-action@5e57cd1...b45d80f)

Updates `docker/setup-qemu-action` from 3.7.0 to 4.0.0
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](docker/setup-qemu-action@c7c5346...ce36039)

Updates `docker/setup-buildx-action` from 3.12.0 to 4.0.0
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](docker/setup-buildx-action@8d2750c...4d04d5d)

Updates `docker/metadata-action` from 5.10.0 to 6.0.0
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Commits](docker/metadata-action@c299e40...030e881)

Updates `docker/build-push-action` from 6.18.0 to 7.0.0
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](docker/build-push-action@2634353...d08e5c3)

Updates `actions/upload-artifact` from 6.0.0 to 7.0.0
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@b7c566a...bbbca2d)

Updates `actions/download-artifact` from 7.0.0 to 8.0.1
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@37930b1...3e5f45b)

Updates `actions/setup-node` from 6.1.0 to 6.3.0
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](actions/setup-node@395ad32...53b8394)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: actions/setup-python
  dependency-version: 6.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: dtolnay/rust-toolchain
  dependency-version: efa25f7f19611383d5b0ccf2d1c8914531636bf9
  dependency-type: direct:production
  dependency-group: github-actions
- dependency-name: taiki-e/cache-cargo-install-action
  dependency-version: 3.0.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: actions/dependency-review-action
  dependency-version: f5b971718edcbb31275a1db40004592335c0e031
  dependency-type: direct:production
  dependency-group: github-actions
- dependency-name: docker/login-action
  dependency-version: 4.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: docker/setup-qemu-action
  dependency-version: 4.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: docker/setup-buildx-action
  dependency-version: 4.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: docker/metadata-action
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: docker/build-push-action
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/upload-artifact
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/download-artifact
  dependency-version: 8.0.1
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/setup-node
  dependency-version: 6.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add composite aggregation

Rebase of quickwit-oss#5957 by @rdettai-sk onto main, updated to use
the latest tantivy revision (545169c0d8) which includes
composite aggregation support upstream.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test fix

---------

Co-authored-by: Remi Dettai <remi.dettai@sekoia.io>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Paul Masurel <paul@quickwit.io>
)

Bumps [quinn-proto](https://github.com/quinn-rs/quinn) from 0.11.13 to 0.11.14.
- [Release notes](https://github.com/quinn-rs/quinn/releases)
- [Commits](quinn-rs/quinn@quinn-proto-0.11.13...quinn-proto-0.11.14)

---
updated-dependencies:
- dependency-name: quinn-proto
  dependency-version: 0.11.14
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…it-oss#6229)

Bumps the npm_and_yarn group with 6 updates in the /quickwit/quickwit-ui directory:

| Package | From | To |
| --- | --- | --- |
| @isaacs/brace-expansion | `5.0.0` | `5.0.1` |
| [axios](https://github.com/axios/axios) | `1.13.2` | `1.13.6` |
| [immutable](https://github.com/immutable-js/immutable-js) | `3.8.2` | `3.8.3` |
| [lodash](https://github.com/lodash/lodash) | `4.17.21` | `4.17.23` |
| [rollup](https://github.com/rollup/rollup) | `4.53.2` | `4.60.0` |
| [yaml](https://github.com/eemeli/yaml) | `1.10.2` | `1.10.3` |



Updates `@isaacs/brace-expansion` from 5.0.0 to 5.0.1

Updates `axios` from 1.13.2 to 1.13.6
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](axios/axios@v1.13.2...v1.13.6)

Updates `immutable` from 3.8.2 to 3.8.3
- [Release notes](https://github.com/immutable-js/immutable-js/releases)
- [Changelog](https://github.com/immutable-js/immutable-js/blob/main/CHANGELOG.md)
- [Commits](immutable-js/immutable-js@v3.8.2...v3.8.3)

Updates `lodash` from 4.17.21 to 4.17.23
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](lodash/lodash@4.17.21...4.17.23)

Updates `rollup` from 4.53.2 to 4.60.0
- [Release notes](https://github.com/rollup/rollup/releases)
- [Changelog](https://github.com/rollup/rollup/blob/master/CHANGELOG.md)
- [Commits](rollup/rollup@v4.53.2...v4.60.0)

Updates `yaml` from 1.10.2 to 1.10.3
- [Release notes](https://github.com/eemeli/yaml/releases)
- [Commits](eemeli/yaml@v1.10.2...v1.10.3)

---
updated-dependencies:
- dependency-name: "@isaacs/brace-expansion"
  dependency-version: 5.0.1
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: axios
  dependency-version: 1.13.6
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: immutable
  dependency-version: 3.8.3
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: lodash
  dependency-version: 4.17.23
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: rollup
  dependency-version: 4.60.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: yaml
  dependency-version: 1.10.3
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* PR1: adding quickwit-parquet-engine-crate, defining parquet schema, parquet writer functions

* PR2: add ingest and indexing logic to quick-parquet-engine

* PR3: Add proto definitions for metrics ingest and split management

* PR4: Metrics ingestion - OTel metrics parsing and Arrow conversion

* PR5: Add metastore support for metrics

* PR6: Generalize Source trait and SourceActor over Processor type

* PR7: Add parquet indexing pipeline actors and wiring

* minimal cargo lock and toml

* cargo clippy

* rustfmt, cargo machete

* fix unit tests

* fix accidental rename of index_id --> index_uid in build_index_id_patterns_sql_query

* old test was asserting buggy sql

* address codex comments

* address pr1 comments

* minor change

* remove datadog specific stuff, remove otlp metrics endpoint

* license, fix tests

* license check

* actually fix license, formatting changes, remove expectation of otel metrics index at startup

* use SourceActor name vs {}Source

* CI checks

---------

Co-authored-by: fulmicoton-dd <paul.masurel@datadoghq.com>
Resolves conflicts preserving indextables custom patches:
- Keep SearchPermitProvider Sync mode (Java threading support)
- Keep SearchPermitFuture enum-based approach (Sync/Async variants)
- Keep leaf_search_single_split as pub with SplitOverrides parameter
- Keep open_split_bundle/open_index_with_caches as pub + #[instrument]
- Keep leaf_cache as pub mod
- Keep FieldPresenceQuery all-fast-fields optimization (fast_null_support)
- Take upstream runtimes.rs disable_lifo_slot (default true)
- Take upstream invoker module addition to quickwit-search lib.rs
- Take upstream SearchPermitMessage::RequestWithOffload + offload logic
- Regenerate Cargo.lock from our dependency set
…g compile errors

- Update tantivy dependency to indextables/tantivy @ ffe408f49 (upgrade/tantivy-0.26)
- Fix merge_executor.rs: pass parsed_query_ast by value and add None cache_context arg
- Fix search_permit_provider.rs: SearchPermitFuture::from_async_receiver, remove Debug derives

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lity

LeafSearchContext uses pub(crate) IncrementalCollector and is itself private,
making it impossible to construct from outside the crate. Revert to flat
parameter style so tantivy4java can call the function directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@schenksj
Copy link
Copy Markdown
Collaborator Author

Code Review

Overall this is a solid merge-up. The conflict resolutions are reasonable and the new features (parquet companion mode overlay, sync permit provider, session-token S3 support, byte-range cache capacity) are well-motivated. A few issues are worth addressing before merge.


Bugs

Double-decrement in SyncSearchPermit / SearchPermit drop (search_permit_provider.rs)

SearchPermit::drop already decrements current_permits / current_memory when in Sync mode, but then the SyncSearchPermit field inside SearchPermitMode2::Sync is also dropped, running SyncSearchPermit::drop — which does the same decrement again. With saturating_sub this silently goes to zero rather than panicking, but any subsequent permits will see incorrect state.

Fix: remove the Drop impl from SyncSearchPermit (let SearchPermit::drop own it), or remove the manual decrement from SearchPermit::drop's Sync arm.

num_download_slots is tracked but never enforced in sync mode

SyncPermitState::num_download_slots is stored and new_sync(num_download_slots, …) accepts it, but acquire_permits_and_wait in the Sync branch never checks it — all permits are immediately ready regardless of concurrency. Either enforce the limit or remove the field to avoid the false impression of back-pressure.

update_memory_usage silently leaves stale allocation in sync mode

// Note: We don't update sync_permit.memory_allocation since it's not mutable

sync_permit.memory_allocation is used in Drop to decrement current_memory. If update_memory_usage updates state.current_memory but leaves memory_allocation unchanged, the Drop will subtract the old allocation, producing a wrong (typically inflated) current_memory. Make memory_allocation mutable or track the delta separately.


Correctness Concerns

TOCTOU in ByteRangeCache::put_slice capacity enforcement (byte_range_cache.rs)

let current_bytes = self.get_num_bytes(); // reads AtomicU64
let new_total = current_bytes + range_size as u64;
if new_total > capacity {
    // lock, clear, unlock
}
// re-lock, put

Between the capacity check and the put, another thread can add data. Under concurrent prewarm the cache can grow beyond capacity before the clear fires. For correctness the check-and-clear should hold the need_mut_byte_range_cache lock for the full duration of the put, or at least re-check num_stored_bytes inside the lock before clearing.

Commented-out disable_lifo_slot in runtimes.rs

//if disable_lifo_slot {
//    blocking_runtime_builder.disable_lifo_slot();
//}

This silently reverts the upstream default (disable_lifo_slot = true / QW_DISABLE_TOKIO_LIFO_SLOT) without explanation. If this was intentional for the fork, leave a comment explaining why. If it was a debugging leftover, restore it.

SplitSearchOutcomeCounters::new_unregistered() per split in leaf_search_single_split

The flat-params refactor creates a fresh new_unregistered() counter for every split invocation. The original used ctx.split_outcome_counters from LeafSearchContext, which was registered with Prometheus. The new counters are thrown away on return, so split-level search outcome metrics (cache hits, errors, etc.) are no longer emitted in the external-call path. If this code path is used from tantivy4java, that's a silent metrics regression.


Debug Instrumentation in Production Code

The PR introduces substantial tantivy4java_debug! logging directly into hot paths:

  • s3_compatible_storage.rs: every get_slice, get_all, file_num_bytes, and key() call emits debug lines including large-request classification and pointer values.
  • byte_range_cache.rs: every cache hit, miss, and put emits debug lines with emoji.
  • leaf.rs: a local debug_println! macro duplicates quickwit_common::tantivy4java_debug!.

While the macro gates on an env-var check, the string formatting (especially format! with emoji) still has measurable overhead in tight loops. For a library shipped as a dependency of tantivy4java, it would be cleaner to:

  1. Use tracing::debug! spans (already used elsewhere in this file) which are zero-cost when the subscriber is not installed.
  2. Or gate the expensive formatting inside the macro itself with if enabled { ... }.

The commented-out code blocks (stack trace, thread_id, match arms) should also be removed before merge.


Minor / Nits

  • OBJECT_STORAGE_REQUEST_COUNT reset race: reset_object_storage_request_stats() is a global reset. If multiple concurrent searches are in-flight, a reset mid-request produces meaningless counts. Document that callers must ensure quiescence, or scope the counters per-operation rather than globally.

  • #[cfg(not(tokio_unstable))] fallback in runtimes.rs: The fallback branch spawns a task that creates _prometheus_runtime_metrics and _interval with leading underscores (never used) and logs a warn! once, then exits. This is noise — just skip the spawn entirely when tokio_unstable is not set, or log the warning once at startup outside of spawn.

  • Blank lines: Several extra blank lines were added inside get_split_footer_from_cache_or_fetch and open_split_bundle (before/after single-line statements). Minor readability nit.

  • pub visibility of leaf_search_single_split with Option<SplitOverrides>: Exposing this as pub with a new overrides parameter is fine for tantivy4java's use case, but callers inside the crate (leaf_search_single_split_wrapper) now pass None explicitly. Consider whether pub(crate) is sufficient for the internal callers and pub only needed for the external entry-point.


What Looks Good

  • field_presence.rs optimization: The "all fast fields → use ExistsQuery directly" path is correct and falls through cleanly to _field_presence for mixed schemas.
  • OverlayDirectory: Clean implementation; the read-only no-op write/delete methods follow the same pattern as HotDirectory. The UnionDirectory layering is the right approach.
  • Session-token S3 support: The get_credentials_provider pattern-match change and new constructor are clean. New tests cover the edge cases well.
  • tokio_metrics replacement in runtimes.rs: Moving from the tokio_metrics crate to tokio::runtime::Handle::metrics() (gated on tokio_unstable) is the right direction.
  • open_index_with_caches wrapper: Keeping the original signature as a thin wrapper over open_index_with_caches_and_schema_override(…, None) preserves backward compatibility cleanly.

🤖 Generated with Claude Code

- Add Apache-2.0 license header to quickwit-common/src/debug.rs
- Prefix disable_lifo_slot with _ to suppress unused variable warning (-Dwarnings)
- Add tokio-metrics to cargo-machete ignore list in quickwit-common
- Run cargo fmt --all to fix rustfmt diffs across workspace

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@schenksj schenksj merged commit bcf79be into main Mar 30, 2026
1 of 7 checks passed
@schenksj schenksj deleted the upgrade/quickwit-upstream-main branch March 30, 2026 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.