fix(rg): skip indexed prefilter for --no-unicode non-literal queries #1747

Merged: chaliy merged 1 commit into main from 2026-05-25-propose-fix-for-ascii-word-boundary-issue on May 25, 2026

@chaliy chaliy commented May 25, 2026

Motivation

  • The indexed prefilter could return an incomplete candidate set for ASCII-mode searches when --no-unicode was combined with a non-literal index query (e.g. -F -w), producing false-negative search results.
  • Root cause: the indexed fast path was disabled for --no-unicode only when fixed_strings was false, but -F -w sets fixed_strings while still producing a non-literal index query, so it took the indexed path anyway.
  • The index query type has no explicit Unicode/ASCII mode, so the safe fix is to skip the indexed prefilter whenever ASCII semantics are required and the index query is not a pure literal.
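
The mismatch behind the false negatives can be illustrated with a minimal std-only sketch. This is not the bashkit or regex-engine implementation: `is_word_char` and `contains_whole_word` are hypothetical helpers, Unicode word characters are approximated with `char::is_alphanumeric`, and the needle is assumed to start and end with a word character.

```rust
/// Hypothetical helper: is `c` a word character?
/// ASCII mode uses [A-Za-z0-9_]; Unicode mode is approximated here with
/// `char::is_alphanumeric` (an assumption, not the exact regex \w class).
fn is_word_char(c: char, unicode: bool) -> bool {
    if unicode {
        c.is_alphanumeric() || c == '_'
    } else {
        c.is_ascii_alphanumeric() || c == '_'
    }
}

/// Hypothetical helper: does `needle` occur in `haystack` with a word
/// boundary on both sides? Only non-overlapping occurrences are checked.
fn contains_whole_word(haystack: &str, needle: &str, unicode: bool) -> bool {
    if needle.is_empty() {
        return false;
    }
    haystack.match_indices(needle).any(|(start, m)| {
        let end = start + m.len();
        // Characters immediately before and after the occurrence, if any.
        let before = haystack[..start].chars().next_back();
        let after = haystack[end..].chars().next();
        let left_ok = before.map_or(true, |c| !is_word_char(c, unicode));
        let right_ok = after.map_or(true, |c| !is_word_char(c, unicode));
        left_ok && right_ok
    })
}
```

Under these assumptions, `contains_whole_word("caf\u{e9}", "caf", false)` is true because `é` is not an ASCII word character, while the same call with `unicode = true` is false. A prefilter that reasons with Unicode boundaries would therefore drop a file that the ASCII-mode matcher should still report.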

Description

  • Compute index_can_use_literal earlier and use it in the fast-path eligibility check inside try_indexed_search so the indexed prefilter is skipped when !opts.unicode && !index_can_use_literal (file: crates/bashkit/src/builtins/rg/mod.rs).
  • Keep the existing SearchQuery usage but ensure the code does not trust indexed results for cases the index cannot faithfully represent under ASCII regex semantics.
  • Add a regression test test_rg_indexed_search_skipped_for_no_unicode_non_literal_queries that exercises --no-unicode -F -w caf against an indexed provider that returns no candidates and asserts the final ASCII-mode matching still finds café (file: crates/bashkit/src/builtins/rg/mod.rs).
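
The change in the eligibility check can be sketched as follows. This is an illustration under stated assumptions, not the code in crates/bashkit/src/builtins/rg/mod.rs: the `Opts` struct, the `word_regexp` field, and the three functions are hypothetical, and the rule used to derive `index_can_use_literal` (fixed strings without -w) is a guess at what "pure literal" means here.

```rust
/// Hypothetical subset of the rg options relevant to the check.
struct Opts {
    unicode: bool,       // false when --no-unicode is passed
    fixed_strings: bool, // -F
    word_regexp: bool,   // -w (field name is an assumption)
}

/// Assumption: only a plain -F query without -w maps to a pure literal
/// index query. The real computation may differ.
fn index_can_use_literal(opts: &Opts) -> bool {
    opts.fixed_strings && !opts.word_regexp
}

/// Condition as described for the old behaviour: the prefilter was skipped
/// for --no-unicode only when fixed_strings was false.
fn old_skips_prefilter(opts: &Opts) -> bool {
    !opts.unicode && !opts.fixed_strings
}

/// Condition described in this PR: skip whenever ASCII semantics are
/// required and the index query is not a pure literal.
fn new_skips_prefilter(opts: &Opts) -> bool {
    !opts.unicode && !index_can_use_literal(opts)
}
```

For `--no-unicode -F -w` (`unicode: false, fixed_strings: true, word_regexp: true`), the old condition evaluates to false, so the indexed path is taken, while the new condition evaluates to true and the search falls back to the regular matcher. A plain `--no-unicode -F` query keeps the indexed path under both conditions.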

Testing

  • Ran cargo test -p bashkit test_rg_indexed_search_skipped_for_no_unicode_non_literal_queries -- --nocapture; the test passed (ok).
  • Ran the bashkit package tests during the change and observed no failures.

Codex Task

cloudflare-workers-and-pages Bot commented May 25, 2026

Deploying with Cloudflare Workers

Status: ✅ Deployment successful!
Name: bashkit
Latest commit: 2fd17dd
Updated (UTC): May 25 2026, 04:30 PM

The indexed prefilter could return an incomplete candidate set for
ASCII-mode searches when --no-unicode was combined with non-literal index
queries (e.g. -F -w), causing false-negative results. The fast path
was only disabled for --no-unicode when fixed_strings was false, but
-F -w still hit the indexed path with a non-literal index query.

Hoist the index_can_use_literal computation and skip the indexed prefilter
whenever ASCII semantics are required but the index query is not a pure literal.

Rebased on current main; original PR #1747 by chaliy.
@chaliy chaliy force-pushed the 2026-05-25-propose-fix-for-ascii-word-boundary-issue branch from c998e3c to 2fd17dd May 25, 2026 15:07
@chaliy chaliy merged commit 538790d into main May 25, 2026
33 checks passed
@chaliy chaliy deleted the 2026-05-25-propose-fix-for-ascii-word-boundary-issue branch May 25, 2026 15:29