fix(rg): avoid eager match vector allocation#1742
Merged
Merged
Conversation
Deploying with
|
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs |
bashkit | 06df616 | Commit Preview URL | May 25 2026, 04:43 PM |
A recent rg matcher change collected all matches into a Vec, allowing attacker-controlled dense matches to force large transient allocations and cause host memory DoS even for count-only operations. Replace the eager find_iter()→Vec design with streaming RgMatcher::for_each_match callback iteration and add a lazy RgMatcher::count_matches helper. Update count-sensitive paths (--count-matches, JSON match aggregation, replace_all pre-flight) to use count_matches, and update output/formatting paths (color/highlight, vimgrep, only-matching, multiline match collection, JSON submatch construction) to stream matches via for_each_match instead of pre-materializing per-line vectors. Replacement and single-match helpers (find, replace_all, replace_first) remain unchanged. Rebased on current main; original PR #1742 by chaliy.
66b88a1 to
06df616
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
rgmatcher change collected all matches into aVec<RgMatch>, allowing attacker-controlled dense matches to force large transient allocations and cause host memory DoS for even-count-only operations.Description
find_iter-returning-Vec<RgMatch>design with a streaming callbackRgMatcher::for_each_matchand addRgMatcher::count_matchesfor lazy counting.regex.count_matches(...)instead of.find_iter(...).len()for--count-matchesand JSON match aggregation.vimgrep,only-matching, multiline match collection, JSON submatch construction, etc.) to stream matches viafor_each_matchinstead of pre-materializing per-line vectors.find,replace_all,replace_first) unchanged.Testing
cargo test -p bashkit rg --lib; most tests passed but one differential test failed:builtins::rg::tests::diff_rg_matches_real_rg_cases :: colors custom highlight fg(overall: 173 passed, 1 failed).rgunit suite and confirmed the change removes eager Vec allocations in count and output paths while leaving behavior equivalent for the majority of cases.Codex Task