ref(search): Prefer substring and prefix matches in fzf scoring #111050
The fzf v1 algorithm scores matches purely by word-boundary bonuses and gap
penalties. This causes a scattered subsequence match that happens to hit
multiple word boundaries (e.g. "s" at string start, "co" after a dot) to
outscore a true contiguous substring match that starts mid-word.

In practice this means searching "sco" in a member list ranks
"sentry.connect.ops@sentry.io" (scattered) above
"discovery.channel@sentry.io" or "francesco.novy@sentry.io" (both contain
"sco" as a substring).

Two post-score bonuses are added to establish a clear preference hierarchy:

- Substring bonus (+24): applied when matches.length === 1, i.e. all
  pattern characters are contiguous. Ensures any substring match beats a
  scattered boundary match.
- Prefix bonus (+8): applied additionally when sidx === 0. Distinguishes
  "scott.morrison" (match at string start) from "aaron.scotton" (match
  starts a word component mid-string); both previously tied at the same
  boundary bonus.

Resulting tiers for pattern "sco":

  prefix substring   (scott.morrison)      112
  boundary substring (aaron.scotton)       104
  mid-word substring (discovery.channel)    80
  scattered          (sentry.connect.ops)   72

Co-Authored-By: Claude <noreply@anthropic.com>
it('ranks globex.io emails above others when searching "globex"', () => {
  const results = search('globex');
  const globexEmails = results.filter(r => r.email.includes('globex.io'));
Check failure (Code scanning / CodeQL): Incomplete URL substring sanitization, severity High
it('ranks globex.io emails above others when searching "globex"', () => {
  const results = search('globex');
  const globexEmails = results.filter(r => r.email.includes('globex.io'));
  const nonGlobexEmails = results.filter(r => !r.email.includes('globex.io'));
Check failure (Code scanning / CodeQL): Incomplete URL substring sanitization, severity High
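The alert fires because a substring check such as `includes('globex.io')` also accepts strings where the domain appears anywhere, for example `someone@globex.io.example.com`. In a test fixture this is likely harmless, but one way the flagged filters could be tightened is to compare the domain part of the address exactly. The helper below is a hypothetical sketch (`hasEmailDomain` is not part of this PR), assuming each `email` is a plain address with a single domain after the last `@`:

```typescript
// Hypothetical helper, not taken from the PR: returns true only when the
// part after the last '@' equals the expected domain (case-insensitive).
function hasEmailDomain(email: string, domain: string): boolean {
  const at = email.lastIndexOf('@');
  if (at === -1) {
    return false; // not an email-shaped string
  }
  return email.slice(at + 1).toLowerCase() === domain.toLowerCase();
}

// Possible use in the flagged test (assumes `results` items carry an `email` string):
//   const globexEmails = results.filter(r => hasEmailDomain(r.email, 'globex.io'));
//   const nonGlobexEmails = results.filter(r => !hasEmailDomain(r.email, 'globex.io'));
```

Whether this silences the scanner depends on the query's heuristics; dismissing the alert as a test-only false positive may be an equally reasonable choice.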
The fzf v1 algorithm scores matches purely using word-boundary bonuses and gap
penalties. This causes a scattered subsequence match that happens to hit multiple
word boundaries to outscore a true contiguous substring match that starts mid-word.
In practice: searching `"sco"` in a Sentry member/assignee picker ranked
`sentry.connect.ops@sentry.io` (score 72, scattered: `s` at string start, `c`
after a dot, `o` consecutive) above `discovery.channel@sentry.io` or
`francesco.novy@sentry.io` (score 56, both contain `"sco"` as a real substring).

Changes

Two post-score bonuses are added after the existing exact-match boost:

- Substring bonus (+24): applied when `matches.length === 1` (all pattern
  characters form a single contiguous range). Ensures any substring match beats a
  scattered boundary match regardless of where in the string it appears.
- Prefix bonus (+8): applied additionally when `sidx === 0`. Distinguishes a
  match at the very start of the string from the same substring appearing after a
  word separator mid-string. Without this, `scott.morrison` and `aaron.scotton`
  tied at the same score because both `s` characters receive the same boundary
  bonus.

Resulting score tiers (pattern `"sco"`)

- `scott.morrison@sentry.io` (prefix substring): 112
- `aaron.scotton@sentry.io` (boundary substring): 104
- `discovery.channel@sentry.io` (mid-word substring): 80
- `sentry.connect.ops@sentry.io` (scattered): 72

Tests cover all four tiers with a realistic Sentry org member dataset (labels +
emails) that mirrors the actual search UI, plus existing OTel attribute, full-name,
and email search suites.
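
To make the tiers concrete, here is a simplified sketch of fzf-v1-style scoring with the two bonuses applied afterwards. It is not the code from this PR: `sketchScore` is a hypothetical name, the base constants (16 per match, boundary bonus 8, consecutive bonus 4, gap penalties -3/-1, first-character multiplier 2) are assumed from upstream fzf's defaults, character classes are reduced to word vs. non-word, and the existing exact-match boost is omitted. Under those assumptions it reproduces the scores quoted above.

```typescript
// Base constants assumed from upstream fzf defaults (not read from this PR).
const SCORE_MATCH = 16;
const SCORE_GAP_START = -3;
const SCORE_GAP_EXTENSION = -1;
const BONUS_BOUNDARY = 8;
const BONUS_CONSECUTIVE = 4;
const BONUS_FIRST_CHAR_MULTIPLIER = 2;
// Post-score bonuses described in this PR.
const SUBSTRING_BONUS = 24;
const PREFIX_BONUS = 8;

interface SketchResult {
  score: number;
  sidx: number; // index where the matched window starts
  matches: Array<[number, number]>; // contiguous [start, end) ranges
}

const isWordChar = (ch: string | undefined): boolean =>
  ch !== undefined && /[a-z0-9]/.test(ch);

function sketchScore(text: string, pattern: string): SketchResult | null {
  const t = text.toLowerCase();
  const p = pattern.toLowerCase();
  if (p.length === 0) return null;

  // Forward scan: first position where the pattern completes as a subsequence.
  let pidx = 0;
  let eidx = -1;
  for (let i = 0; i < t.length && eidx === -1; i++) {
    if (t[i] === p[pidx] && ++pidx === p.length) eidx = i + 1;
  }
  if (eidx === -1) return null;

  // Backward scan: tighten the start of the window.
  let sidx = 0;
  pidx = p.length - 1;
  for (let i = eidx - 1; i >= 0; i--) {
    if (t[i] === p[pidx] && --pidx < 0) {
      sidx = i;
      break;
    }
  }

  // Score the window: match points, boundary/consecutive bonuses, gap penalties.
  let score = 0;
  let consecutive = 0;
  let firstBonus = 0;
  let inGap = false;
  const matches: Array<[number, number]> = [];
  pidx = 0;
  for (let i = sidx; i < eidx; i++) {
    if (t[i] === p[pidx]) {
      let bonus = isWordChar(t[i]) && !isWordChar(t[i - 1]) ? BONUS_BOUNDARY : 0;
      if (consecutive === 0) {
        firstBonus = bonus;
      } else {
        if (bonus === BONUS_BOUNDARY) firstBonus = bonus;
        bonus = Math.max(bonus, firstBonus, BONUS_CONSECUTIVE);
      }
      score += SCORE_MATCH + (pidx === 0 ? bonus * BONUS_FIRST_CHAR_MULTIPLIER : bonus);
      const last = matches[matches.length - 1];
      if (last && last[1] === i) last[1] = i + 1; // extend the current range
      else matches.push([i, i + 1]); // start a new range
      inGap = false;
      consecutive++;
      pidx++;
    } else {
      score += inGap ? SCORE_GAP_EXTENSION : SCORE_GAP_START;
      inGap = true;
      consecutive = 0;
      firstBonus = 0;
    }
  }

  // Post-score bonuses: one contiguous range means a true substring match.
  if (matches.length === 1) {
    score += SUBSTRING_BONUS;
    if (sidx === 0) score += PREFIX_BONUS;
  }
  return {score, sidx, matches};
}

for (const email of [
  'scott.morrison@sentry.io',
  'aaron.scotton@sentry.io',
  'discovery.channel@sentry.io',
  'sentry.connect.ops@sentry.io',
]) {
  console.log(email, sketchScore(email, 'sco')?.score); // 112, 104, 80, 72
}
```

Without the bonuses the first two addresses both score 80 and the mid-word match scores 56, below the scattered match's 72. That is the ordering problem the +24 and +8 are sized to fix.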