feat: support text index on lower(Body) with no preprocessor#2326

Draft
pulpdrew wants to merge 4 commits into main from cursor/support-lower-body-text-index-6085

@pulpdrew
Contributor

Summary

When a text index is defined as INDEX lower(Body) TYPE text(tokenizer='splitByNonAlpha') (no preprocessor), the Lucene-to-ClickHouse transpiler now generates hasAllTokens(lower(Body), lower(...)) conditions instead of hasAllTokens(Body, ...).

Previously, findTextIndex correctly detected the covering index on lower(Body), but the transpiler then emitted hasAllTokens(Body, 'token'), which does not match the index expression and so does not use the index. Now the transpiler detects whether the index expression wraps the column in lower() and, if so, wraps both the column argument and the token values in lower().

Behavior:

  • Index on lower(Body) (no preprocessor): generates hasAllTokens(lower(Body), lower('token'))
  • Index on Body (with or without preprocessor): continues generating hasAllTokens(Body, 'token') (unchanged)
  • UseTextIndex.Enabled mode (no index detection): continues generating hasAllTokens(Body, 'token') (unchanged)
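
The three cases above can be sketched as a small TypeScript function. This is an illustration only, not the code from this PR: `TextIndexInfo`, `quoteSqlString`, `indexWrapsColumnInLower`, and `buildHasAllTokens` are hypothetical names, and the handling of an index on lower(Body) that also has a preprocessor is an assumption, since the description does not cover that case.

```typescript
// Hypothetical description of a detected text index (not the real HyperDX type).
interface TextIndexInfo {
  expression: string; // e.g. "lower(Body)" or "Body"
  hasPreprocessor: boolean;
}

// Quote a value as a single-quoted SQL string literal,
// escaping backslashes and single quotes.
function quoteSqlString(value: string): string {
  return "'" + value.replace(/\\/g, '\\\\').replace(/'/g, "\\'") + "'";
}

// True when the index expression is exactly lower(<column>), ignoring whitespace.
function indexWrapsColumnInLower(index: TextIndexInfo, column: string): boolean {
  return index.expression.replace(/\s+/g, '') === `lower(${column})`;
}

// Build the hasAllTokens condition for one token.
// `index` is undefined when no index detection was performed.
function buildHasAllTokens(
  column: string,
  token: string,
  index?: TextIndexInfo,
): string {
  const literal = quoteSqlString(token);
  // Assumption: lower(Body) combined with a preprocessor falls back to the
  // unchanged form, because the PR description does not specify that case.
  if (index && !index.hasPreprocessor && indexWrapsColumnInLower(index, column)) {
    return `hasAllTokens(lower(${column}), lower(${literal}))`;
  }
  return `hasAllTokens(${column}, ${literal})`;
}

console.log(
  buildHasAllTokens('Body', 'Error', {
    expression: 'lower(Body)',
    hasPreprocessor: false,
  }),
);
```

Calling `buildHasAllTokens('Body', 'Error')` with no index information returns the unchanged `hasAllTokens(Body, 'Error')` form, which mirrors the UseTextIndex.Enabled case in the list.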

How to test on Vercel preview

N/A — non-UI change

References

  • Linear Issue: HDX-4320


When a text index is defined as `INDEX lower(Body) TYPE text(tokenizer=...)`
(without a preprocessor), the generated hasAllTokens conditions now use
`hasAllTokens(lower(Body), lower(...))` to match the index expression.

When the index is directly on Body (or uses a preprocessor), the existing
behavior of `hasAllTokens(Body, ...)` is preserved.

Resolves HDX-4320

Co-authored-by: Drew Davis <pulpdrew@gmail.com>
@changeset-bot

changeset-bot Bot commented May 21, 2026

🦋 Changeset detected

Latest commit: 3b46472

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@hyperdx/common-utils Patch


@vercel

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
hyperdx-oss Ignored Ignored Preview May 21, 2026 5:44pm


@github-actions
Contributor

github-actions Bot commented May 21, 2026

E2E Test Results

All tests passed • 178 passed • 3 skipped • 1233s

Status Count
✅ Passed 178
❌ Failed 0
⚠️ Flaky 4
⏭️ Skipped 3

Tests ran across 4 shards in parallel.


Apply lower() in the SQL expression (hasAllTokens(lower(Body), lower('...')))
rather than lowercasing tokens in JavaScript, for consistency with the
hasToken(lower(...), lower(...)) pattern used elsewhere in the file.

Co-authored-by: Drew Davis <pulpdrew@gmail.com>
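
To make the difference between the two approaches in this commit concrete, here is a hedged TypeScript sketch of both ways to produce the token argument. The function names `sqlLiteral`, `tokenLoweredInJs`, and `tokenLoweredInSql` are hypothetical and do not come from the PR. Besides the consistency reason given in the commit message, one practical effect is that the SQL form applies the same lower() function to the column and to the token, whereas JavaScript's `toLowerCase()` is Unicode-aware and ClickHouse's `lower()` only lowercases ASCII letters.

```typescript
// Quote a value as a single-quoted SQL string literal (hypothetical helper).
function sqlLiteral(value: string): string {
  return "'" + value.replace(/\\/g, '\\\\').replace(/'/g, "\\'") + "'";
}

// Variant A: lowercase the token in JavaScript before building the SQL.
function tokenLoweredInJs(token: string): string {
  return sqlLiteral(token.toLowerCase());
}

// Variant B (the approach this commit describes): keep the token as written
// and let the database apply lower() to it.
function tokenLoweredInSql(token: string): string {
  return `lower(${sqlLiteral(token)})`;
}

console.log(`hasAllTokens(lower(Body), ${tokenLoweredInJs('ERROR')})`);
console.log(`hasAllTokens(lower(Body), ${tokenLoweredInSql('ERROR')})`);
```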
2 participants