[MOD-6057] Improve and elaborate FT.PROFILE documentation
#2532
base: main
Added detailed notes on vector iterator profiles, including vector search modes and batch execution statistics.
Hi @meiravgri. Please sign the CLA. I'll review today.
FT.PROFILE documentation
should we remove vector filter?
| `Threadsafe-Loader` | The `Threadsafe-Loader` processor safely loads document contents when the query is running in a background thread. It acquires the GIL to access document data. Reports an additional `GIL-Time` field showing how long (ms) it held the GIL. |
| `Highlighter` | The `Highlighter` processor is used to highlight matching terms in the search results. This is especially useful for full-text search applications, where relevant terms are often emphasized in the UI. |
| `Paginator` | The `Paginator` processor is responsible for handling pagination by limiting the results to a specific range (e.g., `LIMIT 0 10`). It trims down the set of results to fit the required pagination window, ensuring efficient memory usage when dealing with large result sets. |
| `Vector Filter` | For vector searches, the `Vector Filter` processor is sometimes used to pre-process results based on vector similarity thresholds before the main scoring and sorting. |
@dwdougherty Hi!
I noticed Vector Filter was introduced in #819, but I couldn't find any reference to it in the RediSearch source code, neither in the current version nor in older versions (checked back to 2.10).
Can you confirm this entry can be removed, or was it referring to something else?
alonre24
left a comment
Nice!
| `Total profile time` | The total run time (ms) of the query. Normally just a few ms. |
| `Parsing time` | The time (ms) spent parsing the query and its parameters into a query plan. Normally just a few ms. |
| `Pipeline creation time` | The creation time (ms) of the execution plan, including iterators, result processors, and reducers creation. Normally just a few ms for `FT.SEARCH` queries, but expect a larger number for `FT.AGGREGATE` queries. |
| `Total GIL time` | The total time (ms) the query held the Global Interpreter Lock (GIL) during execution. Relevant for multi-threaded deployments where queries run in background threads. |
@lerman25, eventually, do we expect this field to exist in non-multithreaded deployments? If so, what should its value be?
No. If there are no workers, this field isn't printed (starting from RediSearch/RediSearch#7756).
| `Total profile time` | The total run time (ms) of the query. Normally just a few ms. |
| `Parsing time` | The time (ms) spent parsing the query and its parameters into a query plan. Normally just a few ms. |
| `Pipeline creation time` | The creation time (ms) of the execution plan, including iterators, result processors, and reducers creation. Normally just a few ms for `FT.SEARCH` queries, but expect a larger number for `FT.AGGREGATE` queries. |
| `Total GIL time` | The total time (ms) the query held the Global Interpreter Lock (GIL) during execution. Relevant for multi-threaded deployments where queries run in background threads. |
Let's refer to it as the Global Redis Lock, and explain when we expect to hold it (upon LOADing non-sortable fields). Also, does it include the time spent waiting for the GIL, or only the time we actually hold it?
It includes the time spent waiting for the GIL (we start the timer and then call `RedisModule_ThreadSafeContextLock`).
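To illustrate the timing described above, here is a minimal Python sketch, not RediSearch code. It assumes a plain `threading.Lock` as a stand-in for the lock taken via `RedisModule_ThreadSafeContextLock`; the function names `measure_lock_time_ms` and `demo` are hypothetical. Because the timer starts before the acquire call, the reported value covers both waiting for the lock and holding it.

```python
import threading
import time


def measure_lock_time_ms(lock, work):
    """Time a locked section, starting the clock BEFORE acquiring the lock.

    The result therefore includes time spent waiting for the lock,
    not only the time spent holding it.
    """
    start = time.perf_counter()  # timer starts first
    lock.acquire()               # may block while another thread holds the lock
    try:
        work()
    finally:
        lock.release()
    return (time.perf_counter() - start) * 1000.0


def demo(hold_seconds=0.2):
    """Have another thread hold the lock, then measure an empty locked section."""
    lock = threading.Lock()
    acquired = threading.Event()

    def other_holder():
        with lock:
            acquired.set()            # signal that the lock is now held
            time.sleep(hold_seconds)  # keep holding it for a while

    holder = threading.Thread(target=other_holder)
    holder.start()
    acquired.wait()                   # make sure the other thread owns the lock
    elapsed_ms = measure_lock_time_ms(lock, lambda: None)
    holder.join()
    return elapsed_ms


if __name__ == "__main__":
    print(f"measured {demo():.1f} ms for an empty locked section")
```

The locked section does no work, so nearly all of the measured time is wait time. That is the case the docs should call out if the field is described as time "held".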
content/commands/ft.profile.md
Outdated
| `Pipeline creation time` | The creation time (ms) of the execution plan, including iterators, result processors, and reducers creation. Normally just a few ms for `FT.SEARCH` queries, but expect a larger number for `FT.AGGREGATE` queries. |
| `Total GIL time` | The total time (ms) the query held the Global Interpreter Lock (GIL) during execution. Relevant for multi-threaded deployments where queries run in background threads. |
| `Warning` | Errors that occurred during query execution. |
| `Internal cursor reads` | The number of internal cursor read operations performed during a distributed `AGGREGATE` query in cluster mode. In cluster mode, the coordinator uses cursors to fetch results from shards - this counts the initial request plus any subsequent `FT.CURSOR READ`. |
Consider rephrasing instead of using `FT.CURSOR READ`, so it isn't confused with user cursors. Maybe something like "... any subsequent batch of results fetched from the shard".
done
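As a rough illustration of the counting rule in this thread (the initial request plus each subsequent batch fetched from the shard), here is a toy Python model. It is only a sketch: `drain_shard` and the list-of-batches input are hypothetical and not part of RediSearch, and the thread does not say whether the real counter is reported per shard or summed across shards.

```python
def drain_shard(batches):
    """Toy model of a coordinator draining one shard's internal cursor.

    `batches` is the ordered list of result batches the shard returns.
    Returns (all_results, reads), where `reads` counts the initial request
    plus every subsequent batch fetch.
    """
    if not batches:
        raise ValueError("the initial request always yields one (possibly empty) batch")
    reads = 1                    # the initial request
    results = list(batches[0])
    for batch in batches[1:]:
        reads += 1               # each subsequent batch fetched from the shard
        results.extend(batch)
    return results, reads


if __name__ == "__main__":
    results, reads = drain_shard([[1, 2], [3, 4], [5]])
    print(results, reads)
```

In this model a query answered in a single batch still counts one read, matching the "initial request plus any subsequent" wording.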
* `EMPTY`
* `WILDCARD`
* `OPTIONAL`
* `OPTIMIZER` with `Optimizer mode` - Enabled by default in dialect 4+, or explicitly with `WITHOUTCOUNT`.
Is this true for both `FT.SEARCH` and `FT.AGGREGATE`? `FT.AGGREGATE` is without count by default, AFAIK.
MOD-12263 vector iterator details: Added detailed notes on vector iterator profiles, including vector search modes and batch execution statistics.
MOD-6056 rename “counter” and “size”
MOD-12816 update ResultProcessor GILTime
MOD-12414 Add internal cursor reads to profile documentation