-Results are deduplicated by `service_id` (best score per service).
+Results are deduplicated by `service_id` (best score per service), returning up to `DENSE_SEARCH_TOP_K` (3) unique services.

**Why not use RRF scores?**
Qdrant's RRF uses `1/(1+rank)`, producing fixed scores (0.50, 0.33, 0.25) regardless of actual relevance. A perfect match and a random query both get 0.50 for rank 1. Cosine similarity reflects true semantic closeness.
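
To make the contrast concrete, here is a minimal sketch comparing rank-only scores with cosine similarity. It assumes rank starts at 1 in the `1/(1+rank)` formula quoted above; the 2-D vectors are hypothetical toy embeddings, not real model output.

```python
import math


def rrf_scores(num_results: int) -> list[float]:
    """Rank-only scores from the 1/(1+rank) formula, with rank starting at 1."""
    return [1.0 / (1 + rank) for rank in range(1, num_results + 1)]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, for illustration only.
service_example = [1.0, 0.0]
close_query = [0.9, 0.1]      # points almost the same way as the example
unrelated_query = [0.1, 0.9]  # points almost orthogonally to the example

# The top hit always gets the same rank-based score, whatever the query was.
rrf_top = rrf_scores(3)[0]

# Cosine similarity separates the two queries clearly.
close_cos = cosine_similarity(close_query, service_example)
far_cos = cosine_similarity(unrelated_query, service_example)

print(f"rank-1 RRF score: {rrf_top:.2f}")
print(f"cosine (close query): {close_cos:.2f}")
print(f"cosine (unrelated query): {far_cos:.2f}")
```

Both queries would receive the same rank-1 score, while their cosine similarities differ widely, which is why a threshold on cosine similarity is meaningful and a threshold on the rank-based score is not.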
@@ -203,6 +218,7 @@ Sparse prefetch is only included if the query produces a non-empty sparse vector
```python
# classifier.py → _hybrid_search()
+# First checks collection exists and has data (points_count > 0)
POST /collections/intent_collections/points/query
{
  "prefetch": [
@@ -215,6 +231,10 @@ POST /collections/intent_collections/points/query
}
```
+> **Note:** Prefetch limit is `HYBRID_SEARCH_TOP_K * 2` (5 * 2 = 10). The sparse prefetch is conditionally added only when `sparse_vector.is_empty()` is False.
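
The conditional prefetch described in the note could be assembled roughly as follows. This is a sketch, not the code in `classifier.py`: the function name `build_hybrid_query`, the vector names `"dense"` and `"sparse"`, and the final `limit` value are assumptions for illustration. Only the prefetch limit (`HYBRID_SEARCH_TOP_K * 2`) and the skip-when-empty rule come from the text.

```python
from typing import Any

HYBRID_SEARCH_TOP_K = 5  # value quoted in the note above


def build_hybrid_query(
    dense_vector: list[float],
    sparse_indices: list[int],
    sparse_values: list[float],
    top_k: int = HYBRID_SEARCH_TOP_K,
) -> dict[str, Any]:
    """Build a request body for POST /collections/<name>/points/query (hypothetical helper)."""
    prefetch_limit = top_k * 2  # 5 * 2 = 10 with the default

    # The dense prefetch is always present. "dense" is an assumed vector name.
    prefetch: list[dict[str, Any]] = [
        {"query": dense_vector, "using": "dense", "limit": prefetch_limit}
    ]

    # The sparse prefetch is added only when the sparse vector is non-empty.
    if sparse_indices:
        prefetch.append(
            {
                "query": {"indices": sparse_indices, "values": sparse_values},
                "using": "sparse",  # assumed vector name
                "limit": prefetch_limit,
            }
        )

    return {
        "prefetch": prefetch,
        "query": {"fusion": "rrf"},
        "limit": top_k,  # assumption: final result count equals top_k
        "with_payload": True,
    }


dense_only = build_hybrid_query([0.1, 0.2, 0.3], [], [])
hybrid = build_hybrid_query([0.1, 0.2, 0.3], [7, 42], [0.8, 0.5])
print(len(dense_only["prefetch"]), len(hybrid["prefetch"]))
```

With an empty sparse vector the body carries a single dense prefetch, so the fusion step has only one candidate list to rank.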
+
+Hybrid results are also deduplicated by `service_id` (best RRF score per service).
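
Deduplication by `service_id`, as described for both dense and hybrid results, can be sketched as below. The hit shape (dicts with `service_id` and `score` keys) and the function name `dedupe_by_service` are assumptions for illustration; the actual result objects may differ.

```python
from typing import Any

DENSE_SEARCH_TOP_K = 3  # value quoted in the document


def dedupe_by_service(
    hits: list[dict[str, Any]], top_k: int = DENSE_SEARCH_TOP_K
) -> list[dict[str, Any]]:
    """Keep the best-scoring hit per service_id, then return the top_k services."""
    best: dict[str, dict[str, Any]] = {}
    for hit in hits:
        service_id = hit["service_id"]
        # Replace the stored hit only if this one scores strictly higher.
        if service_id not in best or hit["score"] > best[service_id]["score"]:
            best[service_id] = hit
    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)
    return ranked[:top_k]


# Hypothetical hits: several examples can match the same service.
sample_hits = [
    {"service_id": "svc_a", "score": 0.71},
    {"service_id": "svc_b", "score": 0.64},
    {"service_id": "svc_a", "score": 0.83},
    {"service_id": "svc_c", "score": 0.40},
    {"service_id": "svc_d", "score": 0.55},
]

unique_services = dedupe_by_service(sample_hits)
print([(h["service_id"], h["score"]) for h in unique_services])
```

The same helper works for either score type, since it only compares scores within one result list.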
In streaming mode, the service content is wrapped as SSE events and streamed to the client.
+
---

## Thresholds & Configuration
@@ -387,7 +420,3 @@ Based on empirical testing with 42 Estonian queries (20 SERVICE, 22 RAG):
- **Adding more services:** Score distributions improve naturally: service queries score higher, non-service queries score lower.
- **Adding more examples per service:** Diverse phrasings expand the embedding coverage. Aim for 5-8 examples per service, covering formal and informal phrasings and different word orders.
- **Adjusting thresholds:** Monitor the logs (`Dense search: top=... cosine=...`) and adjust if real-world scores differ from the test data.
-
-### Current Limitations
-
-- **Step 7 (Ruuter service call) is not yet implemented.** The service workflow currently returns a debug response with service metadata (endpoint URL, HTTP method, extracted entities) instead of calling the actual Ruuter service endpoint. See the `TODO: STEP 7` comments in `src/tool_classifier/workflows/service_workflow.py`.