fix: resolve JSON parsing issues and implement RRF fusion #24
Summary of Changes
I have implemented several fixes to improve the backend's stability and search result ranking.
Key Updates:
Robust JSON Parsing: Added a clean_and_parse_json utility to handle cases where the LLM returns markdown formatting or extra text. This prevents the system from crashing during keyword and intent extraction.
Search Ranking (RRF): Integrated Reciprocal Rank Fusion (RRF) logic to merge results from Knowledge Space and Vector searches. Each dataset is scored by its rank in every result list it appears in, so datasets ranked highly by either source, and especially by both, are prioritized in the final response.
Lightweight Testing (Torch Bypass): Added an is_enabled flag and commented out the Retriever import in VectorSearchAgent. This allows for local testing and API validation without needing to download heavy torch dependencies.
Intent Handling: Refined the logic to better distinguish between general greetings and actual data discovery queries, preventing unnecessary search triggers.
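For reviewers, here is a minimal sketch of the kind of cleanup the JSON parsing item describes. It is an illustration under assumptions, not the exact code in this PR: only the name clean_and_parse_json comes from the change itself, while the signature, the `default` parameter, and the fallback strategy are hypothetical.

```python
import json
import re
from typing import Any

# The fence marker is built indirectly so this snippet can itself sit inside a fenced block.
_FENCE = "`" * 3
_FENCE_RE = re.compile(_FENCE + r"(?:json)?\s*(.*?)" + _FENCE, re.DOTALL | re.IGNORECASE)


def clean_and_parse_json(raw: str, default: Any = None) -> Any:
    """Parse JSON from an LLM reply that may contain markdown fences or extra text.

    Hypothetical signature: returns `default` instead of raising when nothing parses.
    """
    text = raw.strip()

    # 1. If the reply contains a fenced block, keep only its contents.
    fence = _FENCE_RE.search(text)
    if fence:
        text = fence.group(1).strip()

    # 2. Try the cleaned text as-is.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 3. Fall back to the first position where a JSON object or array decodes.
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                value, _end = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                continue

    return default
```

Returning a default rather than raising is one way to keep keyword and intent extraction from crashing on a malformed reply; the caller can then treat a missing result as "no keywords found".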
Verification: The backend has been verified using the FastAPI Swagger UI. The API returns a 200 OK status, and the fusion logic correctly processes search results even with the vector bypass active.
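To make the fusion behavior concrete, the sketch below shows standard Reciprocal Rank Fusion, where each item scores the sum of 1 / (k + rank) over the lists it appears in. The function name rrf_fuse, the dataset IDs, and k = 60 (a commonly used default) are assumptions for illustration and may not match the values in this PR. It also assumes each result list contains an ID at most once.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Merge several ranked lists of IDs into one list using Reciprocal Rank Fusion."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top item of each list contributes 1 / (k + 1).
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order because sorted() is stable.
    return sorted(scores, key=lambda item_id: scores[item_id], reverse=True)


# Hypothetical dataset IDs from the two search sources.
knowledge_space_hits = ["ds_a", "ds_b", "ds_c"]
vector_hits = ["ds_b", "ds_d"]

fused = rrf_fuse([knowledge_space_hits, vector_hits])
# fused == ["ds_b", "ds_a", "ds_d", "ds_c"]: ds_b appears in both lists, so it ranks first.

# With the vector search bypassed (empty list), the Knowledge Space order is preserved.
bypassed = rrf_fuse([knowledge_space_hits, []])
# bypassed == ["ds_a", "ds_b", "ds_c"]
```

Because an empty list contributes nothing to any score, this formulation degrades gracefully when one source is disabled, which is consistent with the vector-bypass check described above.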
Closes #8