Local LLM phi-2 #4
Merged
This PR moves the form-filling engine from legacy heuristics to a Context-Aware Smart Batch Extraction system powered by the Local LLM. The change is intended to improve extraction accuracy, enable natural-language corrections, and let the model load on a range of hardware, from 4-bit-capable GPUs down to CPU-only machines.
Key Changes
Smart Batch Extraction
Context-Aware: Replaces regex extraction with extract_all_fields, which maps user speech to the complete form schema (Fields, Types, Options) in a single pass.
Multi-Field Support: Users can now fill multiple fields simultaneously (e.g., "My name is Sam, and my email is sam@test.com").
Pollution Control: Implements strict parser logic to discard hallucinated fields or extraneous text.
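The pollution-control step can be illustrated with a minimal sketch. It assumes the LLM is prompted to answer with a JSON object keyed by field name; that output format, the SCHEMA dictionary, and the parse_extraction helper are hypothetical and are not taken from this PR's extract_all_fields implementation.

```python
import json

# Hypothetical form schema: field name -> type and (for dropdowns) options.
SCHEMA = {
    "name": {"type": "text"},
    "email": {"type": "email"},
    "role": {"type": "select", "options": ["Student", "Alumni", "Staff"]},
}


def parse_extraction(raw_output: str, schema: dict) -> dict:
    """Return only schema fields with non-empty string values.

    Assumes the model was asked to reply with one JSON object. Text before
    or after the object is ignored; unknown keys are discarded.
    """
    start = raw_output.find("{")
    if start == -1:
        return {}
    try:
        # raw_decode parses one JSON value and ignores anything after it.
        data, _ = json.JSONDecoder().raw_decode(raw_output, start)
    except json.JSONDecodeError:
        return {}
    if not isinstance(data, dict):
        return {}

    clean = {}
    for key, value in data.items():
        # Drop hallucinated fields and empty or non-string values.
        if key in schema and isinstance(value, str) and value.strip():
            clean[key] = value.strip()
    return clean


raw = (
    'Sure! Here is the result: {"name": "Sam", "email": "sam@test.com", '
    '"favorite_color": "blue", "role": ""} Hope that helps.'
)
extracted = parse_extraction(raw, SCHEMA)
```

In this sketch a single utterance fills both name and email, while the invented favorite_color key and the empty role value are dropped before anything reaches the form.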
Robust Local LLM Service
Smart Loading Strategy: Tries each loading mode in order and falls back automatically when one is unavailable on the current hardware: 4-bit (bitsandbytes) → FP16 (GPU) → CPU.
Setup Scripts: Fixes download_models.py with dependency checks and adds setup_env.bat for easier Windows setup.
Data Refinement & UX
Smart Updates: Users can now correct an existing value in natural language (e.g., changing a phone number). The Agent matches input against the master schema rather than only the "remaining" (unfilled) fields, so already-filled fields can be overwritten.
Strict Formatting: ValueRefiner enforces strict rules (e.g., removing spaces from Phone, lowercasing Email) on all inputs.
Option Matching: Improves dropdown logic to support text labels and fuzzy matching (e.g., auto-correcting "Alumina" to "Alumni").
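The formatting and option-matching rules above can be sketched with the standard library. The function names, the field-type strings, and the 0.6 similarity cutoff are assumptions chosen for illustration; they are not the PR's actual ValueRefiner API.

```python
import difflib
import re
from typing import List, Optional


def refine_value(field_type: str, value: str) -> str:
    """Apply per-type formatting rules to a raw extracted value."""
    value = value.strip()
    if field_type == "phone":
        return re.sub(r"\s+", "", value)  # remove all whitespace
    if field_type == "email":
        return value.lower()
    return value


def match_option(spoken: str, options: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Map a spoken label to the closest dropdown option, or None."""
    by_lower = {opt.lower(): opt for opt in options}
    hits = difflib.get_close_matches(
        spoken.strip().lower(), list(by_lower), n=1, cutoff=cutoff
    )
    # Return the option with its original capitalization.
    return by_lower[hits[0]] if hits else None


phone = refine_value("phone", " 555 123 4567 ")
email = refine_value("email", "Sam@Test.COM")
role = match_option("Alumina", ["Student", "Alumni", "Staff"])
```

Here the misheard "Alumina" resolves to the "Alumni" option, and an input that resembles no option returns None so the caller can ask the user again instead of guessing.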