@SPatil2026
Collaborator

This PR moves the form-filling engine from legacy regex heuristics to a context-aware Smart Batch Extraction system powered by the local LLM. It improves extraction accuracy, lets users correct values in natural language, and adds loading fallbacks so the model runs on both GPU and CPU-only machines.

Key Changes

  1. Smart Batch Extraction
    Context-Aware: Replaces regex extraction with extract_all_fields, which maps user speech to the complete form schema (Fields, Types, Options) in a single pass.
    Multi-Field Support: Users can now fill multiple fields simultaneously (e.g., "My name is Sam, and my email is sam@test.com").
    Pollution Control: Implements strict parser logic to discard hallucinated fields or extraneous text.
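
For reviewers, the Pollution Control idea can be sketched roughly as below. This is a minimal illustration and not the parser in this PR: it assumes the model replies with a JSON object, and `FORM_SCHEMA` and `parse_extraction` are hypothetical names chosen for the example.

```python
import json

# Hypothetical schema shape: field name -> metadata (type, options).
FORM_SCHEMA = {
    "name": {"type": "text"},
    "email": {"type": "email"},
    "role": {"type": "select", "options": ["Student", "Alumni"]},
}


def parse_extraction(raw_output: str, schema: dict) -> dict:
    """Keep only fields that exist in the schema; drop everything else."""
    # The model may wrap its JSON in extra prose, so isolate the outermost object.
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end <= start:
        return {}
    try:
        data = json.loads(raw_output[start:end + 1])
    except json.JSONDecodeError:
        return {}
    if not isinstance(data, dict):
        return {}

    cleaned = {}
    for field, value in data.items():
        if field not in schema:
            continue  # hallucinated field: not part of the form
        if value is None or isinstance(value, (dict, list)):
            continue  # only scalar values are accepted in this sketch
        text = str(value).strip()
        if text:
            cleaned[field] = text
    return cleaned
```

With a reply such as `Sure! {"name": "Sam", "email": "sam@test.com", "favorite_color": "blue"}`, only `name` and `email` survive, because `favorite_color` is not in the schema.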

  2. Robust Local LLM Service
    Smart Loading Strategy: Uses auto-fallbacks for hardware compatibility: 4-bit (bitsandbytes) → FP16 (GPU) → CPU.
    Setup Scripts: Fixes download_models.py with dependency checks and adds setup_env.bat for easier Windows setup.
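
The 4-bit → FP16 → CPU order above amounts to trying each loader in turn and falling through on failure. The sketch below shows only that control flow, under stated assumptions: `load_with_fallback` and the three stand-in loaders are hypothetical, and the real loaders in this PR would call into the model library instead of raising canned errors.

```python
from typing import Any, Callable, Sequence, Tuple


def load_with_fallback(
    strategies: Sequence[Tuple[str, Callable[[], Any]]],
) -> Tuple[str, Any]:
    """Try each (label, loader) pair in order and return the first that succeeds."""
    errors = []
    for label, loader in strategies:
        try:
            return label, loader()
        except Exception as exc:  # any failure moves on to the next strategy
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all loading strategies failed: " + "; ".join(errors))


# Stand-in loaders for illustration only (assumptions, not the PR's code).
def load_4bit() -> str:
    raise ImportError("bitsandbytes is not installed")


def load_fp16() -> str:
    raise RuntimeError("no CUDA device available")


def load_cpu() -> str:
    return "cpu-model"


label, model = load_with_fallback(
    [("4-bit", load_4bit), ("FP16", load_fp16), ("CPU", load_cpu)]
)
```

Collecting the error from each failed attempt keeps the final exception informative when no strategy works.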

  3. Data Refinement & UX
    Smart Updates: Users can now correct existing values in natural language (e.g., changing a phone number); the Agent consults the master schema rather than only the "remaining" unfilled fields.
    Strict Formatting: ValueRefiner enforces strict rules (e.g., removing spaces from Phone, lowercasing Email) on all inputs.
    Option Matching: Improves dropdown logic to support text labels and fuzzy matching (e.g., auto-correcting "Alumina" to "Alumni").
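
As a rough illustration of the formatting and option-matching rules above, here is a minimal sketch. It is not the `ValueRefiner` from this PR: `refine_value` and `match_option` are hypothetical helpers, the field-type names are assumed, and the fuzzy match uses the standard library's `difflib`, which may differ from what the PR uses.

```python
import difflib
import re
from typing import List, Optional


def refine_value(field_type: str, value: str) -> str:
    """Apply per-type formatting rules (the type names here are assumptions)."""
    value = value.strip()
    if field_type == "phone":
        return re.sub(r"\s+", "", value)  # remove all whitespace
    if field_type == "email":
        return value.lower()
    return value


def match_option(spoken: str, options: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Map spoken text to a dropdown label: exact match first, then closest match."""
    by_lower = {opt.lower(): opt for opt in options}
    key = spoken.strip().lower()
    if key in by_lower:
        return by_lower[key]
    close = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[close[0]] if close else None
```

For example, `match_option("Alumina", ["Student", "Alumni", "Faculty"])` resolves to `"Alumni"`, and input with no sufficiently close label returns `None` so the caller can ask the user again.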

@atharvak-dev atharvak-dev merged commit aad09fd into main Jan 16, 2026
6 checks passed