Description:
Implement the flow that processes user queries locally with LLaMA 3: input arrives from the chat → the backend builds context → the model runs locally → a response is generated → cleaned → stored → sent back to the frontend.
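A minimal sketch of that flow as a single processing layer, assuming a Python backend; the helper names are illustrative stubs that the tasks below fill in:

```python
# query_processor.py — orchestration sketch; helper names are illustrative stubs
def build_context(session_id: str, message: str) -> str: ...   # prompt construction
def run_inference(prompt: str) -> str: ...                     # local LLaMA 3 call
def clean_response(raw: str) -> str: ...                       # response processing
def store_message(session_id: str, user_msg: str, reply: str) -> None: ...  # session store

def process_query(session_id: str, user_message: str) -> str:
    """Chat input -> context -> local inference -> cleanup -> storage -> reply."""
    prompt = build_context(session_id, user_message)
    raw = run_inference(prompt)
    reply = clean_response(raw)
    store_message(session_id, user_message, reply)
    return reply
```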
User Story
Given a user sends a message
When the backend receives it
Then the query should be processed locally using LLaMA 3 and a contextual response returned
Tasks
Local Model Execution Setup
- Ensure Local Model is Running
- Set Execution Environment
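One way to verify the model is actually up before serving traffic; this assumes LLaMA 3 is served through Ollama's HTTP API on its default port (adjust if llama.cpp or vLLM is used instead):

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # assumed default Ollama address

def ensure_model_available(model: str = "llama3") -> bool:
    """Return True if the local server is running and the model is pulled."""
    try:
        resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
        resp.raise_for_status()
    except requests.RequestException:
        return False  # server not reachable
    names = [m["name"] for m in resp.json().get("models", [])]
    return any(name.startswith(model) for name in names)
```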
Query Processing Pipeline
- Define Input Flow
- Create Processing Layer in /app/services/query_processor.py
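The input flow could be pinned down with explicit request/response models; this sketch assumes FastAPI with Pydantic, and the field names are illustrative:

```python
from fastapi import FastAPI
from pydantic import BaseModel

from app.services.query_processor import process_query  # module named in this task

app = FastAPI()

class ChatRequest(BaseModel):
    session_id: str
    message: str

class ChatResponse(BaseModel):
    reply: str

@app.post("/chat/message", response_model=ChatResponse)
def chat_message(req: ChatRequest) -> ChatResponse:
    # Hand the raw message to the processing layer and return its cleaned reply
    return ChatResponse(reply=process_query(req.session_id, req.message))
```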
Prompt Construction
- Build Structured Prompt
- Context Filtering
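A sketch of structured prompt building with simple context filtering; the recency-based truncation rule and the turn cap are assumptions:

```python
SYSTEM_PROMPT = "You are a helpful assistant. Answer using the conversation context."
MAX_TURNS = 10  # assumed cap for context filtering

def build_prompt(history: list[dict], user_message: str) -> str:
    """Compose system prompt + filtered history + new message into one prompt."""
    recent = history[-MAX_TURNS:]  # context filtering: keep only recent turns
    lines = [f"System: {SYSTEM_PROMPT}"]
    lines += [f"{turn['role'].capitalize()}: {turn['content']}" for turn in recent]
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")
    return "\n".join(lines)
```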
Local Inference Execution
- Run Model Inference
- Control Output Quality
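Running inference against the local model with output-quality controls (temperature, top-p, response length cap); this assumes Ollama's /api/generate endpoint:

```python
import requests

def run_inference(prompt: str, model: str = "llama3") -> str:
    """Single non-streaming generation against the local Ollama server."""
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": prompt,
            "stream": False,
            "options": {              # output-quality controls
                "temperature": 0.7,   # lower = more deterministic
                "top_p": 0.9,
                "num_predict": 512,   # cap response length
            },
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```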
Response Processing
- Clean Model Output
- Post-Processing Rules
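Cleaning can start small; the specific rules below (drop leaked special tokens, collapse blank lines, trim whitespace) are assumptions to refine against real output:

```python
import re

SPECIAL_TOKENS = ("<|begin_of_text|>", "<|eot_id|>", "</s>")  # assumed token set

def clean_response(raw: str) -> str:
    """Apply post-processing rules to raw model output."""
    text = raw
    for token in SPECIAL_TOKENS:
        text = text.replace(token, "")         # drop leaked special tokens
    text = re.sub(r"\n{3,}", "\n\n", text)     # collapse runs of blank lines
    return text.strip()                        # trim surrounding whitespace
```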
Session Integration
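A minimal in-memory session store for the sketch; a real backend would likely persist history to the project's database instead (an assumption):

```python
from collections import defaultdict

# session_id -> ordered list of {"role": ..., "content": ...} turns
_sessions: dict[str, list[dict]] = defaultdict(list)

def store_message(session_id: str, user_msg: str, reply: str) -> None:
    """Append both sides of the exchange to the session history."""
    _sessions[session_id].append({"role": "user", "content": user_msg})
    _sessions[session_id].append({"role": "assistant", "content": reply})

def get_history(session_id: str) -> list[dict]:
    return _sessions[session_id]
```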
Performance Optimization
Error Handling
Logging & Monitoring
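Error handling and logging could wrap the pipeline so model failures surface as clean API errors and each request logs its latency; the log fields are illustrative:

```python
import logging
import time

import requests
from fastapi import HTTPException

from app.services.query_processor import process_query  # assumed module path

logger = logging.getLogger("query_processor")

def safe_process_query(session_id: str, message: str) -> str:
    """Run the pipeline, logging latency and converting failures to HTTP errors."""
    start = time.perf_counter()
    try:
        reply = process_query(session_id, message)
    except requests.RequestException as exc:
        logger.error("local model unreachable: %s", exc)
        raise HTTPException(status_code=503, detail="Local model unavailable")
    logger.info("session=%s latency_ms=%.0f", session_id,
                (time.perf_counter() - start) * 1000)
    return reply
```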
Postman Testing 🧪
- Test the /chat/message endpoint
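The Postman check can be mirrored in code; this assumes the backend runs on localhost:8000 and the request shape from the input-flow sketch above:

```python
import requests

resp = requests.post(
    "http://localhost:8000/chat/message",  # assumed local dev address
    json={"session_id": "test-1", "message": "What does this project do?"},
    timeout=60,
)
print(resp.status_code)      # expect 200
print(resp.json()["reply"])  # contextual reply from local LLaMA 3
```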
Frontend Integration
Acceptance Criteria
Testing Steps
Definition of Done