
BE-16: Message Routing to Model Research #16

@tecnodeveloper

Description

Research how user messages travel from frontend → backend → LLaMA 3 model → response → frontend. Define clean routing logic so that every message is processed correctly, conversation context is preserved, and responses are returned quickly in a structured format.


User Story

Given a user sends a message in chat
When the backend receives it
Then the message should be routed to the model with the correct session context and a response returned


Tasks


Understand Message Flow

  1. Study Full Chat Pipeline

    • Frontend sends message
    • Backend receives request
    • Session is loaded
    • Context is built
    • Model generates response
    • Response stored + returned
  2. Define Routing Layers

    • Route (API endpoint)
    • Controller (request handler)
    • Service (logic layer)
    • Model layer (LLaMA call)

API Design for Message Routing

  1. Create Chat Endpoint

    • POST /chat/message

    • Accept:

      • user_id
      • session_id
      • message
  2. Validate Request

    • Check empty message
    • Validate session_id
    • Validate user authentication
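
A minimal validation sketch for the three checks above, assuming the request body is already parsed into a dict. The auth check is simplified to presence of `user_id`; a real backend would verify a token or session cookie instead.

```python
def validate_chat_request(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the request is valid."""
    errors = []
    if not payload.get("message", "").strip():
        errors.append("message must not be empty")
    if not payload.get("session_id"):
        errors.append("session_id is required")
    if not payload.get("user_id"):  # stands in for a real authentication check
        errors.append("user is not authenticated")
    return errors
```

Returning all errors at once (instead of failing on the first) gives the frontend enough detail to show a useful message.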

Session Integration

  1. Fetch Session Context

    • Load session from MongoDB
    • Get last N messages
    • Prepare conversation history
  2. Context Formatting

    • Convert messages into prompt format
    • Maintain role structure (user/assistant)
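
A sketch of the context-formatting step: take the stored messages, keep only the last N, and append the newest user turn while preserving the user/assistant role structure. `build_context` and the message-dict shape are assumptions about how messages are stored.

```python
def build_context(stored: list[dict], latest: str, max_messages: int = 10) -> list[dict]:
    """Trim history to the last `max_messages` and append the newest user turn."""
    history = [
        {"role": m["role"], "content": m["content"]}
        for m in stored[-max_messages:]
    ]
    history.append({"role": "user", "content": latest})
    return history
```

The `max_messages` cap doubles as the latency control described under Performance Optimization: a bounded context keeps prompt size (and inference time) predictable.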

Model Integration (LLaMA 3)

  1. Connect to Local Model

    • Load local LLaMA 3
    • Create inference function
  2. Send Prompt to Model

    • Pass formatted context
    • Include latest user message
  3. Receive Model Response

    • Capture generated output
    • Clean response text
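
The send/receive/clean steps might look like this. `generate_reply` assumes a chat-style `create_chat_completion` method, as exposed by llama-cpp-python; adjust for whatever runtime actually hosts the local model. Only `clean_response` runs standalone here.

```python
def clean_response(raw: str) -> str:
    """Strip whitespace and any role prefix the model may echo back."""
    text = raw.strip()
    if text.lower().startswith("assistant:"):
        text = text[len("assistant:"):].strip()
    return text


def generate_reply(context: list[dict], llm) -> str:
    """Send the formatted context (role-structured messages) to the local
    LLaMA 3 and return the cleaned output text."""
    out = llm.create_chat_completion(messages=context)
    return clean_response(out["choices"][0]["message"]["content"])


print(clean_response("  Assistant: hello there  "))  # → hello there
```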

Routing Logic Design

  1. Controller Layer

    • Receive request
    • Call service layer
    • Return response
  2. Service Layer

    • Handle session fetch
    • Build prompt
    • Call model
    • Save response
  3. Model Layer

    • Only handles inference
    • No business logic

Response Handling

  1. Store Response

    • Save assistant message in MongoDB
    • Update session
  2. Return Response

    • Send JSON response
    • Include message + session_id
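
A sketch of store-and-return, using an in-memory dict as a stand-in for the MongoDB session collection (in production this would be an `update_one` with `$push` on the session document). The function name and record shape are assumptions.

```python
import time


def store_and_build_response(sessions: dict, session_id: str, reply: str) -> dict:
    """Append the assistant message to the session record and build the
    JSON payload returned to the frontend."""
    record = sessions.setdefault(session_id, {"messages": []})
    record["messages"].append(
        {"role": "assistant", "content": reply, "ts": time.time()}
    )
    return {"session_id": session_id, "message": reply}


sessions: dict = {}
print(store_and_build_response(sessions, "s1", "hello"))
```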

Performance Optimization

  1. Reduce Latency

    • Limit context size
    • Use last N messages only
    • Optimize model calls
  2. Async Processing (Optional)

    • Use background tasks
    • Non-blocking API response
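
One way to sketch the optional async path with `asyncio`: schedule inference as a background task so the handler can acknowledge the request without blocking. The sleep stands in for the slow model call; a real implementation would run LLaMA 3 in an executor or worker process, and the frontend would poll or listen on a websocket for the reply.

```python
import asyncio


async def infer(message: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for the slow model call
    return f"reply to {message}"


async def handle_message(message: str) -> asyncio.Task:
    """Schedule inference as a background task and return immediately."""
    return asyncio.create_task(infer(message))


async def main() -> str:
    task = await handle_message("hi")
    # ...the API could respond with an acknowledgement here; the reply arrives later:
    return await task


print(asyncio.run(main()))  # → reply to hi
```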

Error Handling

  1. Handle Failures

    • Model not responding
    • Session missing
    • Invalid input
  2. Fallback Strategy

    • Return safe error message
    • Log issue
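
The fallback strategy can be a thin wrapper around the model call: on any failure, log the full traceback and return a safe, user-facing message instead of a stack trace. `safe_generate` and the response shape are illustrative names.

```python
import logging

logger = logging.getLogger("chat")


def safe_generate(generate, message: str) -> dict:
    """Call the model; on failure, log the issue and return a safe fallback."""
    try:
        return {"ok": True, "message": generate(message)}
    except Exception:
        logger.exception("model call failed for message %r", message)
        return {
            "ok": False,
            "message": "Sorry, something went wrong. Please try again.",
        }
```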

Logging & Debugging

  1. Request Logging

    • Log user message
    • Log session_id
    • Log model response
  2. Error Logging

    • Track failures
    • Save debug info
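
For request logging, one structured line per request makes a conversation easy to reconstruct during debugging. This sketch emits JSON and returns a correlation id; the field names are assumptions.

```python
import json
import logging
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")


def log_request(session_id: str, user_message: str, model_response: str) -> str:
    """Emit one structured log line per chat request; return the request id."""
    request_id = str(uuid.uuid4())
    logging.info(json.dumps({
        "request_id": request_id,
        "session_id": session_id,
        "user_message": user_message,
        "model_response": model_response,
    }))
    return request_id
```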

Postman Testing 🧪

  1. Set Up Postman

    • Create a POST request to /chat/message
  2. Test Message Flow

    • Send sample message
    • Check response
    • Verify session update
  3. Edge Cases

    • Empty message
    • Invalid session
    • Large input
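
The same request can be scripted for repeatable testing alongside Postman. This builds (but does not send) the POST request with the standard library; the base URL and port are assumptions about the local dev setup.

```python
import json
import urllib.request


def build_chat_request(base_url: str, user_id: str, session_id: str,
                       message: str) -> urllib.request.Request:
    """Construct the POST /chat/message request used in manual testing."""
    body = json.dumps({
        "user_id": user_id,
        "session_id": session_id,
        "message": message,
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/message",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("http://localhost:8000", "u1", "s1", "hi")
print(req.full_url, req.method)
# Sending it is one line: urllib.request.urlopen(req)
```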

Frontend Integration

  1. Connect Chat UI

    • Send message on button click
    • Show loading state
  2. Render Response

    • Append bot reply
    • Auto-scroll chat

Acceptance Criteria

  • Message routed correctly end-to-end
  • Session context used properly
  • LLaMA 3 returns response
  • MongoDB updated
  • Postman testing complete
  • Frontend integration working

Testing Steps

  1. Run backend
  2. Send message via Postman
  3. Check model response
  4. Verify MongoDB update
  5. Test frontend chat
  6. Validate context continuity

Definition of Done

  • Message routing fully implemented
  • Model integration working
  • Session-aware chat working
  • Clean API structure
