🐛 Bug Description
Chatting with a knowledge base fails: the response stream emits an `'ascii' codec can't encode character '\U0001f527'` error in the browser console. The bundled health check also reports a CUDA out-of-memory failure during the sample-query step.
🔄 Steps to Reproduce
- Go to '...'
- Click on '...'
- Scroll down to '...'
- See error
✅ Expected Behavior
The chat should stream a normal answer from the knowledge base without encoding errors.
❌ Actual Behavior
When chatting with the knowledge base, the stream fails and the following error appears in the browser console (reported from `session-chat.tsx:257`):

```
STREAM EVENT: error
Object { error: "'ascii' codec can't encode character '\\U0001f527' in position 0: ordinal not in range(128)" }
```
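`\U0001f527` is the 🔧 wrench emoji, which prefixes several status messages in the backend (e.g. the reranker initialisation line in the health check below), so the failure is presumably some point in the stream pipeline serialising such a message with Python's default `ascii` codec instead of UTF-8. Where exactly the str-to-bytes conversion happens is an assumption; a minimal reproduction and the usual fix look like this:

```python
# Reproduce the failure: encoding an emoji-prefixed status message with the
# ASCII codec raises UnicodeEncodeError, matching the stream error above.
message = "\U0001f527 Initialising reranker..."
try:
    message.encode("ascii")
except UnicodeEncodeError as exc:
    print(exc)  # 'ascii' codec can't encode character '\U0001f527' in position 0 ...

# The usual fix: encode explicitly as UTF-8 wherever the payload becomes bytes.
payload = message.encode("utf-8")
print(payload[:4])  # b'\xf0\x9f\x94\xa7' - the UTF-8 bytes of U+1F527
```

If the conversion happens implicitly (e.g. in a logging handler writing to a non-UTF-8 stream), running the server with `PYTHONIOENCODING=utf-8` or a UTF-8 locale is a common workaround until the explicit encoding is fixed.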
📸 Screenshots
🖥️ Environment Information
Desktop/Server:
- OS: Ubuntu 24.04
- Python Version: 3.11.7
- Node.js Version: 24.3
- Ollama Version: 11.0
- Docker Version: NA
Browser (if web interface issue):
🏥 System Health Check
```
(base) msml@msml:/media/msml/ssd1/adarshtest/localGPT$ python system_health_check.py
🏥 RAG System Health Check
==================================================
🔍 Testing basic imports...
✅ Basic imports successful
🔍 Checking configurations...
📋 External Models: {'embedding_model': 'Qwen/Qwen3-Embedding-0.6B', 'reranker_model': 'answerdotai/answerai-colbert-small-v1', 'vision_model': 'Qwen/Qwen-VL-Chat', 'fallback_reranker': 'BAAI/bge-reranker-base'}
📋 Ollama Config: {'host': 'http://localhost:11434', 'generation_model': 'qwen3:8b', 'enrichment_model': 'qwen3:0.6b'}
📋 Pipeline Configs: {'default': {'description': 'Production-ready pipeline with hybrid search, AI reranking, and verification', 'storage': {'lancedb_uri': './lancedb', 'text_table_name': 'text_pages_v3', 'image_table_name': 'image_pages_v3', 'bm25_path': './index_store/bm25', 'graph_path': './index_store/graph/knowledge_graph.gml'}, 'retrieval': {'retriever': 'multivector', 'search_type': 'hybrid', 'late_chunking': {'enabled': True, 'table_suffix': '_lc_v3'}, 'dense': {'enabled': True, 'weight': 0.7}, 'bm25': {'enabled': True, 'index_name': 'rag_bm25_index'}, 'graph': {'enabled': False, 'graph_path': './index_store/graph/knowledge_graph.gml'}}, 'embedding_model_name': 'Qwen/Qwen3-Embedding-0.6B', 'vision_model_name': 'Qwen/Qwen-VL-Chat', 'reranker': {'enabled': True, 'type': 'ai', 'strategy': 'rerankers-lib', 'model_name': 'answerdotai/answerai-colbert-small-v1', 'top_k': 10}, 'query_decomposition': {'enabled': True, 'max_sub_queries': 3, 'compose_from_sub_answers': True}, 'verification': {'enabled': True}, 'retrieval_k': 20, 'context_window_size': 0, 'semantic_cache_threshold': 0.98, 'cache_scope': 'global', 'contextual_enricher': {'enabled': True, 'window_size': 1}, 'indexing': {'embedding_batch_size': 50, 'enrichment_batch_size': 10, 'enable_progress_tracking': True}}, 'fast': {'description': 'Speed-optimized pipeline with minimal overhead', 'storage': {'lancedb_uri': './lancedb', 'text_table_name': 'text_pages_v3', 'image_table_name': 'image_pages_v3', 'bm25_path': './index_store/bm25'}, 'retrieval': {'retriever': 'multivector', 'search_type': 'vector_only', 'late_chunking': {'enabled': False}, 'dense': {'enabled': True}}, 'embedding_model_name': 'Qwen/Qwen3-Embedding-0.6B', 'reranker': {'enabled': False}, 'query_decomposition': {'enabled': False}, 'verification': {'enabled': False}, 'retrieval_k': 10, 'context_window_size': 0, 'contextual_enricher': {'enabled': False, 'window_size': 1}, 'indexing': {'embedding_batch_size': 100, 'enrichment_batch_size': 50, 'enable_progress_tracking': False}}, 'bm25': {'enabled': True, 'index_name': 'rag_bm25_index'}, 'graph_rag': {'enabled': False}}
📏 Embedding model: Qwen/Qwen3-Embedding-0.6B (1024 dims) - Check data compatibility!
✅ Configuration check completed
🔍 Testing database access...
✅ LanceDB connected - 2 tables available
📋 Available tables:
  - text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6
  - text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6_lc
🔍 Testing agent initialization...
Initialized Verifier with Ollama model 'qwen3:8b'.
Agent initialized (GraphRAG disabled).
✅ Agent initialization successful
🔍 Testing embedding model...
Initializing HF Embedder with model 'Qwen/Qwen3-Embedding-0.6B' on device 'cuda'. (first load)
QwenEmbedder weights loaded and cached for Qwen/Qwen3-Embedding-0.6B.
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
✅ Embedding model: Qwen/Qwen3-Embedding-0.6B
✅ Vector dimension: 1024
📏 Using 1024-dim embeddings (Qwen3 compatible) - Ensure data compatibility!
🔍 Testing sample query...
🔍 Testing query on table: text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6
🔍 ROUTING DEBUG: Starting triage for query: 'what is this document about?...'
🔍 ROUTING DEBUG: Attempting overview-based routing...
🔍 ROUTING DEBUG: No document overviews available, returning None
⚠️ ROUTING DEBUG: Overview routing returned None, falling back to LLM triage
🤖 ROUTING DEBUG: No history, using LLM fallback triage...
🤖 ROUTING DEBUG: LLM fallback triage decided: 'rag_query'
🎯 ROUTING DEBUG: Final triage decision: 'rag_query'
Agent Triage Decision: 'rag_query'
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
✅ ROUTING DEBUG: Executing RAG_QUERY path (query_type='rag_query')
--- Query Decomposition Enabled ---
Query Decomposition Reasoning: Single information need; no pronouns or ambiguous references.
Original query: 'what is this document about?' (Contextual: 'what is this document about?')
Decomposed into 1 sub-queries: ['what is this document about?']
--- Only one sub-query after decomposition; using direct retrieval path ---
LanceDB connection established at: ./lancedb
--- Performing Retrieval for query: 'what is this document about?' on table 'text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6' ---
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
2025-09-06 14:09:40,405 | INFO | rag-system | Top 20 results:
2025-09-06 14:09:40,405 | INFO | rag-system | chunk_id score preview
2025-09-06 14:09:40,405 | INFO | rag-system | ------------------------------
2025-09-06 14:09:40,405 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context highlights that the form for the final…
2025-09-06 14:09:40,405 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context highlights the GOST compensation…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summarizes the Finance Act 2024's…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The chunk outlines the proper officer's role in…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The local context highlights digital signature…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The chunk discusses registration periods,…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summary is about the Authority for…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summary is: The chunk discusses the…
2025-09-06 14:09:40,407 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summary is about sharing details under…
2025-09-06 14:09:40,407 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The chunk discusses regulations for handling…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 7056.611 Context: The context summarizes the specific goods listed…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 7536.421 Context: The context summarizes that electronic…
2025-09-06 14:09:40,407 | INFO | rag-system | a4e76ee3-6c3 7635.182 Context: The specific chunk discusses the Screening…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 7964.705 Context: The context summary is: A company transferring…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 8244.507 Context: The context summarizes that the composition tax…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8336.688 Context: The context outlines the structure and composition…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8365.038 Context: The chunk discusses rectifying tax details under…
2025-09-06 14:09:40,408 | INFO | rag-system | f660d12c-80f 8530.757 Context: The context summary is about refund procedures for…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8545.204 Context: The chunk clarifies that the aggregate value of…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8654.028 Context: The chunk discusses the Central Government's…
Retrieved 20 documents.
🔧 Initialising Answer.AI ColBERT reranker (answerdotai/answerai-colbert-small-v1) via rerankers lib…
Loading ColBERTRanker model answerdotai/answerai-colbert-small-v1 (this message can be suppressed by setting verbose=0)
No device set
Using device cuda
No dtype set
Using dtype torch.float32
Loading model answerdotai/answerai-colbert-small-v1, this might take a while...
tokenizer_config.json: 1.24kB [00:00, 11.3MB/s]
vocab.txt: 232kB [00:00, 19.1MB/s]
tokenizer.json: 711kB [00:00, 103MB/s]
special_tokens_map.json: 100%|██████████| 695/695 [00:00<00:00, 15.6MB/s]
config.json: 100%|██████████| 702/702 [00:00<00:00, 13.6MB/s]
model.safetensors: 100%|██████████| 134M/134M [00:09<00:00, 13.7MB/s]
Linear Dim set to: 96 for downcasting
✅ AI reranker initialized successfully.
--- Reranking top 20 docs with AI model... ---
❌ Sample query failed: CUDA out of memory. Tried to allocate 16.00 MiB. GPU 0 has a total capacity of 23.65 GiB of which 12.50 MiB is free. Process 38427 has 17.54 GiB memory in use. Including non-PyTorch memory, this process has 1.75 GiB memory in use. Process 128744 has 4.33 GiB memory in use. Of the allocated memory 1.27 GiB is allocated by PyTorch, and 27.98 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
==================================================
🏥 Health Check Complete: 5/6 checks passed
System mostly healthy with minor issues
```
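The sample-query failure is an allocator-level CUDA OOM: two other processes already hold 17.54 GiB + 4.33 GiB of the 24 GB card, so even a 16 MiB allocation fails. The error message itself suggests `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`, which must be set before CUDA is initialised; falling back to CPU when the GPU has no headroom is a further mitigation I'd expect to help here (the 512 MiB threshold below is an illustrative assumption, not a value from the project):

```python
import os

# Must be set before the first CUDA allocation (ideally before importing torch),
# as recommended in the OOM message above.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

def pick_device(min_free_bytes: int = 512 * 1024**2) -> str:
    """Return 'cuda' only if the GPU has enough free memory, else 'cpu'."""
    try:
        import torch
        if torch.cuda.is_available():
            free, _total = torch.cuda.mem_get_info()  # (free, total) in bytes
            if free > min_free_bytes:
                return "cuda"
    except ImportError:
        pass  # torch not installed: CPU is the only option
    return "cpu"
```

Freeing the other processes on GPU 0 (e.g. the 17.5 GiB one, visible via `nvidia-smi`) would of course also resolve the immediate failure.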
🔧 Configuration
- Deployment method: Direct Python
- Models used: qwen3:0.6b, qwen3:8b
- Document types: PDF