🐛 Bug Description
Chatting with a knowledge base fails: the response stream emits an `'ascii' codec can't encode character '\U0001f527'` error in the browser console. The bundled health check also reports a CUDA out-of-memory failure during the sample-query step.
🔄 Steps to Reproduce
- Go to '...'
- Click on '...'
- Scroll down to '...'
- See error
✅ Expected Behavior
The chat should stream a normal answer from the knowledge base without encoding errors.
❌ Actual Behavior
When chatting with the knowledge base, the stream fails and the following error appears in the browser console (reported from `session-chat.tsx:257`):

```
STREAM EVENT: error
Object { error: "'ascii' codec can't encode character '\\U0001f527' in position 0: ordinal not in range(128)" }
```
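`\U0001f527` is the 🔧 wrench emoji, which prefixes several status messages in the backend (e.g. the reranker initialisation line in the health check below), so the failure is presumably some point in the stream pipeline serialising such a message with Python's default `ascii` codec instead of UTF-8. Where exactly the str-to-bytes conversion happens is an assumption; a minimal reproduction and the usual fix look like this:

```python
# Reproduce the failure: encoding an emoji-prefixed status message with the
# ASCII codec raises UnicodeEncodeError, matching the stream error above.
message = "\U0001f527 Initialising reranker..."
try:
    message.encode("ascii")
except UnicodeEncodeError as exc:
    print(exc)  # 'ascii' codec can't encode character '\U0001f527' in position 0 ...

# The usual fix: encode explicitly as UTF-8 wherever the payload becomes bytes.
payload = message.encode("utf-8")
print(payload[:4])  # b'\xf0\x9f\x94\xa7' - the UTF-8 bytes of U+1F527
```

If the conversion happens implicitly (e.g. in a logging handler writing to a non-UTF-8 stream), running the server with `PYTHONIOENCODING=utf-8` or a UTF-8 locale is a common workaround until the explicit encoding is fixed.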
📸 Screenshots
🖥️ Environment Information
Desktop/Server:
- OS: Ubuntu 24.04
- Python Version: 3.11.7
- Node.js Version: 24.3
- Ollama Version: 11.0
- Docker Version: NA
Browser (if web interface issue):
🏥 System Health Check
```
(base) msml@msml:/media/msml/ssd1/adarshtest/localGPT$ python system_health_check.py
🏥 RAG System Health Check
==================================================
🔍 Testing basic imports...
✅ Basic imports successful
🔍 Checking configurations...
📋 External Models: {'embedding_model': 'Qwen/Qwen3-Embedding-0.6B', 'reranker_model': 'answerdotai/answerai-colbert-small-v1', 'vision_model': 'Qwen/Qwen-VL-Chat', 'fallback_reranker': 'BAAI/bge-reranker-base'}
📋 Ollama Config: {'host': 'http://localhost:11434', 'generation_model': 'qwen3:8b', 'enrichment_model': 'qwen3:0.6b'}
📋 Pipeline Configs: {'default': {'description': 'Production-ready pipeline with hybrid search, AI reranking, and verification', 'storage': {'lancedb_uri': './lancedb', 'text_table_name': 'text_pages_v3', 'image_table_name': 'image_pages_v3', 'bm25_path': './index_store/bm25', 'graph_path': './index_store/graph/knowledge_graph.gml'}, 'retrieval': {'retriever': 'multivector', 'search_type': 'hybrid', 'late_chunking': {'enabled': True, 'table_suffix': '_lc_v3'}, 'dense': {'enabled': True, 'weight': 0.7}, 'bm25': {'enabled': True, 'index_name': 'rag_bm25_index'}, 'graph': {'enabled': False, 'graph_path': './index_store/graph/knowledge_graph.gml'}}, 'embedding_model_name': 'Qwen/Qwen3-Embedding-0.6B', 'vision_model_name': 'Qwen/Qwen-VL-Chat', 'reranker': {'enabled': True, 'type': 'ai', 'strategy': 'rerankers-lib', 'model_name': 'answerdotai/answerai-colbert-small-v1', 'top_k': 10}, 'query_decomposition': {'enabled': True, 'max_sub_queries': 3, 'compose_from_sub_answers': True}, 'verification': {'enabled': True}, 'retrieval_k': 20, 'context_window_size': 0, 'semantic_cache_threshold': 0.98, 'cache_scope': 'global', 'contextual_enricher': {'enabled': True, 'window_size': 1}, 'indexing': {'embedding_batch_size': 50, 'enrichment_batch_size': 10, 'enable_progress_tracking': True}}, 'fast': {'description': 'Speed-optimized pipeline with minimal overhead', 'storage': {'lancedb_uri': './lancedb', 'text_table_name': 'text_pages_v3', 'image_table_name': 'image_pages_v3', 'bm25_path': './index_store/bm25'}, 'retrieval': {'retriever': 'multivector', 'search_type': 'vector_only', 'late_chunking': {'enabled': False}, 'dense': {'enabled': True}}, 'embedding_model_name': 'Qwen/Qwen3-Embedding-0.6B', 'reranker': {'enabled': False}, 'query_decomposition': {'enabled': False}, 'verification': {'enabled': False}, 'retrieval_k': 10, 'context_window_size': 0, 'contextual_enricher': {'enabled': False, 'window_size': 1}, 'indexing': {'embedding_batch_size': 100, 'enrichment_batch_size': 50, 'enable_progress_tracking': False}}, 'bm25': {'enabled': True, 'index_name': 'rag_bm25_index'}, 'graph_rag': {'enabled': False}}
📏 Embedding model: Qwen/Qwen3-Embedding-0.6B (1024 dims) - Check data compatibility!
✅ Configuration check completed
🔍 Testing database access...
✅ LanceDB connected - 2 tables available
📋 Available tables:
  - text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6
  - text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6_lc
🔍 Testing agent initialization...
Initialized Verifier with Ollama model 'qwen3:8b'.
Agent initialized (GraphRAG disabled).
✅ Agent initialization successful
🔍 Testing embedding model...
Initializing HF Embedder with model 'Qwen/Qwen3-Embedding-0.6B' on device 'cuda'. (first load)
QwenEmbedder weights loaded and cached for Qwen/Qwen3-Embedding-0.6B.
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
✅ Embedding model: Qwen/Qwen3-Embedding-0.6B
✅ Vector dimension: 1024
📏 Using 1024-dim embeddings (Qwen3 compatible) - Ensure data compatibility!
🔍 Testing sample query...
🔍 Testing query on table: text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6
🔍 ROUTING DEBUG: Starting triage for query: 'what is this document about?...'
🔍 ROUTING DEBUG: Attempting overview-based routing...
🔍 ROUTING DEBUG: No document overviews available, returning None
⚠️ ROUTING DEBUG: Overview routing returned None, falling back to LLM triage
🤖 ROUTING DEBUG: No history, using LLM fallback triage...
🤖 ROUTING DEBUG: LLM fallback triage decided: 'rag_query'
🎯 ROUTING DEBUG: Final triage decision: 'rag_query'
Agent Triage Decision: 'rag_query'
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
✅ ROUTING DEBUG: Executing RAG_QUERY path (query_type='rag_query')
--- Query Decomposition Enabled ---
Query Decomposition Reasoning: Single information need; no pronouns or ambiguous references.
Original query: 'what is this document about?' (Contextual: 'what is this document about?')
Decomposed into 1 sub-queries: ['what is this document about?']
--- Only one sub-query after decomposition; using direct retrieval path ---
LanceDB connection established at: ./lancedb
--- Performing Retrieval for query: 'what is this document about?' on table 'text_pages_2eb70151-f2df-4d1d-b594-681979c8a5f6' ---
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
2025-09-06 14:09:40,405 | INFO | rag-system | Top 20 results:
2025-09-06 14:09:40,405 | INFO | rag-system | chunk_id score preview
2025-09-06 14:09:40,405 | INFO | rag-system | ------------------------------
2025-09-06 14:09:40,405 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context highlights that the form for the final…
2025-09-06 14:09:40,405 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context highlights the GOST compensation…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summarizes the Finance Act 2024's…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The chunk outlines the proper officer's role in…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The local context highlights digital signature…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The chunk discusses registration periods,…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summary is about the Authority for…
2025-09-06 14:09:40,406 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summary is: The chunk discusses the…
2025-09-06 14:09:40,407 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The context summary is about sharing details under…
2025-09-06 14:09:40,407 | INFO | rag-system | a4e76ee3-6c3 0.000 Context: The chunk discusses regulations for handling…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 7056.611 Context: The context summarizes the specific goods listed…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 7536.421 Context: The context summarizes that electronic…
2025-09-06 14:09:40,407 | INFO | rag-system | a4e76ee3-6c3 7635.182 Context: The specific chunk discusses the Screening…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 7964.705 Context: The context summary is: A company transferring…
2025-09-06 14:09:40,407 | INFO | rag-system | f660d12c-80f 8244.507 Context: The context summarizes that the composition tax…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8336.688 Context: The context outlines the structure and composition…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8365.038 Context: The chunk discusses rectifying tax details under…
2025-09-06 14:09:40,408 | INFO | rag-system | f660d12c-80f 8530.757 Context: The context summary is about refund procedures for…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8545.204 Context: The chunk clarifies that the aggregate value of…
2025-09-06 14:09:40,408 | INFO | rag-system | a4e76ee3-6c3 8654.028 Context: The chunk discusses the Central Government's…
Retrieved 20 documents.
🔧 Initialising Answer.AI ColBERT reranker (answerdotai/answerai-colbert-small-v1) via rerankers lib…
Loading ColBERTRanker model answerdotai/answerai-colbert-small-v1 (this message can be suppressed by setting verbose=0)
No device set
Using device cuda
No dtype set
Using dtype torch.float32
Loading model answerdotai/answerai-colbert-small-v1, this might take a while...
tokenizer_config.json: 1.24kB [00:00, 11.3MB/s]
vocab.txt: 232kB [00:00, 19.1MB/s]
tokenizer.json: 711kB [00:00, 103MB/s]
special_tokens_map.json: 100%|██████████| 695/695 [00:00<00:00, 15.6MB/s]
config.json: 100%|██████████| 702/702 [00:00<00:00, 13.6MB/s]
model.safetensors: 100%|██████████| 134M/134M [00:09<00:00, 13.7MB/s]
Linear Dim set to: 96 for downcasting
✅ AI reranker initialized successfully.
--- Reranking top 20 docs with AI model... ---
❌ Sample query failed: CUDA out of memory. Tried to allocate 16.00 MiB. GPU 0 has a total capacity of 23.65 GiB of which 12.50 MiB is free. Process 38427 has 17.54 GiB memory in use. Including non-PyTorch memory, this process has 1.75 GiB memory in use. Process 128744 has 4.33 GiB memory in use. Of the allocated memory 1.27 GiB is allocated by PyTorch, and 27.98 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
==================================================
🏥 Health Check Complete: 5/6 checks passed
System mostly healthy with minor issues
```
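The sample-query failure is an allocator-level CUDA OOM: two other processes already hold 17.54 GiB + 4.33 GiB of the 24 GB card, so even a 16 MiB allocation fails. The error message itself suggests `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`, which must be set before CUDA is initialised; falling back to CPU when the GPU has no headroom is a further mitigation I'd expect to help here (the 512 MiB threshold below is an illustrative assumption, not a value from the project):

```python
import os

# Must be set before the first CUDA allocation (ideally before importing torch),
# as recommended in the OOM message above.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

def pick_device(min_free_bytes: int = 512 * 1024**2) -> str:
    """Return 'cuda' only if the GPU has enough free memory, else 'cpu'."""
    try:
        import torch
        if torch.cuda.is_available():
            free, _total = torch.cuda.mem_get_info()  # (free, total) in bytes
            if free > min_free_bytes:
                return "cuda"
    except ImportError:
        pass  # torch not installed: CPU is the only option
    return "cpu"
```

Freeing the other processes on GPU 0 (e.g. the 17.5 GiB one, visible via `nvidia-smi`) would of course also resolve the immediate failure.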
🔧 Configuration
- Deployment method: Direct Python
- Models used: qwen3:0.6b, qwen3:8b
- Document types: PDF