
Running research on local documents doesn't work. #1534

@aclifton314

Description

Describe the bug
gpt-researcher does not run research over local documents.

To Reproduce
Steps to reproduce the behavior:

  1. Clone gpt-researcher
  2. Create a directory in the root called summarize.
  3. Place a pdf paper in that directory.
  4. Set the environment variable DOC_PATH in the .env file at the project root (DOC_PATH=./summarize).
  5. Start gpt-researcher, ask the model to summarize the paper, select My Documents in the sources drop-down menu, and run the researcher. (A quick sanity check of this setup is sketched below.)
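
For reference, a minimal sanity check of this setup (not part of gpt-researcher, plain standard-library Python) that confirms the directory DOC_PATH points to actually exists and contains the PDF, resolved against the directory the server is started from:

import os
from pathlib import Path

# Resolve DOC_PATH the way a relative value such as "./summarize" is resolved:
# against the current working directory of the server process.
doc_path = Path(os.getenv("DOC_PATH", "./summarize")).expanduser().resolve()
print("DOC_PATH resolves to:", doc_path)
print("Directory exists:", doc_path.is_dir())
if doc_path.is_dir():
    print("Contents:", [p.name for p in doc_path.iterdir()])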

Expected behavior
Output a summary of the paper in the local documents directory.

Desktop (please complete the following information):

  • OS: Ubuntu 24.04.3 LTS via WSL
  • Browser: Firefox 144.0
  • Version: commit 906e94f

Additional context
I have tried the following in the .env file, individually:

DOC_PATH=./summarize
DOC_PATH="./summarize"
DOC_PATH=/path/to/gpt-researcher/summarize
DOC_PATH="/path/to/gpt-researcher/summarize"

and the following in my .bashrc alone:

DOC_PATH="/path/to/gpt-researcher/summarize"
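
For what it's worth, here is a small hypothetical check (it assumes python-dotenv is installed in the project's environment). Note that a bare DOC_PATH=... line in .bashrc is only a shell variable unless it is prefixed with export, so it never reaches the server process, and load_dotenv() by default does not override a variable that is already set in the environment:

import os
from dotenv import load_dotenv  # assumed to be available alongside gpt-researcher

print("From the shell environment:", os.getenv("DOC_PATH"))  # None unless `export`ed in .bashrc
load_dotenv()  # reads .env from the current working directory; does not override existing values
print("After loading .env:", os.getenv("DOC_PATH"))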

Log

INFO:     Will watch for changes in these directories: ['/home/aclifton/gpt-researcher']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [12851] using StatReload
INFO:     Started server process [12853]
INFO:     Waiting for application startup.
2025-10-16 13:16:00,619 - backend.server.app - INFO - Frontend mounted from: /home/aclifton/gpt-researcher/frontend
2025-10-16 13:16:00,620 - backend.server.app - INFO - Static assets mounted from: /home/aclifton/gpt-researcher/frontend/static
2025-10-16 13:16:00,620 - backend.server.app - INFO - Research API started - no database required
INFO:     Application startup complete.
INFO:     127.0.0.1:55478 - "GET / HTTP/1.1" 200 OK
DEBUG:    = connection is CONNECTING
DEBUG:    < GET /ws HTTP/1.1
DEBUG:    < host: 127.0.0.1:8000
DEBUG:    < user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:144.0) Gecko/20100101 Firefox/144.0
DEBUG:    < accept: */*
DEBUG:    < accept-language: en-US,en;q=0.5
DEBUG:    < accept-encoding: gzip, deflate, br, zstd
DEBUG:    < sec-websocket-version: 13
DEBUG:    < origin: http://127.0.0.1:8000
DEBUG:    < sec-websocket-extensions: permessage-deflate
DEBUG:    < sec-websocket-key: dZk5vmjHKqZgC3CLUp9Rkw==
DEBUG:    < connection: keep-alive, Upgrade
DEBUG:    < cookie: conversationHistory=%5B%7B%22prompt%22%3A%22Summarize%20the%20paper.%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760641578_Summarize%2520the%2520paper.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760641578_Summarize%2520the%2520paper.md%22%2C%22json%22%3A%22outputs%2Ftask_1760641578_Summarize%20the%20paper.json%22%7D%2C%22timestamp%22%3A%222025-10-16T19%3A07%3A02.984Z%22%7D%2C%7B%22prompt%22%3A%22Summarize%20the%20paper.%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760641429_Summarize%2520the%2520paper.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760641429_Summarize%2520the%2520paper.md%22%2C%22json%22%3A%22outputs%2Ftask_1760641429_Summarize%20the%20paper.json%22%7D%2C%22timestamp%22%3A%222025-10-16T19%3A04%3A15.445Z%22%7D%2C%7B%22prompt%22%3A%22summarize%20the%20paper.%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760641336_summarize%2520the%2520paper.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760641336_summarize%2520the%2520paper.md%22%2C%22json%22%3A%22outputs%2Ftask_1760641336_summarize%20the%20paper.json%22%7D%2C%22timestamp%22%3A%222025-10-16T19%3A02%3A43.479Z%22%7D%2C%7B%22prompt%22%3A%22Summarize%20the%20paper.%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760641120_Summarize%2520the%2520paper.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760641120_Summarize%2520the%2520paper.md%22%2C%22json%22%3A%22outputs%2Ftask_1760641120_Summarize%20the%20paper.json%22%7D%2C%22timestamp%22%3A%222025-10-16T18%3A59%3A20.735Z%22%7D%2C%7B%22prompt%22%3A%22Summarize%20the%20paper.%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760641022_Summarize%2520the%2520paper.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760641022_Summarize%2520the%2520paper.md%22%2C%22json%22%3A%22outputs%2Ftask_1760641022_Summarize%20the%20paper.json%22%7D%2C%22timestamp%22%3A%222025-10-16T18%3A57%3A34.128Z%22%7D%2C%7B%22prompt%22%3A%22How%20does%20this%20paper%20address%20the%20scheduling%20problem%3F%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760640560_How%2520does%2520this%2520paper%2520address%2520the%2520scheduling%2520p.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760640560_How%2520does%2520this%2520paper%2520address%2520the%2520scheduling%2520p.md%22%2C%22json%22%3A%22outputs%2Ftask_1760640560_How%20does%20this%20paper%20address%20the%20scheduling%20problem.json%22%7D%2C%22timestamp%22%3A%222025-10-16T18%3A50%3A23.465Z%22%7D%2C%7B%22prompt%22%3A%22Please%20summarize%20the%20document.%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760640283_Please%2520summarize%2520the%2520document.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760640283_Please%2520summarize%2520the%2520document.md%22%2C%22json%22%3A%22outputs%2Ftask_1760640283_Please%20summarize%20the%20document.json%22%7D%2C%22timestamp%22%3A%222025-10-16T18%3A45%3A09.046Z%22%7D%2C%7B%22prompt%22%3A%22Please%20summarize%20the%20papers%20in%20the%20directory.%22%2C%22links%22%3A%7B%22pdf%22%3A%22%22%2C%22docx%22%3A%22outputs%2Ftask_1760640028_Please%2520summarize%2520the%2520papers%2520in%2520the%2520directory.docx%22%2C%22md%22%3A%22outputs%2Ftask_1760640028_Please%2520summarize%2520the%2520papers%2520in%2520the%2520directory.md%22%2C%22json%22%3A%22outputs%2Ftask_1760640028_Please%20summarize%20the%20papers%20in%20the%20directory.json%22%7D%2C%22timestamp%22%3A%222025-10-16T18%3A41%3A19.456Z%22%7D%5D
DEBUG:    < sec-fetch-dest: empty
DEBUG:    < sec-fetch-mode: websocket
DEBUG:    < sec-fetch-site: same-origin
DEBUG:    < pragma: no-cache
DEBUG:    < cache-control: no-cache
DEBUG:    < upgrade: websocket
INFO:     127.0.0.1:55480 - "WebSocket /ws" [accepted]
DEBUG:    > HTTP/1.1 101 Switching Protocols
DEBUG:    > Upgrade: websocket
DEBUG:    > Connection: Upgrade
DEBUG:    > Sec-WebSocket-Accept: v3NZQES+m9cZUc384Hl4T24o4s8=
DEBUG:    > Sec-WebSocket-Extensions: permessage-deflate
DEBUG:    > date: Thu, 16 Oct 2025 19:16:14 GMT
DEBUG:    > server: uvicorn
INFO:     connection open
DEBUG:    = connection is OPEN
DEBUG:    < TEXT 'start {"task":"Summarize the paper.","report_ty...nt","query_domains":[]}' [169 bytes]
2025-10-16 13:16:15,341 - server.server_utils - INFO - Received WebSocket message: start {"task":"Summarize the paper.","report_type"...
2025-10-16 13:16:15,341 - server.server_utils - INFO - Processing start command
DEBUG:    > TEXT '{"query":"Summarize the paper.","sources":[],"context":[],"report":""}' [70 bytes]
2025-10-16 13:16:42,441 - httpx - INFO - HTTP Request: POST https://vllm.cessna.stratagemgroup.run/v1/chat/completions "HTTP/1.1 200 OK"
⚠️ Error in reading JSON and failed to repair with json_repair: 'str' object has no attribute 'get'
⚠️ LLM Response: `I’m happy to help summarize the paper, but I’ll need a bit more information first. Could you please provide:

1. The title (or a link) of the paper you’d like summarized, **or**
2. A brief description of its subject area (e.g., computer science, biology, economics, etc.) and any specific sections you’re most interested in.

Once I have those details, I can select the appropriate research‑assistant “server” and give you a clear, well‑structured summary.`
No JSON found in the string. Falling back to Default Agent.
2025-10-16 13:16:42,719 - research - INFO - Starting research for query: Summarize the paper.
2025-10-16 13:16:42,719 - research - INFO - Active retrievers: ['TavilySearch']
INFO:     [13:16:42] 🔍 Starting the research task for 'Summarize the paper.'...
DEBUG:    > TEXT '{"type":"logs","content":"starting_research","o...\'...","metadata":null}' [134 bytes]
INFO:     [13:16:42] Default Agent
DEBUG:    > TEXT '{"type":"logs","content":"agent_generated","out...Agent","metadata":null}' [84 bytes]
2025-10-16 13:16:42,724 - research - INFO - Using local search
2025-10-16 13:16:42,825 - research - INFO - Loaded 7 documents
2025-10-16 13:16:42,825 - research - INFO - Starting web search for query: Summarize the paper.
INFO:     [13:16:42] 🌐 Browsing the web to learn more about the task: Summarize the paper....
DEBUG:    > TEXT '{"type":"logs","content":"planning_research","o...r....","metadata":null}' [148 bytes]
2025-10-16 13:16:43,310 - research - INFO - Initial search results obtained: 10 results
INFO:     [13:16:43] 🤔 Planning the research strategy and subtasks...
DEBUG:    > TEXT '{"type":"logs","content":"planning_research","o...ks...","metadata":null}' [124 bytes]
2025-10-16 13:16:49,008 - httpx - INFO - HTTP Request: POST https://vllm.cessna.stratagemgroup.run/v1/chat/completions "HTTP/1.1 200 OK"
2025-10-16 13:16:49,013 - research - INFO - Research outline planned: ['objective guidelines for writing a concise research paper summary 2025', 'how to create an unbiased academic summary of a scientific article', 'comparison of AI tools for objectively summarizing research papers October 2025']
2025-10-16 13:16:49,013 - research - INFO - Generated sub-queries: ['objective guidelines for writing a concise research paper summary 2025', 'how to create an unbiased academic summary of a scientific article', 'comparison of AI tools for objectively summarizing research papers October 2025']
INFO:     [13:16:49] 🗂️ I will conduct my research based on the following queries: ['objective guidelines for writing a concise research paper summary 2025', 'how to create an unbiased academic summary of a scientific article', 'comparison of AI tools for objectively summarizing research papers October 2025', 'Summarize the paper.']...
DEBUG:    > TEXT '{"type":"logs","content":"subqueries","output":...Summarize the paper."]}' [631 bytes]
INFO:     [13:16:49]
🔍 Running research for 'objective guidelines for writing a concise research paper summary 2025'...
DEBUG:    > TEXT '{"type":"logs","content":"running_subquery_rese...\'...","metadata":null}' [184 bytes]
INFO:     [13:16:49] 📚 Getting relevant content based on query: objective guidelines for writing a concise research paper summary 2025...
DEBUG:    > TEXT '{"type":"logs","content":"fetching_query_conten...25...","metadata":null}' [197 bytes]
INFO:     [13:16:49]
🔍 Running research for 'how to create an unbiased academic summary of a scientific article'...
DEBUG:    > TEXT '{"type":"logs","content":"running_subquery_rese...\'...","metadata":null}' [180 bytes]
INFO:     [13:16:49] 📚 Getting relevant content based on query: how to create an unbiased academic summary of a scientific article...
DEBUG:    > TEXT '{"type":"logs","content":"fetching_query_conten...le...","metadata":null}' [193 bytes]
INFO:     [13:16:49]
🔍 Running research for 'comparison of AI tools for objectively summarizing research papers October 2025'...
DEBUG:    > TEXT '{"type":"logs","content":"running_subquery_rese...\'...","metadata":null}' [193 bytes]
INFO:     [13:16:49] 📚 Getting relevant content based on query: comparison of AI tools for objectively summarizing research papers October 2025...
DEBUG:    > TEXT '{"type":"logs","content":"fetching_query_conten...25...","metadata":null}' [206 bytes]
INFO:     [13:16:49]
🔍 Running research for 'Summarize the paper.'...
DEBUG:    > TEXT '{"type":"logs","content":"running_subquery_rese...\'...","metadata":null}' [134 bytes]
INFO:     [13:16:49] 📚 Getting relevant content based on query: Summarize the paper....
DEBUG:    > TEXT '{"type":"logs","content":"fetching_query_conten...r....","metadata":null}' [147 bytes]
2025-10-16 13:16:49,559 - httpx - INFO - HTTP Request: POST https://vllm.cessna.stratagemgroup.run/v1/embeddings "HTTP/1.1 200 OK"
2025-10-16 13:16:49,562 - research - ERROR - Error processing sub-query objective guidelines for writing a concise research paper summary 2025: No embedding data received
Traceback (most recent call last):
  File "/home/aclifton/gpt-researcher/gpt_researcher/skills/researcher.py", line 504, in _process_sub_query
    web_context = await self.researcher.context_manager.get_similar_content_by_query(sub_query, scraped_data)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/gpt_researcher/skills/context_manager.py", line 29, in get_similar_content_by_query
    return await context_compressor.async_get_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/gpt_researcher/context/compression.py", line 76, in async_get_context
    relevant_docs = await asyncio.to_thread(compressed_docs.invoke, query, **self.kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/langchain_core/retrievers.py", line 263, in invoke
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/langchain/retrievers/contextual_compression.py", line 46, in _get_relevant_documents
    compressed_docs = self.base_compressor.compress_documents(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/langchain/retrievers/document_compressors/base.py", line 40, in compress_documents
    documents = _transformer.compress_documents(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/langchain/retrievers/document_compressors/embeddings_filter.py", line 78, in compress_documents
    embedded_documents = _get_embeddings_from_stateful_docs(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/langchain_community/document_transformers/embeddings_redundant_filter.py", line 71, in _get_embeddings_from_stateful_docs
    embedded_documents = embeddings.embed_documents(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/langchain_openai/embeddings/base.py", line 591, in embed_documents
    return self._get_len_safe_embeddings(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/langchain_openai/embeddings/base.py", line 479, in _get_len_safe_embeddings
    response = self.client.create(
               ^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/openai/resources/embeddings.py", line 132, in create
    return self._post(
           ^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1052, in request
    return self._process_response(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1141, in _process_response
    return api_response.parse()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/openai/_response.py", line 325, in parse
    parsed = self._options.post_parser(parsed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/openai/resources/embeddings.py", line 116, in parser
    raise ValueError("No embedding data received")
ValueError: No embedding data received
INFO:     [13:16:49] ❌ Error processing 'objective guidelines for writing a concise research paper summary 2025': No embedding data received
DEBUG:    > TEXT '{"type":"logs","content":"subquery_error","outp...eived","metadata":null}' [191 bytes]
2025-10-16 13:16:49,570 - httpx - INFO - HTTP Request: POST https://vllm.cessna.stratagemgroup.run/v1/embeddings "HTTP/1.1 200 OK"
2025-10-16 13:16:49,571 - research - ERROR - Error processing sub-query how to create an unbiased academic summary of a scientific article: No embedding data received
[traceback identical to the one above, ending in: ValueError: No embedding data received]
INFO:     [13:16:49] ❌ Error processing 'how to create an unbiased academic summary of a scientific article': No embedding data received
DEBUG:    > TEXT '{"type":"logs","content":"subquery_error","outp...eived","metadata":null}' [187 bytes]
2025-10-16 13:16:49,605 - httpx - INFO - HTTP Request: POST https://vllm.cessna.stratagemgroup.run/v1/embeddings "HTTP/1.1 200 OK"
2025-10-16 13:16:49,605 - httpx - INFO - HTTP Request: POST https://vllm.cessna.stratagemgroup.run/v1/embeddings "HTTP/1.1 200 OK"
2025-10-16 13:16:49,607 - research - ERROR - Error processing sub-query Summarize the paper.: No embedding data received
[traceback identical to the one above, ending in: ValueError: No embedding data received]
INFO:     [13:16:49] ❌ Error processing 'Summarize the paper.': No embedding data received
DEBUG:    > TEXT '{"type":"logs","content":"subquery_error","outp...eived","metadata":null}' [141 bytes]
2025-10-16 13:16:49,609 - research - ERROR - Error processing sub-query comparison of AI tools for objectively summarizing research papers October 2025: No embedding data received
[traceback identical to the one above, ending in: ValueError: No embedding data received]
INFO:     [13:16:49] ❌ Error processing 'comparison of AI tools for objectively summarizing research papers October 2025': No embedding data received
DEBUG:    > TEXT '{"type":"logs","content":"subquery_error","outp...eived","metadata":null}' [200 bytes]
2025-10-16 13:16:49,612 - research - INFO - Gathered context from 4 sub-queries
INFO:     [13:16:49] Finalized research step.
💸 Total Research Costs: $0.014252200000000003
DEBUG:    > TEXT '{"type":"logs","content":"research_step_finaliz...00003","metadata":null}' [153 bytes]
2025-10-16 13:16:49,613 - research - INFO - Research completed. Context size: 2
INFO:     [13:16:49] ✍️ Writing report for 'Summarize the paper.'...
DEBUG:    > TEXT '{"type":"logs","content":"writing_report","outp...\'...","metadata":null}' [121 bytes]
2025-10-16 13:16:49,686 - httpx - INFO - HTTP Request: POST https://vllm.cessna.stratagemgroup.run/v1/chat/completions "HTTP/1.1 200 OK"
DEBUG:    % sending keepalive ping
DEBUG:    > PING b8 5c 0d bf [binary, 4 bytes]
DEBUG:    < PONG b8 5c 0d bf [binary, 4 bytes]
DEBUG:    % received keepalive pong
DEBUG:    > TEXT '{"type":"report","output":"I’m unable to provid...m the document.\\n\\n"}' [320 bytes]
DEBUG:    > TEXT '{"type":"report","output":"If you can share the...rements you outlined."}' [223 bytes]
INFO:     [13:16:56] 📝 Report written for 'Summarize the paper.'
DEBUG:    > TEXT '{"type":"logs","content":"report_written","outp...er.\'","metadata":null}' [116 bytes]
Error in converting Markdown to PDF: [Errno 2] No such file or directory: './styles/pdf_styles.css'
Report written to outputs/task_1760642175_Summarize the paper.docx
DEBUG:    > TEXT '{"type":"path","output":{"pdf":"","docx":"outpu...arize the paper.json"}}' [213 bytes]
DEBUG:    % sending keepalive ping
DEBUG:    > PING 22 cc c1 d5 [binary, 4 bytes]
DEBUG:    < PONG 22 cc c1 d5 [binary, 4 bytes]
DEBUG:    % received keepalive pong
DEBUG:    % sending keepalive ping
DEBUG:    > PING cb 6d b0 7d [binary, 4 bytes]
DEBUG:    < PONG cb 6d b0 7d [binary, 4 bytes]
DEBUG:    % received keepalive pong
^CINFO:     Shutting down
DEBUG:    ! failing connection with code 1012
DEBUG:    = connection is CLOSING
DEBUG:    > CLOSE 1012 (service restart) [2 bytes]
DEBUG:    = connection is CLOSED
DEBUG:    x half-closing TCP connection
2025-10-16 13:17:33,692 - server.server_utils - ERROR - WebSocket error: (1012, None)
Traceback (most recent call last):
  File "/home/aclifton/gpt-researcher/backend/server/server_utils.py", line 273, in handle_websocket_communication
    data = await websocket.receive_text()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/starlette/websockets.py", line 120, in receive_text
    self._raise_on_disconnect(message)
  File "/home/aclifton/gpt-researcher/.venv/lib/python3.12/site-packages/starlette/websockets.py", line 114, in _raise_on_disconnect
    raise WebSocketDisconnect(message["code"], message.get("reason"))
starlette.websockets.WebSocketDisconnect: (1012, None)

WebSocket error: (1012, None)
INFO:     connection closed
INFO:     Waiting for application shutdown.
2025-10-16 13:17:33,792 - backend.server.app - INFO - Research API shutting down
INFO:     Application shutdown complete.
INFO:     Finished server process [12853]
INFO:     Stopping reloader process [12851]
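
Reading the log: the seven local documents are loaded, but every sub-query then fails at the embeddings step with ValueError: No embedding data received, so no local context makes it into the report. A hypothetical probe of the embeddings endpoint from the log (the model name and API key below are placeholders, not values from this setup) can show whether the vLLM server itself is returning an empty data array:

from openai import OpenAI  # the same client the traceback goes through

client = OpenAI(base_url="https://vllm.cessna.stratagemgroup.run/v1", api_key="EMPTY")
resp = client.embeddings.create(model="your-embedding-model", input=["hello world"])
# The openai library raises "No embedding data received" when this list comes back empty.
print(len(resp.data), "embedding(s) returned")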
