Description
When using input_files with PDFFile (or File), CrewAI does not appear to handle the file as a native file input at the provider level.
Instead, the file is processed via the read_file tool and its content is returned as a binary/base64 representation. This content is then indirectly injected into the agent execution context.
As a result:
- The PDF is effectively treated as inline binary data (base64)
- The LLM context becomes extremely large
- Responses become inconsistent or fail due to context overflow
- The same file is re-processed during agent execution via tools
This makes PDFFile unreliable for large or even medium-sized documents.
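For a rough sense of the overhead: base64 encodes every 3 bytes as 4 ASCII characters, so the inlined payload is about one third larger than the file itself, before tokenization. A stdlib-only sketch (the helper name `base64_length` is illustrative and not part of CrewAI):

```python
import base64
import math


def base64_length(num_bytes: int) -> int:
    """Number of characters in the padded base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


payload = bytes(300_000)  # stand-in for a ~300 KB PDF
encoded = base64.b64encode(payload)

# The formula matches what the standard library actually produces.
assert len(encoded) == base64_length(len(payload))
print(len(payload), len(encoded))
```

How many tokens that many characters cost depends on the model's tokenizer, so this only bounds the character count, not the token count.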
Steps to Reproduce
- Create a minimal CrewAI setup:
from crewai import Agent, Task, Crew
from crewai_files import PDFFile

agent = Agent(
    role="Document Analyst",
    goal="Extract structured information from PDFs",
    backstory="Expert in document analysis",
    llm="gpt-4o-mini",
)

task = Task(
    description="""
    Read the PDF document {doc}
    Extract the main sections and summarize them precisely.
    """,
    expected_output="Structured list of sections",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task], verbose=True)

result = crew.kickoff(
    input_files={
        "doc": PDFFile(source="./src/test_crewai_files/pdfs/sample.pdf")
    }
)
print(result)
- Run the flow:
crewai run
- Observe agent execution logs with verbose=True
Expected behavior
The PDF should be:
- either streamed or parsed externally before being sent to the LLM
- or converted into structured text chunks (not raw base64)
The agent should receive:
- structured text
- or extracted segments
- NOT a raw base64 PDF representation
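To illustrate the "structured text chunks" option, here is a minimal sketch that assumes the PDF text has already been extracted by some external parser (not shown). The function `chunk_text` and its `max_chars` parameter are hypothetical and not part of CrewAI; it packs paragraphs into bounded chunks so that no single message carries the whole document:

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split any single paragraph that exceeds the limit on its own.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


sample_text = "a" * 10 + "\n\n" + "b" * 10 + "\n\n" + "c" * 25
chunks = chunk_text(sample_text, max_chars=22)
```

A character limit is used here only because it needs no tokenizer; a real implementation would more likely budget in tokens.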
Screenshots/Code snippets
from crewai import Agent, Task, Crew
from crewai_files import PDFFile

agent = Agent(
    role="Document Analyst",
    goal="Extract structured information from PDFs",
    backstory="Expert in document analysis",
    llm="gpt-4o-mini",
)

task = Task(
    description="""
    Read the PDF document {doc}
    Extract the main sections and summarize them precisely.
    """,
    expected_output="Structured list of sections",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task], verbose=True)

result = crew.kickoff(
    input_files={
        "doc": PDFFile(source="./src/test_crewai_files/pdfs/sample.pdf")
    }
)
print(result)
Operating System
Ubuntu 24.04
Python Version
3.12
crewAI Version
v.1.14.4
crewAI Tools Version
v.1.14.4
Virtual Environment
Venv
Evidence
Verbose output evidence
Tool Execution Started (#3)
Tool: read_file
Args: {'file_name': 'doc'}
Tool Execution Completed (#3)
Tool Completed
Tool: read_file
Output: [Binary file: sample.pdf (application/pdf)]
Base64:
...
Possible Solution
None
Additional context
This issue blocks any production usage of PDF ingestion in multi-step CrewAI flows, because:
- context size grows linearly with file size
- multiple tasks re-trigger file expansion
- sequential workflows amplify token explosion
A safer architecture would:
- load the file once
- extract a structured representation once
- reuse the extracted representation across tasks without re-injecting the raw binary
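The load-once / extract-once / reuse idea could look roughly like the sketch below. It is an assumption about one possible design, not a description of CrewAI internals: `ExtractionCache` and `fake_extractor` are hypothetical names, and the extractor is any bytes-to-text callable (a real one would wrap a PDF parser).

```python
import tempfile
from pathlib import Path
from typing import Callable


class ExtractionCache:
    """Read and extract each distinct file at most once, then reuse the text."""

    def __init__(self, extractor: Callable[[bytes], str]) -> None:
        self._extractor = extractor  # hypothetical: any bytes -> text parser
        self._cache: dict[tuple[str, int, int], str] = {}

    def get_text(self, path: Path) -> str:
        resolved = Path(path).resolve()
        stat = resolved.stat()
        # Key on path, mtime and size so a modified file is extracted again.
        key = (str(resolved), stat.st_mtime_ns, stat.st_size)
        if key not in self._cache:
            self._cache[key] = self._extractor(resolved.read_bytes())
        return self._cache[key]


# Usage with a stand-in extractor that records how often it is called.
calls: list[int] = []


def fake_extractor(data: bytes) -> str:
    calls.append(len(data))
    return data.decode("utf-8", errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"Section 1\n\nSection 2")
    cache = ExtractionCache(fake_extractor)
    first = cache.get_text(sample)   # extractor runs here
    second = cache.get_text(sample)  # served from the cache
```

Each task would then receive the cached text (or chunks of it) rather than triggering another read of the binary.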