[BUG] input_files (PDFFile) are passed as base64 via read_file tool, causing context overflow and inconsistent LLM behavior #5930

### Description

When using `input_files` with `PDFFile` (or `File`), CrewAI does not appear to pass the file to the provider as a native file input.

Instead, the agent reads the file through the `read_file` tool, which returns the content as a binary/base64 representation. That output then ends up in the agent execution context.

As a result:
- The PDF is effectively treated as inline binary data (base64)
- The LLM context becomes extremely large
- Responses become inconsistent or fail due to context overflow
- The same file is re-processed during agent execution via tools

This makes `PDFFile` unreliable for large or even medium-sized documents.
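
To give a sense of scale, here is a rough sketch of how large a PDF becomes once it is inlined as base64. The function name `base64_overhead` and the 4-characters-per-token ratio are assumptions for illustration only; real tokenizers will differ, and base64 text likely tokenizes worse than ordinary prose.

```python
import base64
import math


def base64_overhead(raw_size_bytes: int, chars_per_token: float = 4.0) -> dict:
    """Estimate the prompt cost of inlining a binary file as base64.

    chars_per_token is a rough heuristic (an assumption), not a tokenizer.
    """
    # Base64 encodes every 3 input bytes as 4 output characters (with padding).
    encoded_chars = 4 * math.ceil(raw_size_bytes / 3)
    return {
        "raw_bytes": raw_size_bytes,
        "base64_chars": encoded_chars,
        "approx_tokens": math.ceil(encoded_chars / chars_per_token),
    }


# Sanity check of the formula against the real encoder on dummy bytes.
sample = bytes(1000)
assert len(base64.b64encode(sample)) == base64_overhead(1000)["base64_chars"]

# A hypothetical 2 MiB PDF, not the file from this report.
estimate = base64_overhead(2 * 1024 * 1024)
print(estimate)
```

Even under this optimistic ratio, a file of a few megabytes exceeds the context window of most current models before the agent has done any work.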

### Steps to Reproduce

1. Create a minimal CrewAI setup:

```python
from crewai import Agent, Task, Crew
from crewai_files import PDFFile

agent = Agent(
    role="Document Analyst",
    goal="Extract structured information from PDFs",
    backstory="Expert in document analysis",
    llm="gpt-4o-mini",
)

task = Task(
    description="""
    Read the PDF document {doc}
    Extract the main sections and summarize them precisely.
    """,
    expected_output="Structured list of sections",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task], verbose=True)

result = crew.kickoff(
    input_files={
        "doc": PDFFile(source="./src/test_crewai_files/pdfs/sample.pdf")
    }
)

print(result)
```

2. Run the flow with `crewai run`.
3. Observe the agent execution logs (with `verbose=True`).

### Expected behavior

The PDF should be:
- either streamed or parsed externally before being sent to the LLM
- or converted into structured text chunks (not raw base64)

The agent should receive:
- structured text
- or extracted segments
- NOT a raw base64 PDF representation
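
As an illustration of the "structured text chunks" option, here is a minimal sketch that splits already-extracted text into bounded chunks on paragraph boundaries. It assumes the text was extracted from the PDF beforehand by some external parser; `TextChunk` and `chunk_text` are hypothetical names, not part of CrewAI or `crewai_files`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextChunk:
    index: int
    text: str


def chunk_text(text: str, max_chars: int = 2000) -> list[TextChunk]:
    """Split text into chunks of at most max_chars, preferring paragraph boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        # Hard-split any single paragraph that is longer than the limit.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The paragraph does not fit: close the current chunk, start a new one.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return [TextChunk(i, c) for i, c in enumerate(chunks)]


chunks = chunk_text("aaa\n\nbbb\n\nccccccccccc", max_chars=8)
for chunk in chunks:
    print(chunk.index, repr(chunk.text))
```

Each chunk stays under a known size, so the amount of text an agent receives is bounded regardless of how large the source PDF is.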

### Screenshots/Code snippets

Same snippet as in Steps to Reproduce above.

### Operating System

Ubuntu 24.04

### Python Version

3.12

### crewAI Version

v.1.14.4

### crewAI Tools Version

v.1.14.4

### Virtual Environment

Venv

### Evidence

#### Verbose output evidence

```text
Tool Execution Started (#3)

Tool: read_file
Args: {'file_name': 'doc'}

Tool Execution Completed (#3)

Tool Completed
Tool: read_file
Output: [Binary file: sample.pdf (application/pdf)]
Base64:
...
```

### Possible Solution

None

### Additional context

This issue blocks any production usage of PDF ingestion in multi-step CrewAI flows, because:

- context size grows linearly with file size
- multiple tasks re-trigger file expansion
- sequential workflows amplify token explosion

A safer architecture would:

- load the file once
- extract a structured representation once
- reuse the extracted representation across tasks without re-injecting the raw binary
