Per-field extraction cap to prevent truncation-based bypass


## Problem

The defense-in-depth analyzers (PR #2472) apply a single 30,000-character budget across all extracted fields. Fields are processed in order: `tool_name`, then `tool_call.name`, then `tool_call.arguments` (exec segments); `thought` entries, then `reasoning_content`, then `summary` (text segments). An oversized early field starves later fields of scanning budget, making their content invisible to all downstream analyzers.

**Source:** `openhands-sdk/openhands/sdk/security/defense_in_depth/utils.py`, lines 75-141.

### Threat model

The primary threat at the action boundary is indirect prompt injection — attacker-controlled content (documents, web pages, tool output) processed by the LLM, which then emits tool calls with side effects.

`tool_name` on `ActionEvent` has no length or content validation at any layer — it is a bare `str` field with no `max_length` constraint (see `action.py:41`, `message.py:32`). When the LLM emits a tool name not in the registry, the raw string flows through to the ActionEvent (`agent.py:766`) which is emitted via `on_event` and reaches the security analyzer before the error path runs. For non-function-calling models (via `fn_call_converter.py:80`), the extraction regex `[^>]+` allows arbitrary-length strings with no cap. OpenAI's API constrains native function call names to 64 characters, but other providers and the non-function-calling path do not. Research confirms LLMs can be coaxed into calling tool names not in the registry (Answer.AI, Jan 2026).

### Proof of concept

```
tool_name = "A" * 29990
tool_call.arguments = '{"command": "rm -rf /"}'
```

Total content is under 30,000 characters, but `arguments` receives at most 10 characters of budget. The `rm -rf /` payload is silently truncated. `PatternSecurityAnalyzer` returns `LOW`.

**Exploitability:** For models using native function calling with OpenAI (64-char name limit), the attack requires a different provider or proxy that does not enforce name length. For non-function-calling models (fn_call_converter path), the `[^>]+` regex imposes no length constraint and the attack is directly exploitable via prompt injection. The ActionEvent is emitted and analyzed before the tool-not-found error fires.

### Decision rationale

Per-field cap is the simplest fix that removes the structural weakness without changing the overall budget or any downstream behavior. The alternative — validating `tool_name` length upstream — would also be correct but is a separate concern (input validation vs. extraction resilience). The 50% cap ensures that even if an upstream validation gap is discovered later, the extraction pipeline does not have a structural weakness. The heuristic guarantees that if one pre-argument field is adversarially large, `arguments` still gets at least 15,000 characters. If both `tool_name` and `tool_call.name` simultaneously hit the cap, `arguments` would get 0 from the global budget — but this requires two adversarially large fields, which is harder to achieve than one.

## Proposed fix

In `utils.py`, add `_FIELD_CAP = _EXTRACT_HARD_CAP // 2` and enforce it in the `_add` helper in both `_extract_exec_segments()` and `_extract_text_segments()`:

```python
_FIELD_CAP = _EXTRACT_HARD_CAP // 2  # No single field > 50% of total budget

def _add(text: str, field_cap: int = _FIELD_CAP) -> None:
    nonlocal total
    remaining = min(_EXTRACT_HARD_CAP - total, field_cap)
    if remaining <= 0:
        return
    if len(text) > remaining:
        text = text[:remaining]
    segments.append(text)
    total += len(text)
```

Approximately 5 lines per extraction function. No changes to normalization or pattern matching.

### Test plan

- Oversized `tool_name` does not starve `tool_call.arguments`
- Oversized `tool_call.name` does not starve `tool_call.arguments`
- Oversized `thought` entries do not starve `summary`
- Total cap still honored when all fields are large
- Small fields extracted in full (no unnecessary truncation)
- End-to-end: oversized `tool_name` + malicious `arguments` returns `HIGH`
- Update `test_payload_past_hard_cap` xfail (field-starvation variant should now pass; genuine total-cap exceedance remains documented)

### Context

Identified in adversarial security review (2026-04-03) of PR #2472. Priority 1 in the red-team fix list.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Per-field extraction cap to prevent truncation-based bypass #2706

Problem

Threat model

Proof of concept

Decision rationale

Proposed fix

Test plan

Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Per-field extraction cap to prevent truncation-based bypass #2706

Description

Problem

Threat model

Proof of concept

Decision rationale

Proposed fix

Test plan

Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions