This repository was archived by the owner on Nov 29, 2025. It is now read-only.

Enhance conversation pruning & tool compression #90

@westonbrown

Description


Why this feature?

Our pipeline compresses large tool results to stay within the context budgets of GPT-5, Moonshot, Polaris, and smaller LiteLLM models. Unlike the Strands SDK implementation (strands-agents/sdk-python#766 / strands-agents/docs#229), we have additional constraints: ToolRouterHook artifact offloads, Mem0 integration, and report generation all need consistent summaries. Some polish remains before pruning works seamlessly across the stack.

Goals

  • Surface the compression env knobs in Quick Start/docs so operators can tune compression per provider/model.
  • Emit structured logs/events when compression fires (original vs compressed size, thresholds) for observability.
  • Ensure ToolRouterHook, Mem0, and report builder all see the same trimmed content (no duplicate banners or mismatched artifacts).
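
The first goal could look roughly like the sketch below. The issue does not name the env variables, so `TOOL_RESULT_MAX_CHARS` and the banner format here are hypothetical placeholders, not the project's actual knobs:

```python
import os

# Hypothetical knob name; the issue elides the real env variable names.
MAX_TOOL_RESULT_CHARS = int(os.getenv("TOOL_RESULT_MAX_CHARS", "20000"))

def maybe_compress(result: str) -> str:
    """Trim a tool result to the configured budget, appending a banner
    so downstream consumers (ToolRouterHook, Mem0, reports) can see that
    compression happened."""
    if len(result) <= MAX_TOOL_RESULT_CHARS:
        return result
    kept = result[:MAX_TOOL_RESULT_CHARS]
    trimmed = len(result) - len(kept)
    return kept + f"\n[... compressed: {trimmed} chars trimmed ...]"
```

Documenting the default alongside the env override lets operators tune the budget per provider without code changes.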

Proposal

  1. Update the documentation and config samples with the pruning flow and env variables.
  2. Add CYBER_EVENT/log output whenever a tool result is compressed, including before/after sizes.
  3. Audit ToolRouterHook + report builder so artifact offloads and compression banners remain consistent.
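
Step 2 could be sketched as a structured log emitter along these lines. The event name, field names, and logger name are assumptions for illustration, not the project's actual schema:

```python
import json
import logging

logger = logging.getLogger("cyber.compression")  # hypothetical logger name

def emit_compression_event(tool_name: str, original: str,
                           compressed: str, threshold: int) -> dict:
    """Build and log a structured record whenever a tool result is
    compressed, capturing before/after sizes and the active threshold."""
    event = {
        "event": "tool_result_compressed",  # hypothetical event name
        "tool": tool_name,
        "original_chars": len(original),
        "compressed_chars": len(compressed),
        "threshold": threshold,
    }
    logger.info(json.dumps(event))
    return event
```

Emitting the record as JSON keeps it machine-parseable for whatever metrics pipeline consumes the logs.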

Testing

  • Extend the test suite to cover JSON summarization, env overrides, and logging hooks.
  • Run ops across different LiteLLM providers to confirm logs/metrics capture compression events and reports stay within context limits.
  • Verify tool-router artifacts, Mem0, and report generation ingest the compressed text correctly.
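
An env-override check from the first testing bullet might take a shape like this; the variable name is hypothetical, carried over from the sketch above:

```python
import os

def read_threshold(default: int = 20000) -> int:
    """Resolve the compression threshold, honoring an env override
    (variable name is a placeholder, not the project's actual knob)."""
    return int(os.getenv("TOOL_RESULT_MAX_CHARS", str(default)))

# Simulate an operator override and confirm it takes effect.
os.environ["TOOL_RESULT_MAX_CHARS"] = "512"
assert read_threshold() == 512

# Remove the override and confirm the default is restored.
os.environ.pop("TOOL_RESULT_MAX_CHARS")
assert read_threshold() == 20000
```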
