This repository was archived by the owner on Nov 29, 2025. It is now read-only.
Enhance conversation pruning & tool compression #90
Closed
Labels
enhancement (New feature or request) · priority: medium (Normal priority) · size: M (1-2 days)
Description
Why this feature?
Our pipeline compresses large tool results to stay within the context budgets of GPT-5, Moonshot, Polaris, and smaller LiteLLM models. Unlike the Strands SDK implementation (strands-agents/sdk-python#766 / strands-agents/docs#229), we have additional constraints: ToolRouterHook artifact offloads, Mem0 integration, and report generation all need to see consistent summaries. We need to finish polishing this flow so pruning works seamlessly across the stack.
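As a concrete illustration of the kind of compression involved, here is a minimal sketch. The names (`compress_tool_result`, `MAX_TOOL_RESULT_CHARS`) and the budget value are hypothetical, not this repo's actual API:

```python
import json

# Assumed per-model budget; the real value would be tuned per provider.
MAX_TOOL_RESULT_CHARS = 8_000


def compress_tool_result(result: str, max_chars: int = MAX_TOOL_RESULT_CHARS) -> str:
    """Return the result unchanged if it fits, otherwise a trimmed summary."""
    if len(result) <= max_chars:
        return result

    try:
        payload = json.loads(result)
    except (ValueError, TypeError):
        # Non-JSON text: keep the head and tail, drop the middle.
        half = max_chars // 2
        return (
            f"{result[:half]}\n"
            f"... [compressed: {len(result)} chars total] ...\n"
            f"{result[-half:]}"
        )

    if isinstance(payload, dict):
        # JSON objects: keep top-level keys with a short preview of each value
        # so the model still sees the shape of the data.
        preview = {key: str(value)[:200] for key, value in payload.items()}
        summary = json.dumps(preview, indent=2)
    else:
        summary = json.dumps(payload)[:max_chars]
    return f"[compressed tool result, original {len(result)} chars]\n{summary}"
```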
Goals
- Surface the compression env knobs in Quick Start/docs so operators can tune compression per provider/model (see the env sketch after this list).
- Emit structured logs/events when compression fires (original vs compressed size, thresholds) for observability.
- Ensure ToolRouterHook, Mem0, and report builder all see the same trimmed content (no duplicate banners or mismatched artifacts).
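For reference, the env knobs could be read roughly as below. The variable names are placeholders, since the real knob names live in this repo's config rather than in this issue:

```python
import os


def _env_int(name: str, default: int) -> int:
    """Read an integer override from the environment, falling back to a default."""
    raw = os.getenv(name)
    return int(raw) if raw and raw.isdigit() else default


# Placeholder knob names: budget, summary length, turns kept, and an on/off switch.
TOOL_RESULT_MAX_CHARS = _env_int("TOOL_RESULT_MAX_CHARS", 8_000)
TOOL_RESULT_SUMMARY_CHARS = _env_int("TOOL_RESULT_SUMMARY_CHARS", 2_000)
PRUNE_KEEP_LAST_TURNS = _env_int("PRUNE_KEEP_LAST_TURNS", 6)
TOOL_COMPRESSION_ENABLED = os.getenv("TOOL_COMPRESSION_ENABLED", "1") != "0"
```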
Proposal
- Update the docs and config samples with the pruning flow/env variables.
- Add CYBER_EVENT/log output whenever a tool result is compressed, including before/after sizes (sketched after this list).
- Audit ToolRouterHook + report builder so artifact offloads and compression banners remain consistent.
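A rough sketch of the structured output the second item describes. The event name, shape, and the `cyber_event` logging field are illustrative; the real implementation would go through the project's existing CYBER_EVENT channel:

```python
import logging

logger = logging.getLogger("tool_compression")


def emit_compression_event(tool_name: str, original: str, compressed: str, threshold: int) -> None:
    """Log original vs compressed sizes and the threshold that triggered compression."""
    event = {
        "event": "tool_result_compressed",  # illustrative event name
        "tool": tool_name,
        "original_chars": len(original),
        "compressed_chars": len(compressed),
        "threshold_chars": threshold,
    }
    # The dict goes into `extra` so handlers/metrics can pick it up as
    # structured data, while the message itself stays human-readable.
    logger.info(
        "compressed %s tool result: %d -> %d chars (threshold %d)",
        tool_name,
        len(original),
        len(compressed),
        threshold,
        extra={"cyber_event": event},
    )
```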
Testing
- Extend the existing test suite to cover JSON summarization, env overrides, and logging hooks (see the test sketch after this list).
- Run ops across different LiteLLM providers to confirm logs/metrics capture compression events and reports stay within context limits.
- Verify tool-router artifacts, Mem0, and report generation ingest the compressed text correctly.
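A sketch of what the added coverage could look like, using pytest's `monkeypatch`/`caplog` fixtures and the hypothetical helpers from the sketches above:

```python
import json
import logging

# compress_tool_result, emit_compression_event, and _env_int refer to the
# hypothetical helpers sketched earlier in this issue.


def test_env_override_changes_budget(monkeypatch):
    monkeypatch.setenv("TOOL_RESULT_MAX_CHARS", "500")  # placeholder knob name
    assert _env_int("TOOL_RESULT_MAX_CHARS", 8_000) == 500


def test_large_json_result_is_compressed_and_logged(caplog):
    caplog.set_level(logging.INFO)
    big_payload = json.dumps({"rows": ["x" * 50] * 100})

    compressed = compress_tool_result(big_payload, max_chars=500)
    emit_compression_event("scanner", big_payload, compressed, threshold=500)

    assert len(compressed) < len(big_payload)
    assert any("compressed" in record.getMessage() for record in caplog.records)
```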
References