Skip to content

feat(formatter): compact agent mode output (~91% token reduction)#188

Draft
platinummonkey wants to merge 22 commits intomainfrom
feat/compact-agent-output
Draft

feat(formatter): compact agent mode output (~91% token reduction)#188
platinummonkey wants to merge 22 commits intomainfrom
feat/compact-agent-output

Conversation

@platinummonkey
Copy link
Copy Markdown
Collaborator

@platinummonkey platinummonkey commented Mar 11, 2026

Summary

Adds opt-in compressed output for agent mode, porting the JSON compression technique from rtk-ai/rtk (MIT). When enabled, the `{status, data, metadata}` agent envelope is returned with data compressed: nulls stripped, long strings truncated, arrays sampled, and fields selected by a per-command token budget. The LLM receives real, actionable values — not type descriptors.

Measured on a 200-monitor list response: 88,950 → 7,595 tokens (91% reduction).

Changes

Core compression (src/rtk.rs)

  • New module: compress_json_string (new) + filter_json_string ported from rtk-ai/rtk (MIT, Patrick Szymkowiak)
  • Compression pipeline: null stripping → string truncation → array sampling
  • CompressConfig struct with configurable limits (replaces module-level constants)

Configuration (src/config.rs, src/main.rs)

  • --agent-compact global flag + AGENT_COMPACT_MODE=1 env var
  • Tunable limits via config file or env vars:
    • compact_string_trunc / AGENT_COMPACT_STRING_TRUNC (default 200 chars)
    • compact_array_top / AGENT_COMPACT_ARRAY_TOP (default 20 items)
    • compact_array_nested / AGENT_COMPACT_ARRAY_NESTED (default 10 items)
    • compact_item_budget / AGENT_COMPACT_ITEM_BUDGET / --agent-budget (default 150 tokens)

Token-budget field selection (src/formatter.rs, src/rtk.rs)

  • FieldWeights (importance scores 0.0–1.0 per field) + greedy token-budget allocation
  • Must-have fields (≥0.9) are truncated to fit rather than dropped; zero-weight fields always dropped
  • Budget operating points: ~100 = id/name/status only; ~150 = MCP-like density (default); ~300 = relaxed
  • Per-command weight profiles: MONITOR_WEIGHTS, LOG_WEIGHTS, SPAN_WEIGHTS, INCIDENT_WEIGHTS, EVENT_WEIGHTS, DASHBOARD_WEIGHTS

Per-command structural flattens

  • Monitors: drops options, creator, created*, matching_downtimes, org_id; keeps id/name/state/type/query/tags/message/modified
  • Logs: lifts attributes.attributes to top level; flat {id, timestamp, message, service, status, host, tags}
  • Traces: flattens attributes, drops verbose custom bag
  • Dashboards: extracts dashboards[] array from API wrapper; id/title/url/description
  • Incidents: lifts attributes.{title,severity,state,created,commander,customer_impacted} + direct url (https://app.datadoghq.com/incidents/); handles both list (search envelope) and get response shapes
  • Metrics: transforms raw 180-point timeseries → {overall_stats, binned[20], trend, url} (matches MCP format); bypasses field-weight selection (structural flatten only)
  • Events: lifts nested attributes.attributes.{title,service} to top level; drops _dd internal metadata

Format coverage

  • Compression pipeline (flatten → token-budget) runs for all output formats in agent mode (JSON, YAML, CSV, table, TSV) — not just JSON

Bug fixes

  • fix(wasm): declare mod rtk in lib.rs for browser WASM build
  • fix(wasm): add chrono to browser feature deps (used by ms_to_iso in rtk.rs)
  • fix(formatter): remove METRIC_WEIGHTS from routing — field-weight selection was silently dropping the entire series object; structural flatten only
  • fix(incidents): unwrap search_incidents double envelope (data.attributes.incidents[].data) before flattening
  • fix(workflows): add missing compress_cfg arg to format_and_print

Compression breakdown (200-monitor list)

Technique Tokens saved
Array sampling (200 → 20 items + "... +180 more") ~81,368
Null stripping (deleted, priority, restricted_roles, …) ~9,508
String truncation (messages, queries > 200 chars) ~3,758

Usage

# Via env var (set in shell profile or Claude Code hooks)
AGENT_COMPACT_MODE=1 pup monitors list

# Via flag
pup --agent-compact monitors list

# Tune token budget per item (default 150)
pup --agent-compact --agent-budget 100 monitors list   # id/name/status only
pup --agent-compact --agent-budget 300 monitors list   # most fields survive

Testing

  • 14 unit tests in src/rtk.rs covering compress and schema functions
  • All existing tests pass
  • cargo clippy -- -D warnings clean

🤖 Generated with Claude Code

platinummonkey and others added 20 commits March 11, 2026 12:21
Ports the JSON compression technique from rtk-ai/rtk (MIT) to reduce
LLM token consumption by ~91% on large list responses.

When --agent-compact or AGENT_COMPACT_MODE=1 is set, agent mode output
is compressed before being wrapped in the {status, data, metadata}
envelope:
- Null fields are stripped (large win on Datadog responses)
- Strings longer than 200 chars are truncated with a [N chars] annotation
- Top-level arrays are sampled to 20 items (+ "... +N more" sentinel)
- Nested arrays are sampled to 10 items

Measured on a 200-monitor list: 88,950 → 7,595 tokens (91% reduction).
The dominant saving is array sampling (200→20 items); null stripping
adds ~9,500 tokens on top.

- src/rtk.rs: new module — compress_json_string (new) + filter_json_string
  ported verbatim from rtk-ai/rtk json_cmd.rs (MIT, Patrick Szymkowiak)
- src/config.rs: add compact_mode field, reads AGENT_COMPACT_MODE env var
- src/main.rs: add --agent-compact global flag, wires into cfg.compact_mode
- src/formatter.rs: format_and_print gains compact_mode param; compression
  is a no-op unless compact_mode=true; falls back to full data when
  compressed form would be larger (tiny payloads)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Exposes compact mode and its three tuning parameters through both the
config file (~/.config/pup/config.yaml) and environment variables.

Config file keys:
  compact_mode: true              # also: AGENT_COMPACT_MODE=1
  compact_string_trunc: 200       # also: AGENT_COMPACT_STRING_TRUNC=N
  compact_array_top: 20           # also: AGENT_COMPACT_ARRAY_TOP=N
  compact_array_nested: 10        # also: AGENT_COMPACT_ARRAY_NESTED=N

Replaces the module-level constants in rtk.rs with a CompressConfig
struct (with Default impl) that is built from Config at call time and
threaded through format_and_print as Option<&CompressConfig> (None =
compact off, Some = compact on with those settings).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
formatter.rs imports crate::rtk but lib.rs (the browser WASM crate
root) did not declare the module, causing a compile failure.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
When compact mode is enabled, each command can now apply a field
projector that keeps only the fields an agent needs, matching the
pre-filtered output style of the Datadog MCP server.

Projectors run before null-stripping and string truncation:
- monitors list/get: strips options object (avalanche_window, locked,
  renotify_interval, etc.), org_id, multi, matching_downtimes, draft_status —
  keeps id, name, overall_state, type, query, message, tags, creator, modified
- logs search: flattens nested attributes wrapper to a flat object —
  keeps id, timestamp, message, service, status, host, tags
- traces search/aggregate: flattens attributes, drops verbose custom bag —
  keeps id, trace_id, service, operation_name, resource_name, status,
  start/end_timestamp, env, host, error_type

Architecture:
- CompressConfig gains `project: Option<fn(&Value) -> Value>`
- compress_json_string applies the projector to each top-level item
  before compression
- compress_cfg_from(cfg, command) wires the right projector via
  projection_for_command(command)
- Command files pass their meta.command into compress_cfg_from

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ld selection

Instead of hardcoded field allowlists, each command now declares FieldWeights
(importance scores 0.0–1.0 per field) and the algorithm fills a per-item token
budget greedily by value density (importance / token_cost).

Key properties:
- Must-have fields (≥0.9) are truncated to fit rather than dropped
- Zero-weight fields are always dropped regardless of budget
- Unlisted fields get a default_weight (auto-handles future API additions)
- Small low-importance fields survive when budget has room; large ones don't
- Budget adapts to actual data sizes — a tiny options object can survive;
  a 2KB one is dropped without any code change

Weight profiles added:
- MONITOR_WEIGHTS: options=0.05, matching_downtimes=0.02, id/name/state=1.0
- LOG_WEIGHTS: timestamp/message/service/status=1.0, tags=0.5
- SPAN_WEIGHTS: service/status/resource_name/error_type=1.0, custom bag dropped
- INCIDENT_WEIGHTS: id/title/severity/state/created=1.0, fields schema=0.02
- EVENT_WEIGHTS: title/timestamp=1.0, _dd internal bag=0.02

Architecture changes:
- CompressConfig: project → flatten (structural only) + field_weights (token budget)
- Added per_item_token_budget: usize (default 300 tokens ≈ 1200 chars)
- compress_value applies token_budget_compress_object at depth ≤ 1 (item level)
- weights_for_command() and flatten_for_command() replace projection_for_command()

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
300 tokens was too generous — monitors fit their entire options object
within budget. At 150 tokens (~600 chars) the bulky options object (~61
tokens) can't fit after high-importance fields consume ~110 tokens,
while cheap low-importance fields (org_id=1t, multi=2t) still survive.

Adds inline documentation explaining the three budget operating points:
  ~100 tokens → only must-have fields
  ~150 tokens → MCP-like density (default)
  ~300 tokens → relaxed, most fields survive

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… config

The token budget controls compression aggressiveness in compact agent mode.
Higher budget = more fields survive per item; lower = only high-importance
fields. Exposed through all three config layers:

  --agent-budget 100          # CLI flag (highest priority)
  AGENT_COMPACT_ITEM_BUDGET=100  # env var
  compact_item_budget: 100    # ~/.config/pup/config.yaml

Operating points:
  ~100 tokens → only must-have fields (id, name, status, type)
  ~150 tokens → MCP-like density (default; options/org_id dropped)
  ~300 tokens → relaxed (most fields survive; only very bulky ones dropped)

The field importance ratios in FieldWeights still determine which fields
are dropped first when the budget is tight — the budget is the dial,
the weights are the relative priorities.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…cs, events

Extends compact agent mode coverage to four more domains. Each gets a
structural flatten function (lifts nested attributes to top level) and
FieldWeights for token-budget selection, wired via output_cmd().

Dashboards (flatten_dashboards + DASHBOARD_WEIGHTS):
  Extracts the dashboards[] array from the API wrapper so token-budget
  compression applies to individual items: id=1.0, title=1.0, url=0.8,
  description=0.6. No longer returns the entire list unfiltered.

Incidents (flatten_incident + updated INCIDENT_WEIGHTS):
  Lifts attributes.{title, severity, state, created, commander,
  customer_impacted} to top level. Drops the verbose `fields` schema
  bag (multiselect/dropdown type metadata with no values).

Metrics (flatten_metric + METRIC_WEIGHTS):
  Transforms the raw 180-point timeseries into a summary: min/max/avg,
  trend (rising/falling/stable), 10 evenly-sampled values, ISO
  timestamps. Similar to MCP's binned stats approach.

Events (flatten_event + updated EVENT_WEIGHTS):
  Lifts outer attributes.{timestamp, message, tags} and inner
  attributes.attributes.{title, service} to top level. Drops _dd
  internal metadata bag.

All four commands now call formatter::output_cmd(cfg, &resp, "command")
instead of formatter::output(cfg, &resp) so weights/flatten are applied.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Previously flatten + token-budget only applied when agent_mode && JSON.
Switching to --output yaml/csv/table in agent mode returned full raw
data, bypassing the structural flatten (attributes nesting, etc.) and
field-weight selection entirely.

Now the sort → data-key hoist → flatten → token-budget pipeline runs
for all formats when in agent mode. The only difference is that JSON
wraps the result in the {status, data, metadata} envelope while
YAML/CSV/table output the compressed data directly.

This means `pup --output yaml logs search` in agent mode returns the
flat {timestamp, message, service, status, host} structure instead of
the raw attributes.attributes nesting, and `--output csv monitors list`
omits the options object just as JSON compact mode does.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… binning

Monitor weights:
- Drop creator (0.0): expensive ~30-token object, rarely needed for triage
- Drop created_at (0.0): Unix ms duplicate of created, both superseded by modified
- Drop created (0.0): ISO timestamp, modified is more actionable for triage
- monitors now show: id/name/overall_state/type/query/tags/message/notifications/modified

Metrics (flatten_metric / summarise_series):
- Add overall_stats: {count, min, max, avg, sum} — matches MCP format
- Add binned: 20 ordered time buckets with ISO start_time, count, min, max, avg
- Add url: human-readable link (https://app.datadoghq.com/metric/explorer?...)
- Keep trend indicator (rising/falling/stable) from first vs last 10% comparison
- Remove json_f64 helper, now unused after switching to serde_json::json!()

Re: item 3 (limit vs token budget): the --limit CLI flag controls API fetch size
(default 200 for monitors). The token budget controls fields per item; array_items_top
(default 20) controls how many items appear in compact output. These are independent —
never use --limit 5 in agent compact mode; let array_items_top do the display limiting.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ies output

The 20-bin series object produced by flatten_metric costs ~250 tokens —
exceeding the 150-token per-item budget even though series has importance
1.0. The current code drops must-have objects that can't be string-truncated,
so the entire series was silently absent from metric output.

flatten_metric already produces the right compact structure (overall_stats,
20 time bins, trend, url). Field-weight selection adds no value here and
actively breaks the output. Route metrics query to None for weights so
only the structural flatten runs, matching MCP's format.

METRIC_WEIGHTS is kept with #[allow(dead_code)] for future reference if
a per-field budget approach is revisited.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Brings in:
- feat/tsv-output-formatter (PR #189): OutputFormat::Tsv, print_tsv
- feat/incidents-default-active (PR #190): incidents list now defaults
  to state:active, sorted by most recent

Merge fixes:
- Add OutputFormat::Tsv arm to agent mode format dispatch in format_and_print
- Update TSV test call to match 5-arg format_and_print signature

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
search_incidents returns a different shape than list_incidents:
  list_incidents:    response.data[] → each item is IncidentResponseData
  search_incidents:  response.data.attributes.incidents[].data → same type

The previous flatten_incident was called on the outer data object
(type/attributes wrapper) instead of each incident, producing empty output.

Add flatten_incidents_search that:
  1. Extracts effective_data.attributes.incidents[]
  2. Unwraps each item's .data field (IncidentSearchResponseIncidentsData.data)
  3. Applies flatten_incident to each IncidentResponseData

Wire "incidents list" → flatten_incidents_search (search endpoint)
Keep "incidents get"  → flatten_incident (get endpoint, no extra wrapper)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Constructs https://app.datadoghq.com/incidents/<id> in flatten_incident
so agents can link directly to incidents without a separate lookup.
Also adds url to INCIDENT_WEIGHTS at 1.0 (must-have) so it always
survives the token budget.

Closes the last meaningful gap between pup and MCP for incidents:
pup now shows commander, severity, state, created, customer_impacted,
and a direct URL — matching MCP's TSV output.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
format_and_print gained a compress_cfg parameter as part of the compact
agent output feature, but the workflows instance_list call site was not
updated, causing compilation failures across all CI jobs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rtk.rs uses chrono for ms_to_iso timestamp formatting and is included
in the browser WASM build, but chrono was missing from the browser
feature's dependency list, causing compilation failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@platinummonkey platinummonkey added enhancement New feature or request usability labels Mar 24, 2026
platinummonkey and others added 2 commits March 28, 2026 09:20
…erge compilation

Add URL construction to all remaining flatten functions:
- flatten_span: https://app.datadoghq.com/apm/trace/<trace_id>
- flatten_event: https://app.datadoghq.com/event/event?id=<id>
- flatten_monitor (new): https://app.datadoghq.com/monitors/<id>

Wire flatten_monitor into flatten_for_command for "monitors list" and
"monitors get". All three weights tables gain url at 1.0 (must-have).

Also fix post-merge build failures:
- idp.rs: format_and_print calls missing the new compress_cfg argument
- test_commands.rs: Config initializers missing compact_* fields

All 536 unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request usability

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant