feat(formatter): compact agent mode output (~91% token reduction) #188
Draft: platinummonkey wants to merge 22 commits into main
Ports the JSON compression technique from rtk-ai/rtk (MIT) to reduce
LLM token consumption by ~91% on large list responses.
When --agent-compact or AGENT_COMPACT_MODE=1 is set, agent mode output
is compressed before being wrapped in the {status, data, metadata}
envelope:
- Null fields are stripped (large win on Datadog responses)
- Strings longer than 200 chars are truncated with a [N chars] annotation
- Top-level arrays are sampled to 20 items (+ "... +N more" sentinel)
- Nested arrays are sampled to 10 items
Measured on a 200-monitor list: 88,950 → 7,595 tokens (91% reduction).
The dominant saving is array sampling (200→20 items); null stripping
adds ~9,500 tokens on top.
- src/rtk.rs: new module — compress_json_string (new) + filter_json_string
ported verbatim from rtk-ai/rtk json_cmd.rs (MIT, Patrick Szymkowiak)
- src/config.rs: add compact_mode field, reads AGENT_COMPACT_MODE env var
- src/main.rs: add --agent-compact global flag, wires into cfg.compact_mode
- src/formatter.rs: format_and_print gains compact_mode param; compression
is a no-op unless compact_mode=true; falls back to full data when
compressed form would be larger (tiny payloads)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Exposes compact mode and its three tuning parameters through both the config file (~/.config/pup/config.yaml) and environment variables.

Config file keys:
  compact_mode: true           # also: AGENT_COMPACT_MODE=1
  compact_string_trunc: 200    # also: AGENT_COMPACT_STRING_TRUNC=N
  compact_array_top: 20        # also: AGENT_COMPACT_ARRAY_TOP=N
  compact_array_nested: 10     # also: AGENT_COMPACT_ARRAY_NESTED=N

Replaces the module-level constants in rtk.rs with a CompressConfig struct (with a Default impl) that is built from Config at call time and threaded through format_and_print as Option<&CompressConfig> (None = compact off, Some = compact on with those settings).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
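The CompressConfig described above can be sketched as follows; field names follow the commit message, and the env-var fallback mirrors the documented variables (the real struct lives in src/rtk.rs and carries more fields than this sketch):

```rust
use std::env;

/// Sketch of the CompressConfig described above; defaults match the
/// documented values (200 chars / 20 items / 10 nested items).
#[derive(Debug, Clone)]
pub struct CompressConfig {
    pub string_trunc: usize,  // truncate strings beyond this many chars
    pub array_top: usize,     // sample size for top-level arrays
    pub array_nested: usize,  // sample size for nested arrays
}

impl Default for CompressConfig {
    fn default() -> Self {
        Self { string_trunc: 200, array_top: 20, array_nested: 10 }
    }
}

impl CompressConfig {
    /// Build from environment variables, falling back to the defaults
    /// (env var names from the commit message; helper is illustrative).
    fn from_env() -> Self {
        let get = |key: &str, default: usize| {
            env::var(key).ok().and_then(|v| v.parse().ok()).unwrap_or(default)
        };
        Self {
            string_trunc: get("AGENT_COMPACT_STRING_TRUNC", 200),
            array_top: get("AGENT_COMPACT_ARRAY_TOP", 20),
            array_nested: get("AGENT_COMPACT_ARRAY_NESTED", 10),
        }
    }
}
```

Threading `Option<&CompressConfig>` through the formatter keeps the off path a true no-op: callers pass None and no compression code runs.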
formatter.rs imports crate::rtk, but lib.rs (the browser WASM crate root) did not declare the module, causing a compile failure.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
When compact mode is enabled, each command can now apply a field projector that keeps only the fields an agent needs, matching the pre-filtered output style of the Datadog MCP server. Projectors run before null-stripping and string truncation:
- monitors list/get: strips the options object (avalanche_window, locked, renotify_interval, etc.), org_id, multi, matching_downtimes, draft_status — keeps id, name, overall_state, type, query, message, tags, creator, modified
- logs search: flattens the nested attributes wrapper to a flat object — keeps id, timestamp, message, service, status, host, tags
- traces search/aggregate: flattens attributes, drops the verbose custom bag — keeps id, trace_id, service, operation_name, resource_name, status, start/end_timestamp, env, host, error_type

Architecture:
- CompressConfig gains `project: Option<fn(&Value) -> Value>`
- compress_json_string applies the projector to each top-level item before compression
- compress_cfg_from(cfg, command) wires the right projector via projection_for_command(command)
- Command files pass their meta.command into compress_cfg_from

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ld selection

Instead of hardcoded field allowlists, each command now declares FieldWeights (importance scores 0.0–1.0 per field) and the algorithm fills a per-item token budget greedily by value density (importance / token_cost).

Key properties:
- Must-have fields (≥0.9) are truncated to fit rather than dropped
- Zero-weight fields are always dropped regardless of budget
- Unlisted fields get a default_weight (auto-handles future API additions)
- Small low-importance fields survive when budget has room; large ones don't
- Budget adapts to actual data sizes — a tiny options object can survive; a 2KB one is dropped without any code change

Weight profiles added:
- MONITOR_WEIGHTS: options=0.05, matching_downtimes=0.02, id/name/state=1.0
- LOG_WEIGHTS: timestamp/message/service/status=1.0, tags=0.5
- SPAN_WEIGHTS: service/status/resource_name/error_type=1.0, custom bag dropped
- INCIDENT_WEIGHTS: id/title/severity/state/created=1.0, fields schema=0.02
- EVENT_WEIGHTS: title/timestamp=1.0, _dd internal bag=0.02

Architecture changes:
- CompressConfig: project → flatten (structural only) + field_weights (token budget)
- Added per_item_token_budget: usize (default 300 tokens ≈ 1200 chars)
- compress_value applies token_budget_compress_object at depth ≤ 1 (item level)
- weights_for_command() and flatten_for_command() replace projection_for_command()

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
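The greedy selection described above can be sketched as follows. This is a simplified illustration over string-valued fields with a crude chars-to-tokens estimate; function and parameter names here are illustrative, not the PR's actual signatures:

```rust
use std::collections::HashMap;

/// Greedy token-budget field selection (sketch of the scheme described
/// above). Token cost is approximated as value length / 4; the real
/// code also truncates oversized must-have fields rather than keeping
/// them whole.
fn select_fields(
    item: &HashMap<String, String>,
    weights: &HashMap<&str, f64>,
    default_weight: f64,
    budget: usize,
) -> Vec<String> {
    let mut scored: Vec<(&String, f64, usize)> = item
        .iter()
        .map(|(k, v)| {
            let w = *weights.get(k.as_str()).unwrap_or(&default_weight);
            let cost = v.len() / 4 + 1; // crude chars -> tokens estimate
            (k, w, cost)
        })
        .filter(|&(_, w, _)| w > 0.0) // zero-weight fields always dropped
        .collect();
    // Highest value density (importance / token cost) first.
    scored.sort_by(|a, b| {
        let da = a.1 / a.2 as f64;
        let db = b.1 / b.2 as f64;
        db.partial_cmp(&da).unwrap()
    });
    let mut spent = 0;
    let mut kept = Vec::new();
    for (k, w, cost) in scored {
        // Must-have fields (>= 0.9) are kept even when over budget.
        if spent + cost <= budget || w >= 0.9 {
            spent += cost;
            kept.push(k.clone());
        }
    }
    kept
}
```

Density ordering is what lets a cheap low-importance field (org_id, 1 token) survive while an expensive low-importance one (a 400-char options blob) is dropped under the same weight regime.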
300 tokens was too generous — monitors fit their entire options object within budget. At 150 tokens (~600 chars) the bulky options object (~61 tokens) can't fit after high-importance fields consume ~110 tokens, while cheap low-importance fields (org_id=1t, multi=2t) still survive.

Adds inline documentation explaining the three budget operating points:
- ~100 tokens → only must-have fields
- ~150 tokens → MCP-like density (default)
- ~300 tokens → relaxed, most fields survive

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… config

The token budget controls compression aggressiveness in compact agent mode. Higher budget = more fields survive per item; lower = only high-importance fields.

Exposed through all three config layers:
  --agent-budget 100               # CLI flag (highest priority)
  AGENT_COMPACT_ITEM_BUDGET=100    # env var
  compact_item_budget: 100         # ~/.config/pup/config.yaml

Operating points:
- ~100 tokens → only must-have fields (id, name, status, type)
- ~150 tokens → MCP-like density (default; options/org_id dropped)
- ~300 tokens → relaxed (most fields survive; only very bulky ones dropped)

The field importance ratios in FieldWeights still determine which fields are dropped first when the budget is tight — the budget is the dial, the weights are the relative priorities.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
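The three-layer precedence above (CLI flag over env var over config file over default) can be sketched as a simple resolver; the function name and signature here are illustrative:

```rust
/// Resolve the per-item token budget from the three config layers
/// described above: CLI flag (highest priority), then the
/// AGENT_COMPACT_ITEM_BUDGET env var, then the config-file value,
/// then the documented default of 150 (MCP-like density).
fn resolve_budget(cli: Option<usize>, config_file: Option<usize>) -> usize {
    cli.or_else(|| {
        std::env::var("AGENT_COMPACT_ITEM_BUDGET")
            .ok()
            .and_then(|v| v.parse().ok())
    })
    .or(config_file)
    .unwrap_or(150)
}
```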
…cs, events
Extends compact agent mode coverage to four more domains. Each gets a
structural flatten function (lifts nested attributes to top level) and
FieldWeights for token-budget selection, wired via output_cmd().
Dashboards (flatten_dashboards + DASHBOARD_WEIGHTS):
Extracts the dashboards[] array from the API wrapper so token-budget
compression applies to individual items: id=1.0, title=1.0, url=0.8,
description=0.6. No longer returns the entire list unfiltered.
Incidents (flatten_incident + updated INCIDENT_WEIGHTS):
Lifts attributes.{title, severity, state, created, commander,
customer_impacted} to top level. Drops the verbose `fields` schema
bag (multiselect/dropdown type metadata with no values).
Metrics (flatten_metric + METRIC_WEIGHTS):
Transforms the raw 180-point timeseries into a summary: min/max/avg,
trend (rising/falling/stable), 10 evenly-sampled values, ISO
timestamps. Similar to MCP's binned stats approach.
Events (flatten_event + updated EVENT_WEIGHTS):
Lifts outer attributes.{timestamp, message, tags} and inner
attributes.attributes.{title, service} to top level. Drops _dd
internal metadata bag.
All four commands now call formatter::output_cmd(cfg, &resp, "command")
instead of formatter::output(cfg, &resp) so weights/flatten are applied.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
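The metrics timeseries summary can be sketched as below. This is a minimal version over raw values only (the real flatten_metric also emits 20 binned buckets, ISO timestamps, and a URL); the trend comparison of the first vs last 10% of points is described elsewhere in this PR, while the ±10% thresholds here are an assumption:

```rust
/// Summarise a non-empty timeseries into compact stats plus a trend
/// label. Trend compares the mean of the first vs last 10% of points;
/// the 10% change thresholds are illustrative assumptions.
fn summarise_series(points: &[f64]) -> (f64, f64, f64, &'static str) {
    let count = points.len() as f64;
    let min = points.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = points.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let avg = points.iter().sum::<f64>() / count;
    let n = (points.len() / 10).max(1);
    let head: f64 = points[..n].iter().sum::<f64>() / n as f64;
    let tail: f64 = points[points.len() - n..].iter().sum::<f64>() / n as f64;
    let trend = if tail > head * 1.1 {
        "rising"
    } else if tail < head * 0.9 {
        "falling"
    } else {
        "stable"
    };
    (min, max, avg, trend)
}
```

Summarising this way replaces a 180-point array with a handful of numbers, which is where most of the metrics-path token savings come from.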
Previously flatten + token-budget only applied when agent_mode && JSON.
Switching to --output yaml/csv/table in agent mode returned full raw
data, bypassing the structural flatten (attributes nesting, etc.) and
field-weight selection entirely.
Now the sort → data-key hoist → flatten → token-budget pipeline runs
for all formats when in agent mode. The only difference is that JSON
wraps the result in the {status, data, metadata} envelope while
YAML/CSV/table output the compressed data directly.
This means `pup --output yaml logs search` in agent mode returns the
flat {timestamp, message, service, status, host} structure instead of
the raw attributes.attributes nesting, and `--output csv monitors list`
omits the options object just as JSON compact mode does.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… binning
Monitor weights:
- Drop creator (0.0): expensive ~30-token object, rarely needed for triage
- Drop created_at (0.0): Unix ms duplicate of created, both superseded by modified
- Drop created (0.0): ISO timestamp, modified is more actionable for triage
- monitors now show: id/name/overall_state/type/query/tags/message/notifications/modified
Metrics (flatten_metric / summarise_series):
- Add overall_stats: {count, min, max, avg, sum} — matches MCP format
- Add binned: 20 ordered time buckets with ISO start_time, count, min, max, avg
- Add url: human-readable link (https://app.datadoghq.com/metric/explorer?...)
- Keep trend indicator (rising/falling/stable) from first vs last 10% comparison
- Remove json_f64 helper, now unused after switching to serde_json::json!()
Re: item 3 (limit vs token budget): the --limit CLI flag controls API fetch size
(default 200 for monitors). The token budget controls fields per item; array_items_top
(default 20) controls how many items appear in compact output. These are independent —
never use --limit 5 in agent compact mode; let array_items_top do the display limiting.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ies output

The 20-bin series object produced by flatten_metric costs ~250 tokens — exceeding the 150-token per-item budget even though series has importance 1.0. The current code drops must-have objects that can't be string-truncated, so the entire series was silently absent from metric output.

flatten_metric already produces the right compact structure (overall_stats, 20 time bins, trend, url). Field-weight selection adds no value here and actively breaks the output. Route metrics query to None for weights so only the structural flatten runs, matching MCP's format.

METRIC_WEIGHTS is kept with #[allow(dead_code)] for future reference if a per-field budget approach is revisited.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Brings in:
- feat/tsv-output-formatter (PR #189): OutputFormat::Tsv, print_tsv
- feat/incidents-default-active (PR #190): incidents list now defaults to state:active, sorted by most recent

Merge fixes:
- Add OutputFormat::Tsv arm to agent mode format dispatch in format_and_print
- Update TSV test call to match 5-arg format_and_print signature

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
search_incidents returns a different shape than list_incidents:
- list_incidents: response.data[] → each item is IncidentResponseData
- search_incidents: response.data.attributes.incidents[].data → same type

The previous flatten_incident was called on the outer data object (type/attributes wrapper) instead of each incident, producing empty output.

Add flatten_incidents_search that:
1. Extracts effective_data.attributes.incidents[]
2. Unwraps each item's .data field (IncidentSearchResponseIncidentsData.data)
3. Applies flatten_incident to each IncidentResponseData

Wire "incidents list" → flatten_incidents_search (search endpoint).
Keep "incidents get" → flatten_incident (get endpoint, no extra wrapper).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Constructs https://app.datadoghq.com/incidents/<id> in flatten_incident so agents can link directly to incidents without a separate lookup. Also adds url to INCIDENT_WEIGHTS at 1.0 (must-have) so it always survives the token budget.

Closes the last meaningful gap between pup and MCP for incidents: pup now shows commander, severity, state, created, customer_impacted, and a direct URL — matching MCP's TSV output.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
format_and_print gained a compress_cfg parameter as part of the compact agent output feature, but the workflows instance_list call site was not updated, causing compilation failures across all CI jobs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rtk.rs uses chrono for ms_to_iso timestamp formatting and is included in the browser WASM build, but chrono was missing from the browser feature's dependency list, causing compilation failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…erge compilation

Add URL construction to all remaining flatten functions:
- flatten_span: https://app.datadoghq.com/apm/trace/<trace_id>
- flatten_event: https://app.datadoghq.com/event/event?id=<id>
- flatten_monitor (new): https://app.datadoghq.com/monitors/<id>

Wire flatten_monitor into flatten_for_command for "monitors list" and "monitors get". All three weights tables gain url at 1.0 (must-have).

Also fix post-merge build failures:
- idp.rs: format_and_print calls missing the new compress_cfg argument
- test_commands.rs: Config initializers missing compact_* fields

All 536 unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Summary
Adds opt-in compressed output for agent mode, porting the JSON compression technique from rtk-ai/rtk (MIT). When enabled, the `{status, data, metadata}` agent envelope is returned with data compressed: nulls stripped, long strings truncated, arrays sampled, and fields selected by a per-command token budget. The LLM receives real, actionable values — not type descriptors.
Measured on a 200-monitor list response: 88,950 → 7,595 tokens (91% reduction).
Changes
Core compression (src/rtk.rs)
- compress_json_string (new) + filter_json_string ported from rtk-ai/rtk (MIT, Patrick Szymkowiak)
- CompressConfig struct with configurable limits (replaces module-level constants)

Configuration (src/config.rs, src/main.rs)
- --agent-compact global flag + AGENT_COMPACT_MODE=1 env var
- compact_string_trunc / AGENT_COMPACT_STRING_TRUNC (default 200 chars)
- compact_array_top / AGENT_COMPACT_ARRAY_TOP (default 20 items)
- compact_array_nested / AGENT_COMPACT_ARRAY_NESTED (default 10 items)
- compact_item_budget / AGENT_COMPACT_ITEM_BUDGET / --agent-budget (default 150 tokens)

Token-budget field selection (src/formatter.rs, src/rtk.rs)
- FieldWeights (importance scores 0.0–1.0 per field) + greedy token-budget allocation
- MONITOR_WEIGHTS, LOG_WEIGHTS, SPAN_WEIGHTS, INCIDENT_WEIGHTS, EVENT_WEIGHTS, DASHBOARD_WEIGHTS

Per-command structural flattens
- Monitors: drops options, creator, created*, matching_downtimes, org_id; keeps id/name/state/type/query/tags/message/modified
- Logs: lifts attributes.attributes to top level; flat {id, timestamp, message, service, status, host, tags}
- Dashboards: extracts dashboards[] array from API wrapper; id/title/url/description
- Incidents: lifts attributes.{title,severity,state,created,commander,customer_impacted} + direct url (https://app.datadoghq.com/incidents/<id>); handles both list (search envelope) and get response shapes
- Metrics: summarises the timeseries to {overall_stats, binned[20], trend, url} (matches MCP format); bypasses field-weight selection (structural flatten only)
- Events: lifts attributes.attributes.{title,service} to top level; drops _dd internal metadata

Format coverage
Bug fixes
- fix(wasm): declare mod rtk in lib.rs for browser WASM build
- fix(wasm): add chrono to browser feature deps (used by ms_to_iso in rtk.rs)
- fix(formatter): remove METRIC_WEIGHTS from routing — field-weight selection was silently dropping the entire series object; structural flatten only
- fix(incidents): unwrap search_incidents double envelope (data.attributes.incidents[].data) before flattening
- fix(workflows): add missing compress_cfg arg to format_and_print

Compression breakdown (200-monitor list)
"... +180 more")deleted,priority,restricted_roles, …)Usage
Testing
- Unit tests in src/rtk.rs covering compress and schema functions
- cargo clippy -- -D warnings clean

🤖 Generated with Claude Code