
Phase 5: agents completion runtime knobs + cache invalidate #1

Merged
cmarguta-alg merged 1 commit into lab_week_3 from lab_week_3_phase5
May 6, 2026

Conversation

@cmarguta-alg
Owner

Summary

Phase 5 of the Agent Studio parity expansion. Closes the four completion query-param gaps and the cache-invalidate endpoint flagged by the parity audit against https://agent-studio.eu.algolia.com/rag-openapi.json. Smallest delta of the planned phases — chosen first to validate the per-phase-PR rhythm without introducing brand-new architectural surface.

Stacking note: this PR is intra-fork and bases on lab_week_3 (which is what PR algolia#212 targets upstream). When algolia#212 merges, this PR's base re-targets to algolia:main.

What's in

  • agents try and agents run learn --no-cache, --no-memory, --no-analytics, and --secure-user-token <jwt>. These map onto the backend's documented completion query params and the X-Algolia-Secure-User-Token header.
  • New agents cache invalidate <agent-id> [--before YYYY-MM-DD] wraps DELETE /1/agents/{id}/cache. Mirrors agents delete's confirmation contract: TTY prompts, non-TTY refuses without --confirm, --dry-run bypasses both since it's non-destructive.
  • First nested cobra group under agents (agents cache <verb>). Phase 6+ will reuse this for providers / conversations / keys / domains.
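
The confirmation contract described above can be sketched as a small pure function. This is a minimal illustration, not the code in pkg/cmd/agents/cache: the names `Action`, `decide`, and `errNeedsConfirm` are hypothetical, and it assumes that `--confirm` skips the prompt on a TTY as well.

```go
package main

import (
	"errors"
	"fmt"
)

// Action is what the command should do once its flags and terminal are known.
type Action int

const (
	ActionNone    Action = iota // nothing to do; returned alongside an error
	ActionPreview               // --dry-run: show the request, send nothing
	ActionPrompt                // interactive terminal: ask before deleting
	ActionProceed               // --confirm given: delete without asking
)

// errNeedsConfirm is a hypothetical sentinel; the real CLI's wording may differ.
var errNeedsConfirm = errors.New("non-interactive session: pass --confirm to invalidate the cache")

// decide applies the contract in order: dry-run wins because it is
// non-destructive, then an explicit --confirm, then the TTY prompt.
// A non-TTY session without --confirm is refused.
func decide(isTTY, confirm, dryRun bool) (Action, error) {
	switch {
	case dryRun:
		return ActionPreview, nil
	case confirm:
		return ActionProceed, nil
	case isTTY:
		return ActionPrompt, nil
	default:
		return ActionNone, errNeedsConfirm
	}
}

func main() {
	cases := []struct{ isTTY, confirm, dryRun bool }{
		{true, false, false},
		{false, false, false},
		{false, true, false},
		{false, false, true},
	}
	for _, tc := range cases {
		action, err := decide(tc.isTTY, tc.confirm, tc.dryRun)
		fmt.Printf("tty=%t confirm=%t dry-run=%t -> action=%d err=%v\n",
			tc.isTTY, tc.confirm, tc.dryRun, action, err)
	}
}
```

Keeping the decision separate from the prompt and the HTTP call makes each branch of the contract testable without a terminal.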

Polarity calls worth knowing

  • CompletionOptions.No* fields are inverted from the wire on purpose. The backend defaults all three params to true, so only the negative is worth exposing at the CLI surface, and memory=true would return a 422 (the schema is anyOf [{const false}, {type null}]). The wire form omits the param when the No* field is false and sends <param>=false when it is true. This is pinned end-to-end by a table-driven wire-mapping test.
  • --before date validation is deliberately the backend's job. Mirroring Pydantic's date parser in Go would create silent skew on minor backend bumps; the 422-detail surfacing already turns bad input into an actionable message verbatim.
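
To make the polarity rule concrete, here is a hedged sketch of the wire mapping. Only `memory` and the X-Algolia-Secure-User-Token header are named in this PR; the query-param names `cache` and `analytics`, the helper names `wireQuery` and `newCompletionRequest`, and the example endpoint are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// CompletionOptions follows the PR's No* convention. The real struct in
// api/agentstudio may carry more fields or different names.
type CompletionOptions struct {
	NoCache         bool
	NoMemory        bool
	NoAnalytics     bool
	SecureUserToken string
}

// wireQuery emits <param>=false only for options that are switched off.
// A false No* field produces no param at all, so the backend default applies
// and a value such as memory=true can never reach the wire.
func wireQuery(o CompletionOptions) url.Values {
	q := url.Values{}
	if o.NoCache {
		q.Set("cache", "false") // param name assumed
	}
	if o.NoMemory {
		q.Set("memory", "false")
	}
	if o.NoAnalytics {
		q.Set("analytics", "false") // param name assumed
	}
	return q
}

// newCompletionRequest builds a request carrying the options. The endpoint
// is supplied by the caller; the POST method is an assumption.
func newCompletionRequest(endpoint string, o CompletionOptions) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, endpoint, nil)
	if err != nil {
		return nil, err
	}
	q := req.URL.Query()
	for name, vals := range wireQuery(o) {
		q[name] = vals
	}
	req.URL.RawQuery = q.Encode()
	if o.SecureUserToken != "" {
		req.Header.Set("X-Algolia-Secure-User-Token", o.SecureUserToken)
	}
	return req, nil
}

func main() {
	req, err := newCompletionRequest("https://example.invalid/completions",
		CompletionOptions{NoMemory: true, SecureUserToken: "jwt"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String())
	fmt.Println(req.Header.Get("X-Algolia-Secure-User-Token"))
}
```

Because `url.Values.Encode` sorts by key, the resulting query string is deterministic, which is convenient for a table-driven test that compares exact strings.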

Refactor on second use deferred

Four new flags duplicate mechanically across try.go and run.go (8 lines per command). Could be extracted into a RegisterCompletionFlags helper, but try and run are exactly two consumers and the duplication is straight-line. If a third consumer appears, extract following the rule we've used since Phase 3 (PrintDryRun, NormalizeCompatibility).
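
For reference, the deferred helper might look roughly like the sketch below. It uses the standard library `flag` package as a stand-in for cobra's flag set so that it stays self-contained. The struct `CompletionFlags`, the helper `parseCompletionFlags`, and the help strings are hypothetical; only the flag names and the `RegisterCompletionFlags` name come from this PR.

```go
package main

import (
	"flag"
	"fmt"
)

// CompletionFlags holds the four values both commands would share.
type CompletionFlags struct {
	NoCache         bool
	NoMemory        bool
	NoAnalytics     bool
	SecureUserToken string
}

// RegisterCompletionFlags binds the four flags onto fs so that try and run
// (and any later consumer) register them identically.
func RegisterCompletionFlags(fs *flag.FlagSet, f *CompletionFlags) {
	fs.BoolVar(&f.NoCache, "no-cache", false, "disable the cache for this completion")
	fs.BoolVar(&f.NoMemory, "no-memory", false, "disable memory for this completion")
	fs.BoolVar(&f.NoAnalytics, "no-analytics", false, "disable analytics for this completion")
	fs.StringVar(&f.SecureUserToken, "secure-user-token", "", "JWT sent as X-Algolia-Secure-User-Token")
}

// parseCompletionFlags shows how a command would use the helper.
func parseCompletionFlags(command string, args []string) (CompletionFlags, error) {
	var f CompletionFlags
	fs := flag.NewFlagSet(command, flag.ContinueOnError)
	RegisterCompletionFlags(fs, &f)
	err := fs.Parse(args)
	return f, err
}

func main() {
	for _, command := range []string{"try", "run"} {
		f, err := parseCompletionFlags(command, []string{"--no-memory", "--secure-user-token", "jwt"})
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %+v\n", command, f)
	}
}
```

With only two consumers, the helper saves about as many lines as it costs, which is consistent with deferring it until a third command needs the same flags.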

Test plan

  • go test ./... — 73/73 packages green (was 72; +1 agents/cache)
  • go test -tags=e2e ./e2e -run TestAgents — 4 testscripts: try ✓, cache ✓ (new), dry-run ✓ (extended), list skipped (env-gated)
  • gofumpt -l clean across api/agentstudio, pkg/cmd/agents, e2e
  • golangci-lint run clean across the touched packages
  • task build succeeds
  • Manual surface check: algolia agents shows new cache group; algolia agents cache shows invalidate; agents run --help and agents try --help show the four new flags
  • Live verification deferred — staging test app still has no LLM provider, so a green completion isn't possible. Cache invalidation doesn't depend on a provider; will revive the live-completion smoke once Phase 6 lands provider CRUD.

Net diff

+736 / −21 across 13 files. 2 new files (pkg/cmd/agents/cache/, e2e/testscripts/agents/cache.txtar).

Made with Cursor

Closes the four completion query-param gaps and the cache-invalidate
endpoint flagged by the parity audit. Smallest delta of the planned
phases — chosen to land first because it validates the per-phase-PR
rhythm without introducing brand-new architectural surface.

What's in:

  - `agents try` and `agents run` learn `--no-cache`, `--no-memory`,
    `--no-analytics`, and `--secure-user-token <jwt>`. Maps onto the
    backend's documented completion query params and the
    X-Algolia-Secure-User-Token header.
  - New `agents cache invalidate <agent-id> [--before YYYY-MM-DD]`
    wraps `DELETE /1/agents/{id}/cache`. Mirrors `agents delete`'s
    confirmation contract: TTY prompts, non-TTY refuses without
    `--confirm`, `--dry-run` bypasses both since it's non-destructive.

Polarity calls (worth knowing for future contributors):

  - The CompletionOptions.No* fields are inverted from the wire on
    purpose. Backend defaults all three to true; only the negative is
    interesting to expose at the CLI surface, and `memory=true` would
    actually 422 (the schema is `anyOf [{const false}, {type null}]`).
    So the wire form omits the param when the No* field is false and
    sends `<param>=false` when true. Polarity is pinned end-to-end by
    a table-driven wire-mapping test in completions_test.go and
    smaller cmd-level guards in try/run.
  - Date validation on `--before` is deliberately the backend's job.
    Mirroring Pydantic's date parser in Go would create silent skew
    on minor backend bumps; the 422-detail surfacing already turns
    bad input into an actionable message verbatim.

Architecture: introduces the first nested cobra group under `agents`
(`agents cache <verb>`). Phase 6+ will reuse this for providers /
conversations / keys / domains. Per-verb sub-package wasn't worth it
for one verb; if `cache stats` etc. land later, split then.

Refactor on second use deferred deliberately: 4 new flags duplicate
mechanically across try.go and run.go (8 lines per command). Could
be extracted into a `RegisterCompletionFlags` helper, but try and run
are exactly two consumers and the duplication is straight-line. If a
third consumer appears, extract following the rule we've used since
Phase 3 (`PrintDryRun`, `NormalizeCompatibility`).

Coverage:

  - `api/agentstudio/`: 1 new client method (`InvalidateAgentCache`),
    extended `CompletionOptions`, table-driven wire-mapping test
    covering all four flag combinations + the secure-user-token
    header, four new cache-endpoint tests (no-before / with-before /
    404→ErrNotFound / structured-422-detail surfaces verbatim).
  - `pkg/cmd/agents/cache/`: new package, 6 unit tests covering happy
    path, before-flag, dry-run with and without `--before`, missing
    agent-id, non-TTY-without-confirm, and 404 propagation.
  - `pkg/cmd/agents/try/` and `run/`: one targeted end-to-end test
    each, asserting cobra→opts→client wiring is intact for all four
    new flags (catches polarity transposition, missed forwarding).
  - e2e: new `cache.txtar` (5 contract checks); `dry-run.txtar`
    extended with a regression that the new `--no-*` flags are
    wire-only and don't leak into the dry-run body preview.
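
The four cache-endpoint cases listed above can be exercised without a network by routing an `http.Client` through an in-memory handler. This is a sketch under stated assumptions, not the tests in `api/agentstudio/`: `invalidateAgentCache`, `fakeBackend`, and `fakeClient` are hypothetical stand-ins, and the `before` query-param name, the 204 success status, and the 422 body shape are assumed.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// ErrNotFound mirrors the sentinel named in the coverage notes.
var ErrNotFound = errors.New("agent not found")

// invalidateAgentCache is a stand-in for the client method. It forwards
// before untouched, so date validation stays on the backend.
func invalidateAgentCache(c *http.Client, baseURL, agentID, before string) error {
	u := baseURL + "/1/agents/" + url.PathEscape(agentID) + "/cache"
	if before != "" {
		u += "?" + url.Values{"before": {before}}.Encode()
	}
	req, err := http.NewRequest(http.MethodDelete, u, nil)
	if err != nil {
		return err
	}
	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	switch {
	case resp.StatusCode == http.StatusNotFound:
		return ErrNotFound
	case resp.StatusCode >= 400:
		body, _ := io.ReadAll(resp.Body)
		// Surface the backend's detail verbatim rather than rewording it.
		return fmt.Errorf("status %d: %s", resp.StatusCode, body)
	}
	return nil
}

// fakeBackend imitates the endpoint closely enough for the four cases.
func fakeBackend(w http.ResponseWriter, r *http.Request) {
	switch {
	case r.Method != http.MethodDelete:
		w.WriteHeader(http.StatusMethodNotAllowed)
	case r.URL.Path == "/1/agents/missing/cache":
		w.WriteHeader(http.StatusNotFound)
	case r.URL.Query().Get("before") == "not-a-date":
		w.WriteHeader(http.StatusUnprocessableEntity)
		io.WriteString(w, `{"detail":"invalid date"}`)
	default:
		w.WriteHeader(http.StatusNoContent)
	}
}

// handlerTransport serves requests from a handler in memory.
type handlerTransport struct{ h http.Handler }

func (t handlerTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	t.h.ServeHTTP(rec, r)
	return rec.Result(), nil
}

func fakeClient() *http.Client {
	return &http.Client{Transport: handlerTransport{h: http.HandlerFunc(fakeBackend)}}
}

func main() {
	cases := []struct{ name, agentID, before string }{
		{"no before", "a1", ""},
		{"with before", "a1", "2026-01-01"},
		{"unknown agent", "missing", ""},
		{"bad date", "a1", "not-a-date"},
	}
	for _, tc := range cases {
		err := invalidateAgentCache(fakeClient(), "http://backend.invalid", tc.agentID, tc.before)
		fmt.Printf("%-14s -> %v\n", tc.name, err)
	}
}
```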

Live verification deferred. Staging test app still has no LLM
provider configured, so a green completion isn't possible — but
cache invalidation doesn't depend on a provider, and the unit + e2e
coverage is exhaustive for the wire shapes. A future Phase 6 commit
that lands provider CRUD will revive the live-completion smoke.

Strategy: per-phase PR from here on. This branch
(`lab_week_3_phase5`) is cut from `lab_week_3` HEAD; the PR will
target `cmarguta-alg:lab_week_3` so the diff is bounded to Phase 5
only. When PR algolia#212 merges upstream, this PR's base re-targets to
`algolia:main`.

Net diff: +390 / −21 across 10 files, 2 new files (cache package,
cache.txtar). 13/13 agents packages green, all 4 e2e testscripts
pass (try / cache / dry-run / list-skipped-as-gated), gofumpt
clean, golangci-lint clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
cmarguta-alg merged commit 6eda6da into lab_week_3 on May 6, 2026
@cmarguta-alg
Owner Author

Folding into PR algolia#212 — see algolia#212. Per-phase fork PRs added coordination overhead with no review benefit (PR algolia#212 is a draft pre-review).

cmarguta-alg deleted the lab_week_3_phase5 branch on May 6, 2026 at 20:34