
feat(0.10.1): G1-G4 + C1 + H1-H4 follow-up sweep#94

Merged
tbitcs merged 1 commit into develop from feat/0.10.1-followup
May 4, 2026
Conversation


tbitcs (Contributor) commented on May 4, 2026

0.10.1 follow-up sweep — diversity / capabilities / phase routing / trace seal / token threading + docs

Closes the remaining G1-G4, C1, and H1-H4 todos from the 0.10 multi-agent sprint plan.

What landed

G1 — Diversity guard on agents add

ProfileStore.diversity_warnings walks the profile population and warns
when the reviewer (or architect) shares a provider family with the
coder. A new PROVIDER_FAMILIES table groups OpenAI-family endpoints
(openai-compat, azure-openai) and Ollama-family backends
(llamacpp, vllm, lmstudio), so a self-hosted vLLM coder paired with an
Ollama reviewer is correctly flagged as same-family. The CLI prints
yellow warnings (non-fatal), and the --json output now includes
diversity_warnings.
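A minimal sketch of how the guard could work, assuming the names from the description (PROVIDER_FAMILIES, provider_family, diversity_warnings); the real signatures and role model in ProfileStore may differ:

```python
# Hypothetical sketch of the G1 diversity guard. Providers not in the
# table fall back to their own name as a singleton family.
PROVIDER_FAMILIES = {
    "openai": "openai",
    "openai-compat": "openai",
    "azure-openai": "openai",
    "ollama": "ollama",
    "llamacpp": "ollama",
    "vllm": "ollama",
    "lmstudio": "ollama",
}

def provider_family(provider: str) -> str:
    return PROVIDER_FAMILIES.get(provider, provider)

def diversity_warnings(profiles: dict[str, str]) -> list[str]:
    """profiles maps role -> provider, e.g. {"coder": "vllm", "reviewer": "ollama"}."""
    warnings = []
    coder_family = provider_family(profiles.get("coder", ""))
    for role in ("reviewer", "architect"):
        provider = profiles.get(role)
        if provider and provider_family(provider) == coder_family:
            warnings.append(
                f"{role} ({provider}) shares the {coder_family} family with the coder"
            )
    return warnings
```

With this grouping, a vLLM coder and an Ollama reviewer both resolve to the "ollama" family, which is exactly the same-family case the PR calls out.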

G2 — Capability filter

ProfileStore.filter_by_capability(capability) plus a new
specsmith agents list --capability code-review --json flag. The
extension counterpart (AgentsClient.filterAgentsByCapability) ships
in the matching specsmith-vscode PR.
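The filter itself is likely a straight membership test over each profile's capability set; this sketch assumes a Profile shape with a `capabilities` set, which is not spelled out in the PR:

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    name: str
    capabilities: set[str] = field(default_factory=set)

def filter_by_capability(profiles: list[Profile], capability: str) -> list[Profile]:
    # Keep only profiles that advertise the requested capability;
    # matching is assumed exact and case-sensitive here.
    return [p for p in profiles if capability in p.capabilities]
```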

G3 — phase next auto-routes

Advancing the AEE phase now pins a synthetic phase:active route to
the new phase's preferred profile (and seeds the canonical
phase:<key> entry on first advance). The runner can flip the whole
session by watching that single route change instead of teaching the
user seven agents route set commands.
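The routing update can be sketched as two writes into a route table; the seed-on-first-advance behavior maps naturally onto `setdefault`. Names and the dict-based route store are assumptions, not the actual implementation:

```python
# Hypothetical sketch of the G3 auto-route: one canonical entry per
# phase, plus a synthetic "phase:active" alias that always tracks the
# current phase.
def advance_phase(routes: dict[str, str], phase_key: str, preferred_profile: str) -> None:
    # Seed the canonical entry only on first advance, so a later user
    # override of phase:<key> is never clobbered.
    routes.setdefault(f"phase:{phase_key}", preferred_profile)
    # Always flip the synthetic active route to the new phase's profile.
    routes["phase:active"] = preferred_profile
```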

G4 — TraceVault seal on /agent

In-chat /agent <id> writes a decision seal chained into
.specsmith/trace.jsonl so every per-turn profile pin is auditable.
Best-effort: a read-only filesystem / missing project root must never
break the chat loop.
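A hash-chained append to a JSONL file, wrapped so filesystem errors are swallowed, could look like the following; the record fields and chaining scheme here are illustrative guesses, not TraceVault's actual format:

```python
import hashlib
import json
import time
from pathlib import Path

def seal_agent_pin(trace_path: Path, profile_id: str) -> None:
    """Append a decision seal chained to the previous record's hash.

    Best-effort: any OSError (read-only fs, missing project root) is
    swallowed so the chat loop is never broken.
    """
    try:
        prev_hash = "0" * 64  # genesis value for an empty trace
        if trace_path.exists():
            lines = trace_path.read_text().splitlines()
            if lines:
                prev_hash = json.loads(lines[-1])["hash"]
        record = {
            "event": "agent_pin",
            "profile": profile_id,
            "ts": time.time(),
            "prev": prev_hash,
        }
        # Hash the record (minus its own hash) to chain the next entry.
        record["hash"] = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        with trace_path.open("a") as f:
            f.write(json.dumps(record) + "\n")
    except OSError:
        pass
```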

C1 — Token threading

Each provider driver now returns (text, _UsageDelta) and surfaces
real token counts:

  • Ollama — prompt_eval_count + eval_count from the final done message
  • Anthropic — final_message.usage.input_tokens / output_tokens
  • OpenAI — stream_options.include_usage makes the final SSE chunk carry the usage block
  • Gemini — usage_metadata.prompt_token_count / candidates_token_count
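For the Ollama case, the extraction can be sketched as below. The `_UsageDelta` name comes from the PR; its field names here (`tokens_in`/`tokens_out`) are assumptions, while `prompt_eval_count` and `eval_count` are the fields Ollama reports on its final `done` message:

```python
from dataclasses import dataclass

@dataclass
class _UsageDelta:
    tokens_in: int = 0
    tokens_out: int = 0

def usage_from_ollama(done_msg: dict) -> _UsageDelta:
    # Ollama's final streamed message (done=True) carries the real
    # token counts; default to 0 if the fields are absent.
    return _UsageDelta(
        tokens_in=done_msg.get("prompt_eval_count", 0),
        tokens_out=done_msg.get("eval_count", 0),
    )
```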

When the SDK omits usage, a 4-chars/token heuristic fills in so the
TokenMeter chip is never zero. Counts flow through
ChatRunResult.tokens_in/out/cost_usd into AgentState.credit() and
the per-profile by_profile bucket.
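The fallback heuristic amounts to a ceiling division; this sketch additionally clamps to a minimum of 1 so the chip is never zero, which is an inference from the description rather than the exact implementation:

```python
def estimate_tokens(text: str) -> int:
    # 4-chars/token fallback when the SDK omits usage. Ceiling division
    # plus a floor of 1 keeps the TokenMeter chip from reading zero.
    return max(1, -(-len(text) // 4))
```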

H1-H4 — Docs

  • docs/site/agents.md — preset → route → per-session → BYOE walkthrough
  • docs/site/quickstart.md — reproduction script + GIF placeholder
  • docs/site/vscode-extension.md — eight new commands documented (sister-PR scope)
  • README — multi-agent + BYOE elevator pitch up top

Verification

  • ruff check src/ tests/ — All checks passed!
  • ruff format --check src/ tests/ — 148 files already formatted
  • pytest -q — 448 passed, 1 skipped
  • python -m specsmith.cli api-surface | diff - tests/fixtures/api_surface.json — clean (no surface drift)

Out of scope / follow-up

  • A5 release tags — land once this PR + the matching specsmith-vscode#? PR merge
  • Marketplace publish — separate manual step gated on VSCE_PAT

tbitcs merged commit 9f961a0 into develop on May 4, 2026
13 checks passed
tbitcs deleted the feat/0.10.1-followup branch on May 4, 2026 at 20:01
