@ThomasK33 ThomasK33 commented Dec 22, 2025

Unifies modes (Plan/Exec/Compact) and subagents (explore/exec presets) into user-defined Agent Definitions.

Key points:

  • Agent definitions are discovered from ~/.mux/agents/*.md and <projectRoot>/.mux/agents/*.md (project overrides global overrides built-in).
  • YAML frontmatter is parsed + validated; markdown body becomes the agent prompt (layered with mux’s prelude).
  • New unified agentAiDefaults config (with back-compat for legacy modeAiDefaults / subagentAiDefaults).
  • System message composition now supports Agent: <agentId> scoped sections in AGENTS.md (and keeps Mode: scoping working).
  • The task tool now supports agentId (with subagent_type as a legacy alias).

Docs/tests:

  • New docs for Agent Definitions and Agent-scoped instruction sections.
  • Unit coverage for agent scoping + tool policy resolution.

CI/stability fixes included:

  • Make mux api --help runnable in the bundled ESM CLI output (define require via createRequire).
  • Make default worktree srcBaseDir respect MUX_ROOT during E2E runs (special-case ~/.mux / ~/.cmux tilde expansion).
  • E2E harness hardening: per-test MUX_ROOT isolation + sanity check + updated UI helpers for the Agent selector combobox.
  • Avoid git signing hangs in smoke tests and E2E demo repos (commit.gpgSign=false).
  • Stabilize OpenAI image integration tests and Storybook play tests (narrower queries / more reliable vision model).

📋 Implementation Plan

Plan: User-defined Agents (unify Modes + Subagents)

Goal

Unify modes (Plan/Exec/Compact) and subagents (explore/exec presets) into a single, extensible concept: Agent Definitions.

Users can define new “modes” and “subagents” via Markdown files discovered from:

  • ~/.mux/agents/*.md
  • <projectRoot>/.mux/agents/*.md

Each file’s Markdown body becomes the agent’s system prompt (layered with mux’s base prelude), and its YAML frontmatter is strongly typed and parsed/validated on discovery.

Additionally, consolidate model + thinking defaults so modes and subagents share one configuration pipeline (frontmatter defaults → config overrides → workspace overrides → inheritance).


Recommended approach (v1): “Agent Definitions” registry + unified AI defaults

Net LoC estimate (product code): ~1,200–1,700

1) Define the on-disk format + schema

1.1 File layout + precedence

  • Discover *.md files (non-recursive):
    1. Project-local: <projectRoot>/.mux/agents/*.md
    2. Global: ~/.mux/agents/*.md
    3. Built-ins (shipped in code)
  • Collision rule: project-local overrides global overrides built-in by agentId.
  • agentId is the filename without extension (e.g., plan.md → agentId="plan").
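
A minimal sketch of the collision rule above, assuming discovery has already produced a flat list of files tagged with their origin. The type and function names (`AgentSource`, `DiscoveredAgentFile`, `agentIdFromFileName`, `mergeByPrecedence`) are hypothetical and not taken from the PR; only the precedence order and the filename-to-agentId rule come from the plan.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
type AgentSource = "built-in" | "global" | "project";

interface DiscoveredAgentFile {
  source: AgentSource;
  fileName: string; // e.g. "plan.md"
}

// agentId is the filename with the ".md" extension stripped.
// Returns undefined for anything that is not a non-empty "<id>.md" name.
function agentIdFromFileName(fileName: string): string | undefined {
  return fileName.endsWith(".md") && fileName.length > 3
    ? fileName.slice(0, -3)
    : undefined;
}

// Lowest precedence first, so later sources overwrite earlier ones.
const PRECEDENCE: readonly AgentSource[] = ["built-in", "global", "project"];

function mergeByPrecedence(
  files: readonly DiscoveredAgentFile[],
): Map<string, DiscoveredAgentFile> {
  const merged = new Map<string, DiscoveredAgentFile>();
  for (const source of PRECEDENCE) {
    for (const file of files) {
      if (file.source !== source) continue;
      const agentId = agentIdFromFileName(file.fileName);
      if (agentId !== undefined) merged.set(agentId, file);
    }
  }
  return merged;
}
```

Processing sources from lowest to highest precedence keeps the override rule in one place: whichever source is written last for a given agentId wins.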

1.2 YAML frontmatter schema (strongly typed)

Create AgentDefinitionFrontmatterSchema (Zod) + TS types.

Proposed schema (v1):

---
name: Plan                # required, UI label
description: Create a plan before coding

# Visibility / availability
ui:
  selectable: true         # shows in main-agent UI selector
subagent:
  runnable: false          # allowed as task subagent

# AI defaults (baseline; can be overridden by config.json)
ai:
  modelString: openai:gpt-5.2
  thinkingLevel: medium

# Tool restrictions (simple + predictable)
# - base picks the baseline tool set (plan|exec|compact)
# - tools optionally *narrow* that set using exactly one strategy
policy:
  base: plan               # plan|exec|compact (default: exec)
  tools:
    deny: ["file_edit_insert", "file_edit_replace_string"]
    # OR:
    # only: ["web_fetch", "agent_skill_read"]
---

Notes:

  • agentId is derived from the filename (strip .md); there is no id field in frontmatter.
  • policy.base keeps compatibility with existing “mode-based” behavior (tool defaults + UX expectations).
  • policy.tools is optional. If present, it must specify exactly one of:
    • deny: allow the base tool set except these tools
    • only: allow only these tools from the base tool set (everything else is denied)
  • Tool names are validated against the tool registry; unknown names warn + ignore.
  • All policies are further restricted by runtime “hard-denies” (e.g., subagents can’t re-enable recursive task spawning).
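
To make the "exactly one of deny/only" and "unknown names warn + ignore" notes concrete, here is a dependency-free sketch. It is not the Zod schema from the PR; `validateToolFilter`, `ToolFilter`, and `ToolFilterResult` are hypothetical names, and treating a violated "exactly one" rule as a thrown error is an assumption about how an invalid file would be rejected.

```typescript
// Hypothetical shapes mirroring policy.tools in the proposed frontmatter.
type ToolFilter = { deny: string[] } | { only: string[] };

interface ToolFilterResult {
  filter?: ToolFilter; // absent when policy.tools is omitted
  warnings: string[];  // one entry per unknown tool name
}

function validateToolFilter(
  raw: { deny?: string[]; only?: string[] } | undefined,
  knownTools: ReadonlySet<string>,
): ToolFilterResult {
  const warnings: string[] = [];
  if (raw === undefined) return { warnings }; // policy.tools is optional

  const hasDeny = raw.deny !== undefined;
  const hasOnly = raw.only !== undefined;
  if (hasDeny === hasOnly) {
    // Both present or both missing: the definition is invalid.
    throw new Error("policy.tools must specify exactly one of deny or only");
  }

  // Unknown names produce a warning and are dropped rather than failing.
  const names = (raw.deny ?? raw.only ?? []).filter((name) => {
    if (knownTools.has(name)) return true;
    warnings.push(`unknown tool "${name}" ignored`);
    return false;
  });

  return { filter: hasDeny ? { deny: names } : { only: names }, warnings };
}
```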

1.3 Parsing rules

Mirror agent skills parsing patterns:

  • Enforce a max file size (same ballpark as skills).
  • Parse YAML frontmatter + markdown body.
  • Derive agentId from the filename (strip .md).
  • On invalid file: skip it and surface a non-fatal diagnostic (logs + optional UI warning).

2) Implement discovery + reading (Node)

Create an agentDefinitionsService analogous to agentSkillsService.

Deliverables:

  • discoverAgentDefinitions({ projectRoot }) → list of index entries (id/name/description/flags + ai defaults + policy metadata).
  • readAgentDefinition({ projectRoot, agentId }) → returns the validated frontmatter + markdown body.
  • Cache results by (projectRoot, mtime) to avoid re-reading on every message; add a cheap invalidation strategy:
    • recompute if any discovered file’s mtime changes OR if the directory listing changes.
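
One way the invalidation check could look, sketched as a pure comparison between two directory snapshots. `DirSnapshot` and `snapshotChanged` are hypothetical names; how the snapshot is collected (for example via `fs.stat`) is left out, and the millisecond mtime representation is an assumption.

```typescript
// Hypothetical snapshot: file path -> mtime in milliseconds.
type DirSnapshot = Readonly<Record<string, number>>;

// True when the listing changed (file added, removed, or renamed)
// or when any file's mtime differs from the previous snapshot.
function snapshotChanged(prev: DirSnapshot, next: DirSnapshot): boolean {
  const prevPaths = Object.keys(prev);
  const nextPaths = Object.keys(next);
  if (prevPaths.length !== nextPaths.length) return true;
  return nextPaths.some((path) => prev[path] !== next[path]);
}
```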

3) Unify “modeAiDefaults” + “subagentAiDefaults” into one config model

3.1 New config field

Add a single field to global config (backed by Zod + TS types):

  • agentAiDefaults: Record<string, { modelString?: string; thinkingLevel?: ThinkingLevel }>

3.2 Back-compat migration

On config load:

  • If agentAiDefaults is missing, synthesize it from existing:
    • modeAiDefaults[plan|exec|compact] → agentAiDefaults[plan|exec|compact]
    • subagentAiDefaults[type] → agentAiDefaults[type]
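
A rough sketch of the load-time synthesis, under stated assumptions. `LoadedConfig` and `resolveAgentAiDefaults` are hypothetical names, and `ThinkingLevel` is widened to `string` here for brevity. The plan does not say which legacy entry wins when a key such as exec appears in both legacy maps; this sketch arbitrarily lets the mode entry win.

```typescript
type ThinkingLevel = string; // assumption: the real type is a narrower union

interface AiDefaults {
  modelString?: string;
  thinkingLevel?: ThinkingLevel;
}

// Hypothetical slice of the global config relevant to the migration.
interface LoadedConfig {
  agentAiDefaults?: Record<string, AiDefaults>;
  modeAiDefaults?: Record<string, AiDefaults>;
  subagentAiDefaults?: Record<string, AiDefaults>;
}

function resolveAgentAiDefaults(config: LoadedConfig): Record<string, AiDefaults> {
  // The new field always takes priority when it exists.
  if (config.agentAiDefaults !== undefined) return config.agentAiDefaults;
  // Otherwise synthesize it from the legacy keys.
  // Assumption: on a key collision the mode entry overrides the subagent entry.
  return { ...config.subagentAiDefaults, ...config.modeAiDefaults };
}
```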

On config save:

  • Write agentAiDefaults.
  • Either:
    • keep writing legacy keys for one release (lowest risk), or
    • stop writing legacy keys but keep reading them (medium risk).

4) Apply Agent Definitions to system prompt construction

4.1 Main agent

Update buildSystemMessage to incorporate the selected agentId:

  • Always include mux prelude.
  • Inject the agent definition markdown body as a dedicated “Agent Prompt” block.
  • Keep AGENTS.md layering (global + project-local).

4.2 Extend instruction scoping in AGENTS.md

Add support for a new scoped heading:

  • Agent: <agentId>

Rules:

  • For backward compat, keep Mode: plan|exec|compact working.
  • When building system prompt:
    • apply Agent: <agentId> sections
    • apply Mode: <policy.base> sections
    • apply Model: sections (unchanged)
    • apply Tool: sections (unchanged)

5) Make tool policies agent-driven (with subagent safety)

5.1 Policy resolution algorithm

Implement a single policy resolver:

  1. Start from policy.base:
     • plan → existing plan tool policy
     • exec → existing exec tool policy
     • compact → existing compact policy
  2. Optionally apply policy.tools (exactly one):
     • deny: remove tools from the base set
     • only: keep only these tools from the base set
  3. Apply runtime “hard-denies”:
     • If running as a subagent, forcibly deny:
       • task (no recursion)
       • propose_plan (main-agent only)
       • any other tools we consider unsafe for child agents (explicit list)
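
The three resolution steps could be sketched roughly as follows. The contents of `BASE_TOOLS` are placeholders built from tool names mentioned elsewhere in this plan, not the real plan/exec/compact tool sets, and `resolveTools` is a hypothetical name. The hard-deny list contains only the two tools named above; the actual explicit list may be longer.

```typescript
type PolicyBase = "plan" | "exec" | "compact";
type ToolFilter = { deny: string[] } | { only: string[] };

// Placeholder base sets for illustration only.
const BASE_TOOLS: Record<PolicyBase, readonly string[]> = {
  plan: ["web_fetch", "agent_skill_read", "propose_plan", "task"],
  exec: ["web_fetch", "agent_skill_read", "file_edit_insert", "file_edit_replace_string", "task"],
  compact: [],
};

// Tools a subagent can never use, regardless of its definition.
const SUBAGENT_HARD_DENIES: ReadonlySet<string> = new Set(["task", "propose_plan"]);

function resolveTools(
  policy: { base?: PolicyBase; tools?: ToolFilter },
  isSubagent: boolean,
): string[] {
  // 1. Start from the base set (default: exec).
  let tools = BASE_TOOLS[policy.base ?? "exec"].slice();

  // 2. Optionally narrow it; a filter can only remove tools, never add them.
  const filter = policy.tools;
  if (filter !== undefined) {
    if ("deny" in filter) {
      const denied = new Set(filter.deny);
      tools = tools.filter((tool) => !denied.has(tool));
    } else {
      const allowed = new Set(filter.only);
      tools = tools.filter((tool) => allowed.has(tool));
    }
  }

  // 3. Hard-denies are applied last so nothing upstream can re-enable them.
  if (isSubagent) {
    tools = tools.filter((tool) => !SUBAGENT_HARD_DENIES.has(tool));
  }
  return tools;
}
```

Because both filter strategies intersect with the base set, an only list naming a tool outside the base set has no effect, which keeps the result predictable from policy.base alone.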

5.2 Enforce in both frontend + backend

  • Frontend: use resolved policy to hide/disable UI affordances.
  • Backend: treat frontend state as advisory; enforce server-side before tool execution.

6) Unify AI defaults + overrides resolution for agents

Target resolution order (single algorithm used everywhere):

Explicit call args (e.g. /compact -m)                 [highest]
→ Workspace override for agentId (if supported)
→ Global config override: config.agentAiDefaults[agentId]
→ AgentDefinition frontmatter defaults (ai.*)
→ Inherit from parent context (subagents only)
→ Sticky last-used workspace values (main agent only)
→ System fallback model/thinking
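
As a sketch of the order above: walk the layers from highest to lowest priority and take the first defined value. `ResolutionInputs` and `resolveAiSettings` are hypothetical names, and resolving modelString and thinkingLevel independently of each other is an assumption that the plan does not state.

```typescript
interface AiSettings {
  modelString?: string;
  thinkingLevel?: string;
}

// Hypothetical bundle of every layer that can contribute a value.
interface ResolutionInputs {
  explicitArgs?: AiSettings;
  workspaceOverride?: AiSettings;
  configOverride?: AiSettings;      // config.agentAiDefaults[agentId]
  frontmatterDefaults?: AiSettings; // ai.* from the agent definition
  parentContext?: AiSettings;       // consulted for subagents only
  stickyLastUsed?: AiSettings;      // consulted for the main agent only
  systemFallback: Required<AiSettings>;
}

function resolveAiSettings(
  inputs: ResolutionInputs,
  isSubagent: boolean,
): Required<AiSettings> {
  // Highest priority first.
  const layers: Array<AiSettings | undefined> = [
    inputs.explicitArgs,
    inputs.workspaceOverride,
    inputs.configOverride,
    inputs.frontmatterDefaults,
    isSubagent ? inputs.parentContext : undefined,
    isSubagent ? undefined : inputs.stickyLastUsed,
  ];

  // Assumption: each field is resolved on its own, first defined value wins.
  const pick = (key: keyof AiSettings): string => {
    for (const layer of layers) {
      const value = layer?.[key];
      if (value !== undefined) return value;
    }
    return inputs.systemFallback[key];
  };

  return { modelString: pick("modelString"), thinkingLevel: pick("thinkingLevel") };
}
```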

6.1 Workspace overrides

Generalize the existing “per-mode workspace override” to “per-agent workspace override”:

  • Workspace metadata: aiSettingsByAgentId: Record<string, AiSettings>
  • Local cache key: workspaceAiSettingsByAgentId:${workspaceId}

When user changes model/thinking while agentId=X, persist overrides under that agentId.

6.2 Replace WorkspaceModeAISync with WorkspaceAgentAISync

  • Drive “effective model/thinking” from (workspaceId, agentId).
  • Continue writing to legacy “active model/thinking” keys as a compatibility bridge until consumers are migrated.

7) UI changes

7.1 Replace ModeSelector with AgentSelector

  • New UI selector lists AgentDefinition entries where ui.selectable: true.
  • Persist the chosen agentId (global or per-project).
  • Keep an ergonomic keybind:
    • TOGGLE_MODE becomes “toggle between last two selected UI agents”.

7.2 Settings: single “Agents” defaults editor

Replace the split “Mode defaults” and “Subagent defaults” views with one:

  • Lists discovered agents, grouped:
    • UI-selectable agents
    • subagent-runnable agents
    • hidden/internal agents (optional)
  • For each agent, show a read-only “Policy” summary (to keep tool permissions understandable):
    • base policy (policy.base)
    • tool filter (deny or only)
    • effective tools preview (computed list; show both main-agent and subagent variants after hard-denies)
  • For each agent, allow configuring:
    • modelString (inherit / override)
    • thinkingLevel (inherit / override)

8) Subagents: switch from presets to agentId

8.1 Tool + API shape

Evolve the task tool schema:

  • New: task({ agentId: string, prompt: string, ... })
  • Keep accepting subagent_type as an alias for 1–2 releases (mapped to agentId).
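
The alias handling might look like the sketch below. `TaskArgs` and `normalizeTaskArgs` are hypothetical names, only the two identifier fields and prompt are modeled, and preferring agentId when both fields are supplied is an assumption.

```typescript
// Hypothetical view of the task tool arguments during the transition period.
interface TaskArgs {
  agentId?: string;
  subagent_type?: string; // legacy alias
  prompt: string;
}

function normalizeTaskArgs(args: TaskArgs): { agentId: string; prompt: string } {
  // Assumption: the new field wins if a caller sends both.
  const agentId = args.agentId ?? args.subagent_type;
  if (agentId === undefined || agentId === "") {
    throw new Error("task requires agentId (or legacy subagent_type)");
  }
  return { agentId, prompt: args.prompt };
}
```

Normalizing once at the tool boundary means the backend checks in 8.2 only ever see agentId.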

8.2 Backend behavior

  • Validate requested agentId exists and subagent.runnable: true.
  • Build subagent system prompt from that agent definition.
  • Apply tool policy resolver with “subagent hard-denies”.
  • Apply unified AI defaults resolution (with parent inheritance when agent/config/frontmatter doesn’t specify).

9) Telemetry + timing

Current telemetry expects a small fixed mode union in some places.

  • Add agentId: string to relevant telemetry events.
  • Keep mode as the derived policy.base for backward compat and dashboards.
  • Update sessionTimingService schema so custom agentIds don’t crash timing aggregation.

10) Tests / validation

Unit tests (fast):

  • AgentDefinition parsing:
    • valid frontmatter + body
    • invalid YAML / missing fields
    • agentId derived from filename (strip .md)
  • Discovery precedence:
    • project overrides global overrides built-in
  • Tool policy merge:
    • base policy + deny list
    • base policy + only list
    • subagent hard-deny always wins
  • AI defaults resolution:
    • config override beats frontmatter
    • workspace override beats config

Integration tests (targeted):

  • System message includes the agent definition body + Agent: scoped AGENTS.md sections.
  • Task creation uses agentId and enforces hard-denies.

11) Documentation

Add/extend user docs so this feature is discoverable and predictable:

  • New docs page (e.g. docs/agents.mdx):
    • What an “Agent Definition” is (unifies modes + subagents)
    • Discovery paths + precedence (<project>/.mux/agents/*.md overrides ~/.mux/agents/*.md)
    • File format (frontmatter schema + markdown body semantics)
    • Examples for:
      • a UI-selectable agent (Plan-like)
      • a subagent-runnable definition (Explore-like)
    • Tool policy semantics:
      • policy.base
      • tools.deny vs tools.only
      • subagent hard-denies (cannot be re-enabled)
    • AI defaults resolution order (frontmatter defaults vs config overrides vs workspace overrides vs inheritance)
  • Update docs/instruction-files.mdx:
    • Document new Agent: <agentId> scoped sections
    • Clarify interaction with existing Mode: scoping (derived from policy.base)
  • Add docs navigation entry (docs.json) for the new page.

Alternatives (not recommended for v1)

Option B: Only add custom subagents (keep modes fixed)

Net LoC (product): ~400–700

  • Keep Plan/Exec modes as-is.
  • Add ~/.mux/agents/*.md only for subagent presets.
  • Continue using modeAiDefaults for modes; unify only subagent side.

Pros: much smaller surface area.
Cons: does not deliver “custom modes” and does not unify defaults/UI.

Option C: Agent Definitions replace mode-based tool policy entirely

Net LoC (product): ~2,000–3,000

  • Remove AgentMode/UIMode assumptions.
  • Tool policy becomes fully data-driven (no plan/exec base).

Pros: cleanest long-term architecture.
Cons: high risk; lots of knock-on refactors.


Generated with mux • Model: openai:gpt-5.2 • Thinking: xhigh

@ThomasK33 ThomasK33 force-pushed the modes-config-20hn branch 5 times, most recently from f917d27 to 6d948e1 Compare December 23, 2025 21:44
@ThomasK33 ThomasK33 changed the title from "🤖 feat: mode-scoped AI defaults + per-mode overrides" to "🤖 feat: user-defined agents (unify modes + subagents)" Dec 24, 2025
@ThomasK33 ThomasK33 force-pushed the modes-config-20hn branch 6 times, most recently from 172aa21 to be5df72 Compare December 29, 2025 16:50
@ThomasK33

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@ThomasK33

@codex review

Addressed the "Apply per-agent defaults when syncing model/thinking" thread.

  • Added a cached agentAiDefaults localStorage key (populated from config + kept in sync from Settings)
  • WorkspaceModeAISync now prefers per-agent workspace overrides, then per-agent defaults (config + agent frontmatter), then legacy per-mode defaults
  • Model/thinking updates now also write workspace AI settings under the current agentId so overrides stick when switching agents


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


Change-Id: I19d4bc5c5dd1e5b2a38a4a3e6021bf0b8543b839
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I1a5ba1d32ff0a15abae85af904d89074e36be101
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I8d55fb7ca4c3173706e390846b77416f7540af59
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ic00fe3e1cd68818771ac324787461a0427fcfb05
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I9daeab5067c65855a32f44c9626b8f855072fe9d
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ib691f6831627e0e03ecfb26339a6bd9b4a4c310c
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I73e70a106d775fe476864725b484c74a210e0775
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I5b50d8a66415580553f621b90b3b7504d779b59a
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I48698b7b296f11f250b24a9d0889b41df23362d5
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ibf3179de947d6c5cce6b9fdd8c81d55638c2c235
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I33242062387e70d5ada677c9e774bced561db1c6
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I97371ec54fbf9048540fd079ae097c1846e8133a
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I33815ac9fd94d7df809fcd39a35279947b5967bd
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Id8bdbdeb79baf7f56334f17abf6124b02945259d
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: If9e57b0a146c41c59e0028480ae41ff0632e0216
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I673ded0bd5f6a26b4adabb32171f661e54053ec6
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I41ab4e479cd02d0b9f0e3b2a7f7c73a5c1b604a7
Signed-off-by: Thomas Kosiewski <tk@coder.com>
- Update Cmd/Ctrl+Shift+M to cycle Exec → Plan → Other (pinned agent)
- Auto-open AgentModePicker when cycling to Other; Esc confirms pinned agent
- Add tooltips for tools-only/tools-deny counts in Agent Defaults
- Update related UI copy/command palette labels; add unit coverage

Signed-off-by: Thomas Kosiewski <tk@coder.com>

---
_Generated with [`mux`](https://github.com/coder/mux) • Model: `openai:gpt-5.2` • Thinking: `xhigh`_

Change-Id: I24be4a31baa63a7650c163a5237a155657c56187
Change-Id: I73d9ffb02aaf0fb8a80255e8b1c485816d8f0f78
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: If06fdbd712b6d31987fe74707ac4f986518908e3
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I769c8d442e8f5fd28748e0566e729b6b29520ddc
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I8ac5ee71690fd4641dae88635362aebbf1251247
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I9966a03a794ae29b78bd072654b367186a66bd4d
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I65df30bf8b26f448508c1bb6dbd5fac34975023c
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I9f7de0bbb54deb35ac9149b54e60aa19a571656d
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I913ca93b136a3c5e93a3f4638c6714f265ce605c
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I8e9ae0acd43d880f9bf1c4ea515540eabbb75dad
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ib52891f117b95d4598dfb30b1670f6aba36d1cde
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I80b022cf9af068db66c6815891cd6697674d047b
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Id4351ae3a0554ffb427cc24773c4a5639050ecf8
Signed-off-by: Thomas Kosiewski <tk@coder.com>
@ThomasK33

@codex review

Addressed your note about custom agents clobbering base mode overrides:

  • When updating model/thinking with a non-plan/exec agent active, we now write only the per-agent entry in the per-workspace cache.
  • Backend updateModeAISettings is now only invoked when the active agent matches the base mode.

This prevents a reviewer agent (base=exec) from overwriting exec-mode overrides.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


Change-Id: Iaac0d86b08569520ed27bed10443d6d25a611946
Signed-off-by: Thomas Kosiewski <tk@coder.com>
@ThomasK33

@codex review

Addressed the backend persistence clobber:

  • WorkspaceService.maybePersistAISettingsFromOptions now skips persisting plan/exec AI settings when options.agentId is present and doesn’t match the base options.mode.
  • Added unit coverage in workspaceService.test.ts.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.


@ThomasK33 ThomasK33 added this pull request to the merge queue Dec 29, 2025
@ThomasK33 ThomasK33 removed this pull request from the merge queue due to a manual request Dec 29, 2025
Change-Id: I4c55efdb1253de127b6d17c79814e4ee72e5d154
Signed-off-by: Thomas Kosiewski <tk@coder.com>
@ThomasK33 ThomasK33 enabled auto-merge December 29, 2025 18:37
@ThomasK33 ThomasK33 added this pull request to the merge queue Dec 29, 2025
Merged via the queue into main with commit e0507d7 Dec 29, 2025
20 checks passed
@ThomasK33 ThomasK33 deleted the modes-config-20hn branch December 29, 2025 18:50