Commit 6caba71

Fix/mcp tool discovery flow (#95)
* fix(agent): detect bold suggested slash commands

* Improve MCP tool discovery flow

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

* Add standalone MCP setup CLI commands

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

* Address MCP PR review comments

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

---------

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
1 parent 391eec9 commit 6caba71

15 files changed

Lines changed: 2220 additions & 71 deletions

docs/MCP.md

Lines changed: 48 additions & 26 deletions
Original file line number | Diff line number | Diff line change
@@ -29,7 +29,7 @@ Add servers to `~/.hyperagent/config.json` (same format as VS Code's `mcp.json`)
2929
Or use the setup script:
3030

3131
```bash
32-
just mcp-setup-everything # sets up the MCP everything test server
32+
hyperagent --mcp-setup-everything # sets up the MCP everything test server
3333
```
3434

3535
### 2. Start HyperAgent
@@ -127,6 +127,27 @@ are shown during approval.
127127

128128
## Commands
129129

130+
### Standalone setup commands
131+
132+
These command-line options run one setup/show action and then exit. They do
133+
not start an agent session, and they do not require the repository Justfile.
134+
135+
| Command | Action |
136+
|---------|--------|
137+
| `hyperagent --mcp-setup-everything` | Configure the MCP everything test server. Requires npm/npx; first use downloads `@modelcontextprotocol/server-everything`. |
138+
| `hyperagent --mcp-setup-github` | Configure the GitHub MCP server. Requires npm/npx and `GITHUB_TOKEN`; the command will remind you to use `gh auth token` if needed. |
139+
| `hyperagent --mcp-setup-filesystem [dir]` | Configure the filesystem MCP server rooted at `dir` (default `/tmp/mcp-fs`). Requires npm/npx; first use downloads `@modelcontextprotocol/server-filesystem`. |
140+
| `hyperagent --mcp-show-config` | Print configured MCP servers from `~/.hyperagent/config.json`. |
141+
| `hyperagent --mcp-setup-workiq` | Configure Microsoft Work IQ stdio MCP. Pre-fetches `@microsoft/workiq@latest`, runs its interactive EULA command, then writes config. |
142+
| `hyperagent --mcp-add-http <name> <url> [clientId] [tenantId] [scopes] [flow]` | Add a generic HTTP MCP server, optionally with OAuth. |
143+
| `hyperagent --mcp-m365-create-app [args...]` | Create/reuse an Entra app registration for Agent 365 HTTP MCP servers. Requires Azure CLI and `az login`. |
144+
| `hyperagent --mcp-setup-m365 [args...]` | Configure Agent 365 per-service HTTP MCP servers and pre-approve them. |
145+
| `hyperagent --mcp-m365-refresh-servers [args...]` | Refresh the user M365 server catalog using a cached or supplied bearer token. |
146+
| `hyperagent --mcp-m365-show` | Show saved M365 app registration details. |
147+
148+
The Justfile recipes with matching names are development conveniences for this
149+
repository; the `hyperagent` options above are the user-facing path.
150+
130151
### Slash commands
131152

132153
| Command | Action |
@@ -216,7 +237,7 @@ gate uses the MCP spec's `ToolAnnotations` (hints from the server):
216237

217238
The gate runs on the **host side** while the guest VM is paused — the
218239
LLM's handler code sees either a normal result or
219-
`{ error: "Operation denied..." }`. The LLM doesn't need to know about
240+
`{ ok: false, error: "Operation denied..." }`. The LLM doesn't need to know about
220241
the gate; it writes code normally.
221242

222243
Example prompt shown to the user:
@@ -247,7 +268,7 @@ MCP tools with native PPTX generation in a single workflow.
247268
export GITHUB_TOKEN=$(gh auth token)
248269

249270
# Configure the GitHub MCP server
250-
just mcp-setup-github
271+
hyperagent --mcp-setup-github
251272
```
252273

253274
This creates `~/.hyperagent/config.json` with the GitHub server configured,
@@ -341,7 +362,7 @@ registration.
341362
### One-shot setup
342363

343364
```bash
344-
just mcp-setup-workiq
365+
hyperagent --mcp-setup-workiq
345366
```
346367

347368
This writes the following entry to `~/.hyperagent/config.json`:
@@ -417,11 +438,12 @@ Instead of the single stdio `workiq` server you can connect to the
417438
per-service Agent 365 HTTP endpoints directly. This gives you finer
418439
`/mcp enable` control per M365 service and uses MSAL for OAuth.
419440

420-
The setup script uses the VS Code MCP extension's pre-registered client ID
421-
(`aebc6443-...`) which has `McpServers.*` scopes admin-consented in all
422-
M365 Copilot tenants — no per-tenant app registration needed.
441+
Use an Entra public-client app registration for OAuth. You can create or reuse
442+
one with `hyperagent --mcp-m365-create-app`, then configure the per-service MCP
443+
entries from the saved app details.
423444

424-
21 servers are available (see the full list with `just mcp-setup-m365 list`).
445+
The bundled catalog includes the available Agent 365 servers (see the full list
446+
with `hyperagent --mcp-setup-m365 list`).
425447
Common ones:
426448

427449
| Config entry | Service |
@@ -437,25 +459,27 @@ Common ones:
437459
#### Setup
438460

439461
```bash
462+
# One-time: create or reuse an Entra public-client app registration
463+
hyperagent --mcp-m365-create-app
464+
440465
# Configure all M365 servers with browser auth (one-time)
441-
just mcp-setup-m365 all \
442-
aebc6443-996d-45c2-90f0-388ff96faa56 \
443-
<your-tenant-id> \
444-
"" browser
466+
hyperagent --mcp-setup-m365 all \
467+
<your-client-id> <your-tenant-id> "" browser
445468

446469
# Or a subset
447-
just mcp-setup-m365 "mail,teams,planner" \
448-
aebc6443-996d-45c2-90f0-388ff96faa56 \
449-
<your-tenant-id> \
450-
"" browser
470+
hyperagent --mcp-setup-m365 "mail,teams,planner" \
471+
<your-client-id> <your-tenant-id> "" browser
451472

452473
# List available services
453-
just mcp-setup-m365 list
474+
hyperagent --mcp-setup-m365 list
454475
```
455476

456477
This writes config entries AND pre-approves all configured servers so the
457478
LLM can connect them without interactive prompts.
458479

480+
If you just ran `hyperagent --mcp-m365-create-app`, you can pass empty strings
481+
for the client ID and tenant ID to use the saved app details.
482+
459483
#### Auth flows
460484

461485
The `FLOW` argument (last positional) is **required**:
@@ -478,25 +502,23 @@ works with cached tokens.
478502

479503
#### Custom Entra app registration
480504

481-
If your tenant blocks the VS Code client ID, create your own app:
505+
If you already have a tenant-owned public-client app registration, pass it
506+
explicitly instead of using the saved app state:
482507

483508
```bash
484-
just mcp-m365-create-app
485-
# Then use your app's client ID:
486-
just mcp-setup-m365 all <your-client-id> <your-tenant-id> "" browser
509+
hyperagent --mcp-setup-m365 all <your-client-id> <your-tenant-id> "" browser
487510
```
488511

489512
#### Scope
490513

491-
All servers use `ea9ffc3e-8a23-4a7d-836d-234d7c7565c1/.default` (the Agent 365
492-
resource app ID with `.default`), which requests all pre-consented scopes in
493-
one shot. This matches what [a365cli](https://github.com/sozercan/a365cli) uses.
514+
All servers use the Agent 365 resource `.default` scope, which requests the
515+
pre-consented Agent 365 MCP scopes in one shot.
494516

495517
#### Refreshing the server catalog
496518

497519
```bash
498-
just mcp-m365-refresh-servers # uses cached OAuth token
499-
just mcp-m365-refresh-servers --token <bearer> # explicit token
520+
hyperagent --mcp-m365-refresh-servers # uses cached OAuth token
521+
hyperagent --mcp-m365-refresh-servers --token <bearer> # explicit token
500522
```
501523

502524
## HTTP Transport & OAuth

scripts/build-binary.js

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -364,6 +364,14 @@ if (existsSync(skillsSrc)) {
364364
copyDirRecursive(skillsSrc, skillsDst);
365365
}
366366

367+
// Copy MCP setup data needed by standalone CLI setup commands.
368+
const scriptsDst = join(LIB_DIR, "scripts");
369+
mkdirSync(scriptsDst, { recursive: true });
370+
const m365CatalogSrc = join(ROOT, "scripts", "m365-mcp-servers.json");
371+
if (existsSync(m365CatalogSrc)) {
372+
copyFileSync(m365CatalogSrc, join(scriptsDst, "m365-mcp-servers.json"));
373+
}
374+
367375
// Copy @github/copilot CLI (needed by copilot-sdk at runtime)
368376
// The SDK uses import.meta.resolve("@github/copilot/sdk") to find the CLI
369377
console.log("📦 Copying Copilot CLI runtime...");

skills/mcp-services/SKILL.md

Lines changed: 86 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -19,12 +19,14 @@ triggers:
1919
antiPatterns:
2020
- Don't try to manage_plugin("mcp:<name>") — MCP servers are NOT regular plugins
2121
- Don't import from "host:mcp-gateway" — that's the gateway sentinel, not a server
22-
- Don't guess tool names — always call mcp_server_info() first
22+
- Don't guess tool names or parameters — always call mcp_tool_info() first
2323
- Don't hardcode MCP tool schemas — they change when servers update
24+
- Don't call MCP server tools directly from LLM tools — execute them only inside generated handler code
2425
allowed-tools:
2526
- register_handler
2627
- list_mcp_servers
2728
- mcp_server_info
29+
- mcp_tool_info
2830
- manage_mcp
2931
- execute_javascript
3032
- delete_handler
@@ -52,6 +54,28 @@ allowed-tools:
5254
MCP (Model Context Protocol) servers provide external tool capabilities — M365
5355
services, GitHub, databases, custom APIs. Follow this exact workflow:
5456

57+
## Default Behaviour: Handler-Only MCP Execution
58+
59+
For normal user questions against external services — read, list, search, lookup,
60+
summarise recent items — use focused discovery, then execute MCP calls inside a
61+
registered handler:
62+
63+
```
64+
list_mcp_servers()
65+
manage_mcp({ action: "connect", name: "<server>" })
66+
mcp_tool_info({ name: "<server>", query: "<what you need>" })
67+
apply_profile({ profiles: "mcp-network" }) // external MCP calls need wall-clock time
68+
register_handler(...) // import from host:mcp-<server>, await the selected tool
69+
execute_javascript(...)
70+
```
71+
72+
Do **not** call MCP server tools directly from LLM tools. The handler is the
73+
auditable execution boundary for MCP calls. Avoid `file-builder` and
74+
`fs-write`/`fs-read` unless the user asked for an artifact or the task truly
75+
needs large intermediate output. If a result is too large, first retry with
76+
narrower handler arguments: `limit`, `top`, `$top`, `$select`, `$filter`, date
77+
ranges, search query, or a more specific tool.
78+
5579
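The narrowing advice above can be sketched as a pair of small helpers that a handler might use before falling back to file plugins. This is only an illustration under stated assumptions: `narrowArgs`, `exceedsBudget`, the starting page size of 25, and the 20,000-character budget are hypothetical and not part of HyperAgent, and the real parameter name (`limit`, `top`, `$top`, and so on) depends on the schema returned by `mcp_tool_info`.

```javascript
// Hypothetical helper: report whether a value serializes to more than
// `maxChars` characters. The 20,000 default is an assumed budget.
function exceedsBudget(value, maxChars = 20000) {
  return JSON.stringify(value ?? null).length > maxChars;
}

// Hypothetical helper: return a copy of `args` with the `top` argument halved,
// or null when it cannot be narrowed any further.
// `top` stands in for whichever paging parameter the tool actually accepts.
function narrowArgs(args, minTop = 1) {
  const top = Number.isInteger(args.top) ? args.top : 25; // 25 is an assumed starting size
  if (top <= minTop) return null;
  return { ...args, top: Math.max(minTop, Math.floor(top / 2)) };
}
```

A handler could call the tool, test the result with `exceedsBudget`, and retry with `narrowArgs(args)` until the result fits or `narrowArgs` returns `null`.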
### Step 1: Discover configured servers
5680

5781
```
@@ -71,36 +95,86 @@ manage_mcp({ action: "connect", name: "work-iq-mail" })
7195
- If not approved → prompts the user for approval (shows tools + security info)
7296
- Returns `{ success: true, tools: [...], module: "host:mcp-<name>" }`
7397

74-
### Step 3: Get tool schemas
98+
### Step 3: Get focused tool schemas
99+
100+
```
101+
mcp_tool_info({ name: "work-iq-mail", query: "search recent messages" })
102+
```
103+
104+
Returns JSON Schema for the relevant tools plus TypeScript declarations. Read
105+
this BEFORE writing handler code — tool names and parameter shapes vary per
106+
server.
107+
108+
If you already know the tool names, request only those tools:
109+
110+
```
111+
mcp_tool_info({ name: "work-iq-mail", tools: ["SearchEmails", "GetEmail"] })
112+
```
113+
114+
Use `mcp_server_info({ name: "work-iq-mail", query: "..." })` only when you
115+
need server-level details as well. Avoid dumping every schema unless the user
116+
explicitly asks to inspect the whole server.
117+
118+
### Step 4: Apply the MCP network profile
75119

76120
```
77-
mcp_server_info("work-iq-mail")
121+
apply_profile({ profiles: "mcp-network" })
78122
```
79123

80-
Returns full JSON Schema for every tool plus TypeScript declarations. Read this
81-
BEFORE writing handler code — tool names and parameter shapes vary per server.
124+
MCP handlers wait on external service calls, so the default 5s wall-clock limit
125+
is often too small even when CPU usage is low. Use `mcp-network` before
126+
executing MCP handlers. It raises wall time without enabling file plugins.
82127

83-
### Step 4: Use the tools in handler code
128+
### Step 5: Register handler code that calls MCP tools
129+
130+
For reads, searches, and lookups, generate handler code that imports from the
131+
server module and awaits the selected MCP tool:
84132

85133
```javascript
86134
import { SearchEmails } from "host:mcp-work-iq-mail";
87135

88136
export default async function handler(event) {
89-
const result = await SearchEmails({ query: "from:boss subject:urgent" });
90-
return { content: [{ type: "text", text: JSON.stringify(result) }] };
137+
const result = await SearchEmails({
138+
query: "from:boss subject:urgent",
139+
top: 5,
140+
});
141+
if (!result.ok) return result;
142+
return { content: [{ type: "text", text: JSON.stringify(result.data) }] };
91143
}
92144
```
93145

146+
MCP calls return a stable envelope inside handler code:
147+
148+
```javascript
149+
{
150+
ok: true,
151+
data: { /* parsed primary result */ },
152+
text: "...", // original text content when available
153+
raw: [/* MCP content */],
154+
meta: [/* secondary content such as correlation IDs */]
155+
}
156+
```
157+
158+
On failure they return `{ ok: false, error: "..." }`. Always check `ok` and
159+
`error` before using `data`.
160+
161+
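As a minimal sketch of the "check `ok` first" rule, the helper below converts an envelope into a handler return value. The function name `toHandlerResponse` and the fallback error message are hypothetical; only the envelope fields (`ok`, `data`, `text`, `error`) come from the description above.

```javascript
// Hypothetical helper: turn an MCP result envelope into a handler response.
function toHandlerResponse(envelope) {
  // Treat anything that is not an explicit success as a failure.
  if (!envelope || envelope.ok !== true) {
    const error =
      (envelope && envelope.error) || "MCP call failed without an error message";
    return { ok: false, error };
  }
  // Prefer the parsed primary result; fall back to the original text content.
  const payload =
    envelope.data !== undefined
      ? JSON.stringify(envelope.data)
      : String(envelope.text ?? "");
  return { content: [{ type: "text", text: payload }] };
}
```

Inside a handler this would wrap the awaited tool call, for example `return toHandlerResponse(await SearchEmails({ ... }))`.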
### Step 6: Execute the handler and iterate narrowly
162+
163+
Run the handler with `execute_javascript`. If output is too large, edit the
164+
handler to narrow the MCP request before enabling file plugins.
165+
94166
Key rules:
95167

96168
- Import from `host:mcp-<server-name>` (the name from list_mcp_servers)
97-
- Tool function names are EXACTLY as returned by mcp_server_info
169+
- Apply `mcp-network` before running MCP handlers; network I/O hits wall-clock limits
170+
- Tool function names are EXACTLY as returned by mcp_tool_info
98171
- All MCP tool calls are async — use `await`
99-
- Tools return `{ content: [{type, text}] }` — parse the text field as needed
100-
- Some servers return embedded JSON (status text + JSON) — extract the JSON part
172+
- Tools return `{ ok, data, text, raw, error }` — check `ok`/`error` first
173+
- `data` is the parsed primary result; use `raw` only when debugging envelopes
174+
- If output is large, narrow the MCP request in handler code before trying file plugins
101175
- **Write operations** (tools not marked `readOnlyHint: true`) may prompt the
102176
user for approval before executing. If denied, the tool returns
103-
`{ error: "Operation denied..." }` — handle this gracefully and explain
177+
`{ ok: false, error: "Operation denied..." }` — handle this gracefully and explain
104178
to the user what happened. Do NOT retry denied operations.
105179

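The denial rule above could be handled along these lines. This is a sketch, not HyperAgent code: `isDenied` and `explainFailure` are hypothetical names, and the check assumes the denial message begins with `Operation denied`, which is all the text above shows of it.

```javascript
// Hypothetical helper: true when an envelope reports that the user denied
// the operation. Assumes the error text starts with "Operation denied".
function isDenied(envelope) {
  return (
    Boolean(envelope) &&
    envelope.ok === false &&
    String(envelope.error ?? "").startsWith("Operation denied")
  );
}

// Hypothetical helper: build a user-facing explanation instead of retrying.
function explainFailure(envelope, action) {
  if (isDenied(envelope)) {
    return `The ${action} was not performed because approval was denied. It will not be retried.`;
  }
  return `The ${action} failed: ${String(envelope?.error ?? "unknown error")}`;
}
```

Keeping the denial check in one place makes it harder for retry logic elsewhere in a handler to re-issue an operation the user has already refused.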
106180
### Server name patterns
