feat: Add requireConfirmation elicitation helper for write gating#75
Open
egorpavlikhin wants to merge 7 commits intomainfrom
Open
feat: Add requireConfirmation elicitation helper for write gating#75egorpavlikhin wants to merge 7 commits intomainfrom
egorpavlikhin wants to merge 7 commits intomainfrom
Conversation
Task data now flows through three space-scoped resource templates instead
of three dedicated tools. Bulky activity logs and step timings only travel
when the agent explicitly fetches the URI, so per-call response weight
drops dramatically on the common deployment-investigation path.
URIs (registered into the existing release/dispatch registry):
- octopus://spaces/{spaceName}/tasks/{taskId} -> ServerTask metadata (JSON)
- octopus://spaces/{spaceName}/tasks/{taskId}/details -> ServerTaskDetails (JSON)
- octopus://spaces/{spaceName}/tasks/{taskId}/log -> raw activity log (text/plain)
BREAKING CHANGE: removes get_task_by_id, get_task_details, and get_task_raw.
Callers should switch to resources/read on the URI above (or the existing
read_resource tool backstop on clients without native resource support).
get_task_from_url is unchanged; its consolidation is tracked separately
under the from_url issue.
Discoverability:
- Adds a top-level instructions string on the McpServer so dynamic-discovery
clients learn the resourceUri pattern and the read_resource backstop at
initialize time, before any tool list is fetched.
- Tightens read_resource's tool description with concrete URI examples and
cross-references to the tools that emit URIs.
References on existing tools updated to point at resource URIs:
- get_deployment_from_url's nextSteps now returns taskResourceUri /
taskLogResourceUri instead of suggesting the deleted get_task_details.
- deploy_release's post-deploy helpText points at the task resource URI.
- get_task_from_url's error and description references updated.
- README and docs/working-with-urls.md workflows rewritten to dereference
URIs via resources/read or read_resource.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /log resource returns the entire activity log as plain text, which is fine for short tasks but expensive for long-running deployments. When the agent already knows what it is looking for (an error string, a step name, a regex), grep_task_log returns only matching lines instead. Parameter shape mirrors GNU grep so the schema is self-explanatory to any LLM that knows grep: pattern (regex by default; fixedString:true for literal text) caseInsensitive (-i) invertMatch (-v) fixedString (-F) beforeContext (-B) afterContext (-A) maxCount (-m) The response includes totalMatches across the whole log (so the caller sees the true count even when truncated), 1-indexed line numbers, optional before/after context arrays, and the fullLogResourceUri for fall-through to the resource when grep was too narrow. Implementation reuses SpaceServerTaskRepository.getRaw — no new HTTP code. Pure-function grepLines() exported for unit tests; 12 tests cover each flag, context-window edge cases, regex error handling, and totalMatches accounting under maxCount. Server-level instructions and README updated so dynamic-discovery clients discover the tool alongside the /log resource. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…al path
The octopus://spaces/{spaceName}/tasks/{taskId}/log resource is deliberately
removed. Reasoning:
- Activity logs can be multi-megabyte. Exposing them as an addressable
resource invites agents to fetch the entire body when grep_task_log
returns only the matching lines (and a totalMatches count) at a tiny
fraction of the context cost.
- The "one canonical way to fetch any given shape" rule from the proposed
architecture: a single primitive per use case is easier to discover
correctly than two paths with subtly different cost profiles.
- The /tasks/{id}/details resource already covers structured access (the
ActivityLogs[] tree with categories, timings, and embedded entries) for
callers that need programmatic traversal rather than text search.
Cross-references swept:
- get_deployment_from_url's nextSteps payload now returns a grepTaskLogHint
object (pre-filled tool name, spaceName, taskId) instead of a
taskLogResourceUri field. Integration test updated.
- getTaskFromUrl, deployRelease, README, docs/working-with-urls.md and the
server-level instructions all rewritten to point at grep_task_log for
log search and the /details URI for structured traversal.
- read_resource's description explicitly notes there is no /log URI and
redirects callers to grep_task_log.
- grepTaskLog response field renamed fullLogResourceUri -> taskDetailsResourceUri
(pointing at /details) since the previous URI no longer exists.
- task.test.ts replaces the /log descriptor test with two guard tests:
(1) no descriptor named "task-log" is registered, (2) dispatching a /log
URI returns null. Re-introducing the resource would fail both.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The server-level `instructions` string sent at MCP handshake now includes the configured Octopus server URL on its first line. Lets agents tell the user (or themselves, when reasoning about scope) which Octopus instance they are talking to without an extra round trip. Resolution mirrors getClientConfigurationFromEnvironment's precedence: --server-url flag wins, then OCTOPUS_SERVER_URL env var. If neither is set, the instructions show a placeholder pointing at the right configuration entry points instead of failing silently — the actual connection still fails at runServer time with the existing error path. Pulled the CLI_SERVER_URL assignment earlier in src/index.ts so the instructions are constructed after the resolved URL is known. Removed the previous duplicate assignment further down. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single shared helper that any write tool can call before performing a mutation. Negotiates against client capabilities and degrades gracefully: 1. OCTOPUS_SKIP_ELICITATION=true env var bypasses the gate (automation/CI). 2. Client advertises elicitation capability -> SDK emits elicitation/create and the helper resolves accept/decline/cancel from the user response. 3. Client without elicitation -> helper falls back to a confirm: boolean arg the tool surfaces in its own input schema. Returns a discriminated ConfirmationResult so callers can tell an explicit user "no" (declined / cancelled) apart from "the user was never asked" (confirmationRequired). Tools surface the latter as isError: true so the LLM stops and asks the user instead of treating it as a real cancellation. Wires the helper into create_release and deploy_release, and documents the pattern in CLAUDE.md alongside the existing MCP SDK guidance. Refs AIF-356. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e prose
The confirmationRequired / cancelled response shapes were duplicated
verbatim in createRelease and deployRelease, with only the action noun
("release creation" vs "deployment") differing. As more write tools get
gated, this would multiply.
Move the response shapes into requireConfirmation.ts as a single
unconfirmedResponse(result, { action }) builder. The builder picks the
right shape (isError: true for confirmationRequired, soft cancellation
for declined/cancelled) and capitalizes the action for the cancelled
message. Tools collapse to two lines:
if (!confirmation.confirmed) {
return unconfirmedResponse(confirmation, { action: "deployment" });
}
Adds 6 builder tests covering both branches and the capitalization rule.
Updates CLAUDE.md to show the simpler pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
|
Works in Cursor that supports elicitation. Claude Desktop skips elicitation but I tried to harden the response it gets from the MCP so it goes back and asks the user. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
src/helpers/requireConfirmation.ts— single shared helper for write-gating tool calls. Resolves through three branches:OCTOPUS_SKIP_ELICITATION=trueenv override, client elicitation capability (elicitation/create), or fallback to aconfirmarg on the tool's own schema.ConfirmationResultso callers can tell an explicit user "no" (declined/cancelled) apart from "the user was never asked" (confirmationRequired). Without this distinction the LLM can mistake the no-elicitation fallback for a real cancellation, or — worse — retry blindly withconfirm: true.create_releaseanddeploy_release. TheconfirmationRequiredbranch returnsisError: truewith directive prose so the LLM stops and asks the user instead of treating it as a user cancellation.CLAUDE.mdnext to the existing MCP SDK guidance.Refs AIF-356.
Scope notes
create_releaseanddeploy_releaseare wired here. Other write tools (run_runbook,submit_interruption, non-GETexecute) remain out of scope and will land per-tool.server.tool(...)deprecated form is kept on the wired tools to minimize diff. Migrating those toregisterToolis a separate cleanup.egorp/task-resources— merge the parent first.Test plan
npm run buildcleannpm run lintcleannpm test— 21 new helper unit tests cover all three branches plus the newreasondiscriminator (envSkip / accepted / fallbackConfirm / declined / cancelled / confirmationRequired). All 58 non-integration tests pass; the 13*.integration.test.tsfailures are pre-existing and need a live Octopus atlocalhost:8065.create_release→ Accept / Decline / Cancel each surface the right outcome;deploy_releaselikewise.confirmreturnsconfirmationRequiredwithisError: true; passingconfirm: trueproceeds; passingconfirm: falsecancels.OCTOPUS_SKIP_ELICITATION=truein server env bypasses the prompt entirely.🤖 Generated with Claude Code