Skip to content

feat(mt#1481): Wake originating agent on quality.review Ask responded#923

Merged
edobry merged 5 commits into
mainfrom
task/mt-1481
May 1, 2026
Merged

feat(mt#1481): Wake originating agent on quality.review Ask responded#923
edobry merged 5 commits into
mainfrom
task/mt-1481

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented Apr 30, 2026

Summary

Adds a parallel wake-signal path the Ask reconciler fires whenever a quality.review Ask transitions to responded (mt#1481). The originating agent identified by parentSessionId now receives a structured signal so it can address the review, instead of the loop terminating at the operator-notify bell. Wake is additive to the existing operator-notify — bell + notify still fire on every responded transition.

Motivation & Context

mt#1180 shipped end-to-end Ask transport binding for quality.review reviews, but the loop ended at the operator's terminal bell — the agent that filed the review request had no way to learn the review had landed without the operator manually surfacing it. mt#1481 closes that loop.

Spec: mt#1481. Parent: mt#1180. Sibling: PR #858 (mt#1384 reader-side parser fix) — no file overlap.

Design / Approach

The reconciler now accepts an optional WakeSignalSink parameter (defaults to LoggingWakeSignalSink). After askRepository.respond() resolves, the reconciler calls dispatchWake() which builds the signal payload and emits it via the sink. The sink interface is DI-injected so tests can substitute a spy.

Transport choice — log-only as the simplest live path. The spec listed three transport options: mt#1001 mesh push (TODO, not built), mt#697 AG-UI interrupt (DONE evaluation, recommended NOT for mesh push), and mt#1144 cockpit (TODO, not built). mt#951 session-liveness ships storage but liveness is computed, with no writable wake-pending field. Log-only is the simplest live path: every operator running Minsky already reads structured logs and can tail / filter on event=ask.wake. The DI seam is in place so a future cockpit / mesh consumer replaces the default at composition time without touching the reconciler. Choice documented in the spec's Implementation Choice section.

Alternatives considered:

  • Adding a wakePending field to the sessions table — out of scope, requires migration; can be a follow-up if cockpit polling is needed before mesh push lands.
  • Subscribing the originating agent process directly — requires mt#1001 mesh push or AG-UI runtime to land first.

Key Changes

  • src/domain/ask/wake-on-respond.ts (new) — defines WakeSignalPayload (the seven-field contract from the spec), WakeSignalSink interface, LoggingWakeSignalSink default, and the dispatchWake() helper that skips cleanly when parentSessionId is missing.
  • src/domain/ask/reconciler.tsreconcile() and internal reconcileAsk() accept an optional wakeSink parameter (default new LoggingWakeSignalSink()). After respond() succeeds, the reconciler calls dispatchWake(). Sink failures are caught and logged so they cannot roll back respond() or short-circuit the existing operator-notify.
  • src/domain/ask/wake-on-respond.test.ts (new) — covers all three acceptance tests from the spec plus sink-failure isolation and default-sink regression for legacy 3-arg callers.
  • mt#1481 spec Context section updated with the transport-choice rationale.

Testing

  • bun test src/domain/ask/wake-on-respond.test.ts — 9/9 pass (29 expect calls)
  • bun test src/domain/ask/ — 202/202 pass (511 expect calls), confirming no regression in reconciler.test.ts or accounting/accounting.test.ts (both call the legacy 3-arg reconcile())
  • mcp__minsky__validate_typecheck — clean
  • mcp__minsky__validate_lint — clean (0 errors, 0 warnings)
  • bun run format:check — clean

The wake-on-respond test file covers all three spec acceptance tests:

  1. Ask with parentSessionId → wake fires exactly once with the full seven-field payload, Ask transitions to responded, bell + notify still fire.
  2. Ask without parentSessionId → wake skipped, debug log written, Ask still transitions, bell + notify still fire.
  3. Regression: legacy 3-arg reconcile() callers stay green via the default LoggingWakeSignalSink.

Plus extras: sink failure does not roll back respond() or short-circuit notify; null reviewAuthor is preserved; empty-string parentSessionId skips like undefined.

🤖 Generated with Claude Code — had Claude implement mt#1481 end-to-end after the dispatched implementer subagent hit the org usage cap.

edobry added 2 commits April 30, 2026 19:00
Add a parallel wake-signal path the reconciler fires when a quality.review
Ask transitions to responded. Default sink is a structured-log emission
(simplest live path: mt#1001 mesh push, mt#697 AG-UI, mt#1144 cockpit are
all not yet shipped infrastructure). Wake is additive to the existing
operator-notify bell+notify; both fire on every responded transition.

When the Ask has no parentSessionId, the wake skips cleanly with a debug
log. Sink failures do not roll back respond and do not short-circuit
operator-notify. The sink interface is DI-injected so tests can spy.

Pre-existing 3-arg reconcile callers (asks.ts MCP wiring, PR #858
integration test) keep working via the LoggingWakeSignalSink default.
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label Apr 30, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The overall change is coherent and well-tested, but there’s a key observability mismatch that contradicts the PR’s stated operator filtering contract. The default LoggingWakeSignalSink currently logs with message "ask.wake" while setting the structured event field to "quality.review.responded"; the PR text claims operators can filter on event=ask.wake. This silent inconsistency will break filtering and needs to be aligned before merge. Otherwise, DI seam, tests, and failure isolation look sound.

Findings

  • [BLOCKING] src/domain/ask/wake-on-respond.ts:82 — Structured log field mismatch: code logs event: "quality.review.responded" while PR description and comments state operators will filter on event=ask.wake
    In LoggingWakeSignalSink.emit(), the code calls log.info("ask.wake", { event: "quality.review.responded", ...signal }). The PR description and in-file comments claim operators can tail/filter on event=ask.wake. As implemented, the event field is set to quality.review.responded, and the string ask.wake is used as the log message. This is a silent behavior/observability mismatch that will break the stated filtering approach (operators filtering on event=ask.wake will not match these entries). Please align the structured field names/values with the documented contract — e.g., set event: "ask.wake" (and optionally move the review-transition identifier to a separate field like cause: "quality.review.responded") or revise the documentation/tests accordingly and add an assertion to prevent regressions.

Reviewer-bot caught a field-vs-doc mismatch: PR body said operators
filter on event=ask.wake but the structured event field was actually
quality.review.responded. Filtering on the documented field returned
nothing.

Fix: set event=ask.wake (the routing key operators filter on) and move
the upstream-transition identifier to a separate cause field. Add a
DI seam (WakeSinkLogger interface) so a recording fake can assert the
exact field shape; pin the contract with a regression test that fails
if event ever drifts.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The wake-on-respond path is thoughtfully integrated and tests are comprehensive, including the fix from the prior review aligning event=ask.wake and adding a regression test. However, the default LoggingWakeSignalSink relies on the agent logger’s info level, which is suppressed in HUMAN mode unless ENABLE_AGENT_LOGS is set. This means wake events will silently not appear for many default deployments, contradicting the PR’s operator-tail contract. Please address this by emitting through a channel that is not suppressed by default or documenting/enforcing required configuration. I also noted minor observability nits (skip logged at debug; misleading wording in the wake-failure warn). Once the logging emission path is reliable by default or clearly enforced, this should be good to merge.

Findings

  • [BLOCKING] src/domain/ask/wake-on-respond.ts:120 — Default log-only wake sink will not emit in HUMAN log mode, contradicting the stated operator-tail contract
    LoggingWakeSignalSink.emit() uses the global log.info("ask.wake", fields) (src/domain/ask/wake-on-respond.ts:113-128). However, the logger implementation explicitly no-ops info calls in HUMAN mode unless enableAgentLogs is true (see src/domain/utils/logger.ts, in createLogger(): info(message, context?) returns early when currentLogMode === LogMode.HUMAN && !enableAgentLogs). Since getLogMode() defaults to HUMAN unless explicitly configured, the default path for many operators will drop these wake events entirely — there will be nothing to tail/filter on, contrary to the PR description's “every operator already reads structured logs and can tail on event=ask.wake.” This is a silent behavior/observability gap: the wake path appears to run and tests pass, but in default deployments no wake logs are produced.

Please either:

  • Route the default sink through a channel that always emits (e.g., program logger via log.cli*/stderr), or elevate to a level/mode that is not suppressed; or
  • Document and enforce at composition time that STRUCTURED mode or ENABLE_AGENT_LOGS=true is required for wake delivery (and add a guard/test), or
  • Provide a real transport (e.g., cockpit/mesh) or a separate dedicated emitter not coupled to agent log mode.
  • [NON-BLOCKING] src/domain/ask/wake-on-respond.ts:96 — Wake skip uses debug-level structured log; may be suppressed in default mode, losing auditability of skipped wakes
    In dispatchWake(), when parentSessionId is missing, the code logs log.debug("ask.wake.skipped", {...}) (src/domain/ask/wake-on-respond.ts:100-106). Given the logger defaults to HUMAN mode and suppresses debug-level agent logs unless ENABLE_AGENT_LOGS is set (src/domain/utils/logger.ts: lines around debug() early-return), these skip events will typically be invisible. If operators need to diagnose why an expected wake didn't fire, this path provides no observable breadcrumb by default. Consider promoting to info with a distinct event=ask.wake.skipped or routing to the program logger so that skip reasons are visible without special configuration. If you intentionally want it hidden, add explicit rationale in code comments.
  • [NON-BLOCKING] src/domain/ask/reconciler.ts:392 — Wake dispatch logs warning mentioning "ask already responded" which may mislead when sink fails pre-notify
    The catch block around dispatchWake() logs log.warn("reconcile: wake-signal dispatch failed (ask already responded)", { askId, error }) (src/domain/ask/reconciler.ts:386-396). The parenthetical "(ask already responded)" is baked into the message but does not necessarily explain the failure; the sink can fail for unrelated reasons (e.g., transport outage). Including this qualifier in the canonical warning may confuse on-call triage. Suggest either removing the parenthetical or moving contextual state (e.g., state: 'responded') into structured fields instead of the message string.

…AN mode

Reviewer-bot round 2 caught a real bug: log.info / log.debug / log.warn
are explicit no-ops in the default HUMAN log mode unless
ENABLE_AGENT_LOGS=true. The previous wake sink used log.info, so wake
events were silently dropped in default deployments — operators tailing
for event=ask.wake saw nothing.

Fix: route the default sink through log.cli (program logger) which
always emits regardless of mode. Output format on stdout:
  ask.wake <JSON-payload>
Operators grep on the line prefix or on the embedded JSON.

Same root cause addresses both NON-BLOCKINGs in the same review:
- dispatchWake skip log moved from log.debug to log.cli with the
  ask.wake.skipped tag (was suppressed in HUMAN mode too).
- Reconciler wake-failure warn moved from log.warn to log.cliWarn
  (was suppressed too); also dropped the misleading
  (ask already responded) parenthetical and put askState in structured
  fields, since sink failures are not state-related.

WakeSinkLogger interface tightened to the cli + cliWarn methods the
sink actually uses. Regression test now asserts the channel (cli, not
cliWarn), the line prefix (ask.wake + space), and the seven payload
fields parsed from the JSON suffix — pins both round-1 and round-2
contracts.

Spec Implementation Choice section updated to record the channel
correction.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The reviewer ran but produced no findings. This is not an approval — the model emitted no submit_finding, submit_inline_comment, or conclude_review calls.

@minsky-ai minsky-ai Bot dismissed stale reviews from minsky-reviewer[bot] and minsky-reviewer[bot] May 1, 2026 04:36

Stale: superseded by commit 592f910. The event-field mismatch this review flagged was fixed in that commit (event=ask.wake, cause=quality.review.responded) and pinned by a regression test in wake-on-respond.test.ts. PR head is now 4a9d454, where the channel was further fixed (round 2). See round-3 review for the converged state.

Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The PR introduces a wake-signal dispatch with DI and good initial tests, but there are blocking concerns. The wake path emits full, unbounded review bodies to stdout, risking log bloat and sensitive data leakage; this also diverges from the notify path’s truncation. The wake dispatch is awaited before notify, which can delay operator notifications despite the stated “parallel” intent; either make it non-blocking or document and test the chosen ordering/latency. Error logging for wake failures embeds JSON in a cliWarn string, inconsistent with structured logging and lacking context. Non-blocking: tests are tightly coupled to logging internals; skipped-wake operator logging isn’t test-pinned; and identifiers may need masking. Address the blocking items and consider the non-blocking suggestions for robustness.

Findings

  • [BLOCKING] src/domain/ask/wake-on-respond.ts:155 — Unbounded reviewBody logged/emitted in wake payload (risk of large logs/PII leakage) and inconsistency with notify truncation
    In dispatchWake(), the constructed WakeSignalPayload includes reviewBody verbatim without any truncation (src/domain/ask/wake-on-respond.ts:155-163). The default LoggingWakeSignalSink then emits the full payload via log.cli, writing the entire review body to stdout. In contrast, reconciler notify path truncates the body to 100 chars before operatorNotify (src/domain/ask/reconciler.ts:394-398), acknowledging size/UX concerns.

Failure modes:

  • Large review bodies could significantly bloat logs and tail output, impacting operator usability and log storage costs.
  • Potential exposure of sensitive content in logs (PII/security details) with no redaction policy, while the operator-notify path already mitigates via truncation.

Requested change: Cap or truncate reviewBody in the wake payload (e.g., consistent 100–200 chars) and/or provide a redaction toggle. At minimum, document and justify why wake must carry full text to logs. Tests should pin the chosen truncation policy.

  • [BLOCKING] src/domain/ask/reconciler.ts:380 — Silent behavior change: wake dispatch happens before operator-notify without documenting ordering or testing notify-after-wake failure mode
    The new wake dispatch block is placed before the operator-notify (src/domain/ask/reconciler.ts:354-383 before notify at 393+). Previously, only notify ran after respond(). Now, a failing wakeSink delays notify until the catch executes. While the catch prevents rollback, ordering and latency implications were not discussed or asserted. If wake emit is slow (e.g., networked sink in future), notify is delayed even though they are described as parallel in intent. Tests cover throwingSink but not slow/awaited sinks delaying notify.

Requested change: Either (a) make wake truly parallel/non-blocking relative to notify (e.g., fire-and-forget with void Promise handling) or (b) document the intentional serial ordering and justify notify latency tradeoff, and add a test that ensures notify still fires even if wake is very slow (e.g., sink returns a Promise that resolves after a delay).

  • [BLOCKING] src/domain/ask/reconciler.ts:372 — Wake failure log uses cliWarn with JSON embedded in a string; inconsistent with logger’s structured error handling and may double-serialize
    On wake error, the code logs via log.cliWarn(ask.wake.failed ${JSON.stringify({...})}) (src/domain/ask/reconciler.ts:366-383). This (1) forces WARN-level program logger with a handcrafted JSON blob inside a flat string, bypassing the existing structured error pathways in logger.ts, and (2) mixes human-readable prefix with JSON, complicating parsing. The success path uses log.cli similarly, but for failures we lose stack/context and potentially duplicate serialization.

Requested change: Use a single structured shape consistently (either program logger with fields or agent logger when enabled). Consider log.cliWarn with separate parameters is not supported; instead, emit a structured agent log (log.error or log.warn with context) for machine parsing and a separate concise cliWarn message for humans. Add a test asserting error path includes askId and error while not depending on exact stringification.

  • [NON-BLOCKING] src/domain/ask/wake-on-respond.test.ts:297 — Test asserts on logging transport internals; brittle to logger format changes
    The LoggingWakeSignalSink test (src/domain/ask/wake-on-respond.test.ts:297-367) asserts exact channel usage (cli vs cliWarn), line prefix "ask.wake ", and parses JSON from the log line. This couples the test tightly to logger formatting and the WakeSinkLogger interface. Future harmless changes in logger implementation, prefixes, or inclusion of metadata will break tests unrelated to wake semantics.

Suggestion: Narrow the contract surface for the sink's output (e.g., expose a dedicated formatter or structured object emission that tests can inspect directly) or relax assertions to key fields without depending on prefix and channel specifics. Document the stability expectations if this format is intended as an operator contract.

@edobry edobry merged commit 2a9595f into main May 1, 2026
2 checks passed
@edobry edobry deleted the task/mt-1481 branch May 1, 2026 04:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant