
fix(compile): live-reload apm.yml and warn on --clean --watch #1403

Merged
danielmeppiel merged 3 commits into microsoft:main from edenfunf:fix/watch-live-reload-and-clean-warning
May 26, 2026


@edenfunf
Contributor

fix(compile): live-reload apm.yml and warn on --clean --watch

TL;DR

Two behaviors #1349 left on the table when it closed #1345, picked up at @danielmeppiel's invitation in #1351 (comment):

  1. apm compile --watch re-reads apm.yml when apm.yml itself is modified. Editing target: / targets: mid-watch now takes effect on the next file event instead of needing a watcher restart. Pre-fix the value resolved at startup was reused for every recompile, so mid-session edits silently did nothing.
  2. apm compile --watch --clean prints an explicit [!] warning that --clean is ignored, then continues. Pre-fix the flag was silently dropped. Running --clean on every recompile would surprise users mid-session by deleting orphans; running it only on the initial compile would re-introduce a watcher-specific code path, the same trap #1349 exists to remove. To clean orphaned outputs, run apm compile --clean separately between watch sessions.

Design notes I expect a reviewer to ask about

  • Re-resolution is gated to the file that can change the answer. _recompile only calls _resolve_effective_target when os.path.basename(changed_file) == APM_YML_FILENAME. Instruction-file edits keep using the startup snapshot, so .instructions.md edits don't pay an extra resolver round-trip. Basename equality (not endswith) so a stray backup_apm.yml cannot masquerade as the project root manifest.
  • CLI --target still wins over apm.yml. Mid-session edits to apm.yml's targets: are ignored when the watcher was launched with --target X, matching the one-shot path's priority order. The resolver receives the raw cli_target and applies the same precedence rules. Pinned by test_recompile_on_apm_yml_change_with_cli_target_keeps_cli_priority.
  • target_label_user and cli_target carry the same value at the call site but have different roles. target_label_user (paired with target_label_config) feeds the startup Compiling for ... label; cli_target is the resolver input on apm.yml change. Kept distinct so the label path and the re-resolve path don't accidentally fuse.
  • No new per-recompile log line. When apm.yml changes and the resolved target shifts, the watcher does not print a "now compiling for X" diff. Surfacing the change would require diffing the previous resolution and add visual noise on every recompile. Users see the change via the output files appearing / disappearing. If reviewers want a target-changed signal we can add it as a follow-up; deferred to keep this PR minimal.
  • Lazy import (from .cli import _resolve_effective_target) inside _recompile to break the cli.py → watcher.py → cli.py cycle. Same approach #1351 documented.
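
The gating and precedence rules in the notes above can be sketched roughly as follows. This is an illustration under assumptions, not the code in watcher.py: `resolve_target` is a hypothetical stand-in for `_resolve_effective_target`, the manifest is passed in as a plain dict rather than read from disk, and `target_for_event` is an invented helper name.

```python
import os

APM_YML_FILENAME = "apm.yml"  # assumed value of the constant named in the PR


def resolve_target(cli_target, manifest):
    """Hypothetical resolver: an explicit CLI value wins, then the manifest."""
    if cli_target is not None:
        return cli_target
    return manifest.get("targets") or manifest.get("target")


def should_reresolve(changed_file):
    """Re-resolve only when the event source is the manifest itself."""
    return os.path.basename(changed_file) == APM_YML_FILENAME


def target_for_event(changed_file, snapshot, cli_target, manifest):
    """Pick the compile target for one file event."""
    if should_reresolve(changed_file):
        return resolve_target(cli_target, manifest)
    # Instruction-file edits skip the resolver and reuse the snapshot.
    return snapshot
```

A suffix check such as `"./backup_apm.yml".endswith("apm.yml")` is true, which is why the gate compares basenames instead.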

How to test

--clean --watch warning

apm compile --watch --clean
# Prints: [!] --clean is ignored in watch mode; run 'apm compile --clean'
#         separately to remove orphaned outputs.
# ...then proceeds with normal watch (no destructive cleanup mid-session).
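
One way to picture the "warn, then continue" flow is the sketch below. `plan_compile` and its return shape are hypothetical and exist only for illustration; the warning text is copied from the output shown above, and the real branch lives in cli.py.

```python
CLEAN_IGNORED_WARNING = (
    "--clean is ignored in watch mode; run 'apm compile --clean' "
    "separately to remove orphaned outputs."
)


def plan_compile(watch, clean):
    """Hypothetical helper: decide what to warn about and what to run."""
    warnings = []
    if watch:
        if clean:
            # Surface the dropped flag instead of discarding it silently.
            warnings.append(CLEAN_IGNORED_WARNING)
        # Watch mode never cleans, with or without the flag.
        return {"mode": "watch", "clean": False, "warnings": warnings}
    return {"mode": "one-shot", "clean": clean, "warnings": warnings}
```

The point of the shape is that the watch branch has a single code path: the flag changes what is printed, not what the watcher does.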

Mid-session apm.yml reload (both directions)

mkdir -p /tmp/repro/.apm/instructions
cd /tmp/repro
cat > apm.yml <<'EOF'
name: Repro
version: 1.0.0
targets:
- claude
EOF
cat > .apm/instructions/style.instructions.md <<'EOF'
---
description: style
applyTo: "**/*.py"
---
snake_case.
EOF

apm compile --watch &
# 1. Initial: only CLAUDE.md exists.
# 2. Edit apm.yml to `targets: [claude, gemini]` (atomic save, e.g. PowerShell
#    Set-Content or `vim :w`).  Watcher logs `File changed: ./apm.yml` and the
#    next recompile emits AGENTS.md + CLAUDE.md + GEMINI.md.
# 3. Edit apm.yml back to `targets: [claude]`.  Next recompile emits only
#    CLAUDE.md; the previously-written AGENTS.md / GEMINI.md are left on
#    disk untouched (--clean is intentionally separate; see above).

End-to-end run on Windows 11 + watchdog confirmed the watcher log shows [>] File changed: .\apm.yml followed by the correct emission set in both directions; mtimes confirm only the freshly-relevant outputs are rewritten.

Files changed

  • src/apm_cli/commands/compile/cli.py: --clean --watch warning in the if watch: branch; forward raw cli_target=target into _watch_mode.
  • src/apm_cli/commands/compile/watcher.py: APMFileHandler stores cli_target; _recompile re-resolves when the changed file's basename is apm.yml; _watch_mode plumbs cli_target through; docstring updated to describe the snapshot-vs-fresh contract.
  • tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py (six tests):
    • test_recompile_on_apm_yml_change_reresolves_against_current_file: core reload behavior.
    • test_recompile_on_instruction_file_change_uses_snapshot: non-apm.yml edits skip the resolver round-trip.
    • test_recompile_on_lookalike_filename_does_not_reresolve: backup_apm.yml and friends must NOT trigger reload (basename, not endswith).
    • test_recompile_on_apm_yml_change_with_cli_target_keeps_cli_priority: explicit --target outranks mid-session apm.yml edits.
    • test_clean_watch_emits_warning_and_does_not_run_clean: warning fires, watcher still launches.
    • test_watch_without_clean_does_not_emit_clean_warning: positive control.
    • Toggle-verified: reverting either fix on the current branch makes the corresponding tests fail with assertion messages that name the regression they pin.
  • CHANGELOG.md: ### Changed entries under [Unreleased].

Related

Two behaviors microsoft#1349 left on the table:

1. `apm compile --watch` re-runs target resolution against the current
   `apm.yml` when `apm.yml` itself is the file event source.  Pre-fix
   the value resolved at startup was reused for every recompile, so
   mid-session edits to `target:` / `targets:` did nothing until the
   watcher was restarted.  Re-resolution is gated to the file that can
   change the answer (basename match -- not `endswith` -- so a stray
   `backup_apm.yml` cannot masquerade as the project root manifest);
   instruction-file edits keep using the startup snapshot.

2. `apm compile --watch --clean` prints an explicit warning that
   `--clean` is ignored in watch mode, then continues.  Pre-fix the
   flag was silently dropped.

CLI `--target X` still outranks mid-session `apm.yml` edits, matching
the one-shot path's priority order: the resolver receives the raw
`cli_target` on every re-run and applies the same precedence rules.

Lazy `from .cli import _resolve_effective_target` inside `_recompile`
to break the cli -> watcher -> cli import cycle.
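
The lazy-import pattern can be reproduced in isolation. The sketch below writes two throwaway modules to a temporary directory; `cli_demo`, `watcher_demo`, and the function bodies are hypothetical stand-ins for cli.py and watcher.py, chosen only to show why deferring the import avoids the cycle.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module names; the real modules are cli.py and watcher.py.
CLI_SRC = '''\
from watcher_demo import recompile  # top-level import: cli -> watcher


def resolve_effective_target(cli_target):
    return cli_target or "auto"
'''

WATCHER_SRC = '''\
def recompile(cli_target):
    # Deferred import: by the time this runs, cli_demo has finished loading.
    from cli_demo import resolve_effective_target

    return resolve_effective_target(cli_target)
'''


def load_demo_modules():
    """Write both modules to a temp directory and import the CLI side."""
    tmp = tempfile.mkdtemp()
    Path(tmp, "cli_demo.py").write_text(CLI_SRC)
    Path(tmp, "watcher_demo.py").write_text(WATCHER_SRC)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    return importlib.import_module("cli_demo")


cli_demo = load_demo_modules()
print(cli_demo.recompile("claude"))  # prints "claude"
```

Moving the `from cli_demo import ...` line to the top of `watcher_demo` would fail at import time, because `cli_demo` is still only partially initialised when it pulls in the watcher module.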
Copilot AI review requested due to automatic review settings May 19, 2026 16:00
Contributor

Copilot AI left a comment


Pull request overview

This PR improves apm compile --watch parity with one-shot compile by (1) re-resolving targets when apm.yml itself changes and (2) surfacing an explicit warning when --clean is used with --watch (since --clean is ignored in watch mode).

Changes:

  • Re-resolve the effective compile target when the changed file is apm.yml, so mid-session targets: edits can take effect.
  • Emit a warning for apm compile --watch --clean instead of silently dropping --clean.
  • Add unit tests covering live reload gating behavior and the --clean warning, plus update the Unreleased changelog.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| src/apm_cli/commands/compile/cli.py | Adds a --clean+--watch warning and plumbs raw --target (cli_target) into watch mode. |
| src/apm_cli/commands/compile/watcher.py | Stores cli_target and conditionally re-resolves the effective target when apm.yml changes. |
| tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py | New regression tests for apm.yml-triggered re-resolution and the --clean watch-mode warning behavior. |
| CHANGELOG.md | Adds two Unreleased “Changed” entries describing the new watch-mode behavior. |
Comments suppressed due to low confidence (1)

src/apm_cli/commands/compile/watcher.py:167

  • The _watch_mode docstring says recompiles re-run the resolver against the current apm.yml, but the implementation only re-resolves when the changed file’s basename is exactly apm.yml (instruction-file events explicitly skip it). Please adjust the wording to reflect the actual gating (and, if you persist the updated target, mention the caching behavior).
    ``cli_target`` is the raw ``--target`` argument; recompiles re-run
    the resolver against the current apm.yml so mid-session edits to
    ``targets:`` take effect on the next file event without restarting
    the watcher.

Comment thread src/apm_cli/commands/compile/watcher.py
Comment thread CHANGELOG.md Outdated
Comment thread src/apm_cli/commands/compile/cli.py
edenfunf and others added 2 commits May 20, 2026 00:13
…ent watch contract

Three review comments from PR microsoft#1403:

1. After an apm.yml-driven re-resolve, persist the fresh value to
   `self.effective_target` so subsequent non-apm.yml events do not
   silently revert to the startup snapshot.  Without this, the
   sequence `apm.yml edit -> instructions edit` emits the new family
   set on the first recompile and the wrong family set on the second
   -- AGENTS.md / GEMINI.md written by the apm.yml event become stale
   until the next apm.yml edit.  New test
   `test_apm_yml_change_persists_fresh_target_for_subsequent_events`
   pins the sequence end-to-end.

2. Append (microsoft#1403) to both CHANGELOG entries to match the project's
   one-line-per-PR Keep-a-Changelog convention.

3. Document the two new watch-mode behaviors in the CLI reference:
   apm.yml `target:` / `targets:` mid-session live-reload (with the
   CLI `--target` priority note) and the `--clean` warning.
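
The stale-snapshot sequence in item 1 can be sketched with a stripped-down handler. `HandlerSketch`, the injected `resolver`, and the `compiled` list are hypothetical conveniences for illustration; the real class is `APMFileHandler` and it calls the compiler rather than recording targets.

```python
import os

APM_YML_FILENAME = "apm.yml"  # assumed value of the constant named in the PR


class HandlerSketch:
    """Hypothetical reduction of the watcher's handler to its target state."""

    def __init__(self, effective_target, cli_target, resolver):
        self.effective_target = effective_target  # startup snapshot
        self.cli_target = cli_target
        self._resolver = resolver  # stand-in for _resolve_effective_target
        self.compiled = []  # records the target used for each recompile

    def _recompile(self, changed_file):
        if os.path.basename(changed_file) == APM_YML_FILENAME:
            # Persist the fresh value so later events do not revert to
            # the startup snapshot.
            self.effective_target = self._resolver(self.cli_target)
        self.compiled.append(self.effective_target)


handler = HandlerSketch("claude", None, lambda cli: ["claude", "gemini"])
handler._recompile("./apm.yml")
handler._recompile("./.apm/instructions/style.instructions.md")
```

After both events `handler.compiled` holds the fresh target twice. If the assignment went to a local variable instead of `self.effective_target`, the second entry would fall back to `"claude"`.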
@sergio-sisternes-epam sergio-sisternes-epam added the panel-review Trigger the apm-review-panel gh-aw workflow label May 24, 2026
@github-actions

APM Review Panel: ship_with_followups

fix(compile): apm.yml target changes now apply instantly mid-watch (no restart), and --clean --watch warns instead of silently misbehaving -- two long-overdue DX correctness fixes.

cc @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

PR #1403 closes two distinct watch-mode defects: the silent target-staleness bug that forced users to kill and restart the watcher after editing apm.yml, and the silent swallow of --clean --watch that left users confused about orphan-output behavior. Both fixes are narrow, well-tested at the unit tier, and the CHANGELOG and docs land alongside the code -- all consistent with APM's ship-fast, communicate-clearly operating principle.

The panel's strongest convergent signal is the missing graceful fallback when apm.yml becomes unparseable mid-watch: three independent panelists (devx-ux-expert, supply-chain-security-expert, and test-coverage-expert backed by a missing-evidence block on a resilience-critical surface) raised it. That convergence elevates it above the individual recommended tier; it is a correctness gap on the watch-loop's resilience contract, not a style concern.

The thread-safety gap (no lock around self.effective_target mutation) is real but structurally low-risk given the existing 1-second debounce serializes the vast majority of rapid-fire events in practice -- it warrants a follow-up issue, not a block. The doc-writer's blocking label on "at startup" ambiguity is a single-word doc-polish fix; it carries no code-correctness weight and should be resolved in a fast follow commit rather than gating the merge.

Dissent. doc-writer declared "at startup" ambiguity severity: blocking; devx-ux-expert classified it severity: nit. The substance is identical -- a three-word wording fix in compile.md. CEO sides with the nit classification for ship-gating purposes: the current wording is imprecise but not misleading enough to cause user harm.

Aligned with: Pragmatic as npm -- live-reload and a visible warning mirror the expectation that the tool does what the user intends without requiring manual intervention; editing apm.yml mid-session just works now. DevX -- both fixes eliminate silent failure modes: the first removes an invisible state-staleness footgun, the second surfaces an incompatibility that was previously swallowed, making the watch loop trustworthy enough to leave running unattended.

Growth signal. The "no restart required" story is a reusable beat with outsized conversion value in release notes and social copy. A short screen capture or GIF showing a target: edit in apm.yml taking effect on the next file save -- with zero terminal interaction -- removes a common objection from evaluators who hit the stale-target bug once and moved on. Recommend seeding this into the next minor release post.

Panel summary

| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 2 | 1 | Lazy import breaks a real cycle cleanly; in-place self.effective_target mutation is correct and tested; basename gate is better than endswith; one thread-safety gap if watchdog delivers concurrent events. |
| CLI Logging Expert | 0 | 1 | 1 | Warning renders correctly as [!] yellow; missing re-resolve log when apm.yml triggers reload is the only meaningful gap. |
| DevX UX Expert | 0 | 2 | 2 | Two solid UX fixes; one recommended gap: malformed apm.yml mid-watch has no graceful recovery path in _recompile, risking a silent watcher crash. |
| Supply Chain Security Expert | 0 | 1 | 1 | No blocking security regressions. basename guard is correct; lazy relative import is safe. One silent fail-open on YAML parse worth hardening. |
| OSS Growth Hacker | 0 | 1 | 2 | Solid DX fix that removes a frustrating restart-to-reload loop; CHANGELOG is accurate but undersells the "no restart needed" wow moment. |
| Doc Writer | 1 | 2 | 1 | Two recommended fixes: "at startup" is ambiguous (should be "when the watcher starts"), and --target priority rule needs one word of clarification for first-time readers. |
| Test Coverage Expert | 0 | 2 | 1 | 7 new tests cover all stated promises at unit/CliRunner tier; two gaps remain: apm.yml-parse-error swallowing and list[str] cli_target threading through re-resolve. |

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 5 follow-ups

  1. [DevX UX Expert + Supply Chain + Test Coverage] Add try/except around _resolve_effective_target in the apm.yml event handler; warn user and keep previous target on parse error; add missing unit test asserting error is surfaced not swallowed. -- Three-panelist convergence on a user-promise gap backed by a missing-evidence block on a resilience-critical surface; silently dropping config mid-watch is the most impactful correctness risk in the PR.
  2. [Doc Writer] Change "a [!] warning is printed at startup" to "a [!] warning is printed when the watcher starts" in compile.md. -- Fast single-word fix that removes genuine first-reader ambiguity; labeled blocking by doc-writer and should ship in the same PR or an immediate follow commit.
  3. [CLI Logging Expert] Emit a progress line before _resolve_effective_target so users know targets are being re-read when apm.yml changes (not just "Recompiling..."). -- Closes a UX visibility gap that leaves the user unable to confirm live-reload actually fired; directly reinforces the "no restart needed" story.
  4. [DevX UX Expert] Extend the --clean --watch warning to include a one-sentence explanation of WHY the two flags are incompatible. -- Without the causal "why", users will keep retrying the combined form; the explanation is also the seed for a FAQ entry.
  5. [Python Architect] Add threading.Lock in __init__ and wrap the self.effective_target read-modify-write block; open a follow-up issue tracking the race window. -- Real race exists even with debounce on rapid burst events; low-probability but the fix is two lines and the debt should not accumulate.

Architecture

classDiagram
    direction LR
    class APMFileHandler {
        <<EventHandler>>
        +output str
        +chatmode str|None
        +effective_target CompileTargetType|None
        +cli_target str|list|None
        +debounce_delay float
        +on_modified(event) None
        +_recompile(changed_file) None
    }
    class FileSystemEventHandler {
        <<watchdog>>
        +on_modified(event) None
    }
    class CommandLogger {
        <<Logger>>
        +progress(msg) None
        +warning(msg) None
    }
    class CompilationConfig {
        <<ValueObject>>
        +from_apm_yml(...) CompilationConfig
    }
    class AgentsCompiler {
        +compile(config, logger) CompileResult
    }
    class _resolve_effective_target {
        <<PureFunction>>
        +__call__(cli_target) tuple
    }
    FileSystemEventHandler <|-- APMFileHandler : extends
    APMFileHandler *-- CommandLogger : owns
    APMFileHandler ..> CompilationConfig : creates
    APMFileHandler ..> AgentsCompiler : creates
    APMFileHandler ..> _resolve_effective_target : lazy import on apm.yml event
    class APMFileHandler:::touched
    class _resolve_effective_target:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
    A(["apm compile --watch"])
    B{"--clean flag?"}
    C["logger.warning: --clean ignored in watch mode"]
    D["_resolve_effective_target(target)"]
    E["_watch_mode(cli_target=target)"]
    F["APMFileHandler.__init__\nself.effective_target = snapshot\nself.cli_target = cli_target"]
    G["Observer.start() -- watchdog thread"]
    H(["File event arrives"])
    I{"basename(changed_file)\n== apm.yml?"}
    J["lazy import _resolve_effective_target\nre-reads apm.yml from disk"]
    K["self.effective_target = fresh_target\n(no lock -- see finding)"]
    L["CompilationConfig.from_apm_yml(target=effective_target)"]
    M["AgentsCompiler.compile(config)"]
    N(["logger.success / logger.error"])
    A --> B
    B -- yes --> C
    C --> D
    B -- no --> D
    D --> E
    E --> F
    F --> G
    G --> H
    H --> I
    I -- yes --> J
    J --> K
    K --> L
    I -- no --> L
    L --> M
    M --> N

Recommendation

Ship PR #1403. The two behavioral fixes are correct, tested, documented, and CHANGELOG'd. No panelist identified a blocking code correctness regression. The panel's highest-signal gap -- missing graceful fallback on malformed apm.yml -- is a hardening improvement, not a regression introduced by this PR (the watcher had no apm.yml event handling at all before this change). Track it as a priority follow-up issue alongside the doc wording fix and the --clean warning copy improvement. The thread-safety gap warrants an issue but not a block. Merge, then iterate fast.


Full per-persona findings

Python Architect

  • [recommended] self.effective_target mutated without a lock; concurrent watchdog events can race at src/apm_cli/commands/compile/watcher.py
    Watchdog dispatches events on a background thread. If two rapid apm.yml events arrive and the debounce guard is satisfied, two _recompile calls can interleave. Python's GIL serialises bytecode but not the read-modify-write sequence across the two attribute accesses.
    Suggested: Add threading.Lock in __init__ and wrap the if os.path.basename(changed_file) == APM_YML_FILENAME: block with with self._target_lock: to serialise target updates.

  • [recommended] Lazy import in _recompile re-executes on every apm.yml event; module docstring lacks cycle explanation at src/apm_cli/commands/compile/watcher.py
    Python caches modules after first import so the performance hit is negligible, but the cycle is real and should be documented in the module docstring for future refactors.
    Suggested: Add a module-level note: # cli.py imports this module; _resolve_effective_target is imported lazily in _recompile to break the cycle.

  • [nit] os.path.basename could be Path(changed_file).name for pathlib consistency at src/apm_cli/commands/compile/watcher.py
    Minor style nit; both are correct.
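
The lock suggested in the first finding might look roughly like this. It is a sketch under assumptions: `LockedHandlerSketch`, `current_target`, and the injected `resolver` are hypothetical names, and only the `_target_lock` attribute name comes from the suggestion above.

```python
import os
import threading

APM_YML_FILENAME = "apm.yml"  # assumed value of the constant named in the PR


class LockedHandlerSketch:
    """Hypothetical handler that serialises target updates."""

    def __init__(self, effective_target, cli_target, resolver):
        self.effective_target = effective_target
        self.cli_target = cli_target
        self._resolver = resolver  # stand-in for _resolve_effective_target
        self._target_lock = threading.Lock()

    def current_target(self, changed_file):
        """Return the target to compile for, updating it under the lock."""
        with self._target_lock:
            if os.path.basename(changed_file) == APM_YML_FILENAME:
                self.effective_target = self._resolver(self.cli_target)
            # Read inside the lock so the caller sees a value that matches
            # the update it may have just performed.
            return self.effective_target
```

Holding the lock only around the resolve-and-read step keeps the compile itself outside the critical section, so a slow compile does not block the next target update.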

CLI Logging Expert

  • [recommended] No progress message when apm.yml triggers target re-resolution at src/apm_cli/commands/compile/watcher.py
    User sees "File changed: apm.yml" and "Recompiling..." but never learns that targets were re-read from disk. If the resolve picks up a new target, the output changes with no explanation.
    Suggested: Emit self.logger.progress('apm.yml changed -- re-resolving targets...', symbol='gear') before calling _resolve_effective_target.

  • [nit] Warning message leads with a flag name rather than the outcome at src/apm_cli/commands/compile/cli.py
    APM message rule: lead with the outcome.
    Suggested: Rephrase to: "Orphaned outputs will not be removed mid-session -- --clean is ignored in watch mode. Run 'apm compile --clean' separately between sessions."

DevX UX Expert

  • [recommended] No graceful fallback when _resolve_effective_target throws mid-watch (malformed apm.yml) at src/apm_cli/commands/compile/watcher.py
    Any parse or FileNotFound error propagates up through the watchdog event handler, likely killing the watcher with a Python traceback. Correct behavior: warn and keep previous target.
    Suggested: Wrap _resolve_effective_target call in try/except, emit logger.warning with error summary and "keeping previous target", continue with self.effective_target unchanged.

  • [recommended] Warning message omits WHY --clean is incompatible with --watch at src/apm_cli/commands/compile/cli.py
    Without the "why", users keep retrying the combined form thinking it is a transient issue.
    Suggested: Extend to: "--clean is ignored in watch mode (running it on every recompile would remove outputs mid-session); run 'apm compile --clean' separately between watch sessions."

  • [nit] compile.md says "[!] warning is printed at startup" but fires before the watcher loop at docs/src/content/docs/reference/cli/compile.md
    Suggested: Change to "a [!] warning is printed before the watcher starts" for precision.

  • [nit] --target priority rule only in watch-mode bullet, not alongside the --target flag description at docs/src/content/docs/reference/cli/compile.md
    Suggested: Add a parenthetical to the --target flag description: "(in --watch mode, outranks apm.yml targets: on every recompile)".
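
The "warn and keep previous target" behavior from the first finding could be sketched as a small function. `next_target`, the injected `resolver`, and the `warn` callback are hypothetical; in the real handler the warning would go through the logger and the result would be stored on `self.effective_target`.

```python
import os

APM_YML_FILENAME = "apm.yml"  # assumed value of the constant named in the PR


def next_target(changed_file, previous_target, cli_target, resolver, warn):
    """Return the target for this event, keeping the old one on failure."""
    if os.path.basename(changed_file) != APM_YML_FILENAME:
        return previous_target
    try:
        return resolver(cli_target)
    except Exception as exc:  # broad on purpose: the watcher must survive
        warn(f"could not re-read apm.yml ({exc}); keeping previous target")
        return previous_target
```

A half-saved manifest then costs one warning line rather than the watcher process, and the next successful save restores normal behavior.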

Supply Chain Security Expert

  • [recommended] Silent exception swallow on YAML parse in _resolve_effective_target produces fail-open behavior mid-watch at src/apm_cli/commands/compile/cli.py
    cli.py swallows all exceptions during the parse_targets_field / load_yaml path with bare except Exception: pass. If apm.yml is partially written mid-watch, the resolver silently drops config_target to None and falls back to auto-detect with no user warning.
    Suggested: Replace except Exception: pass with except Exception as exc: logger.warning(f'apm.yml target parse failed: {exc}; falling back to auto-detect').

  • [nit] on_modified endswith filter admits backup_apm.yml events; basename guard downstream is correct but inconsistent at src/apm_cli/commands/compile/watcher.py
    Suggested: Change to os.path.basename(src_path) == APM_YML_FILENAME to match the basename guard in _recompile.
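
The fail-open hardening from the first finding above amounts to logging before falling back. The sketch below assumes a hypothetical `load_manifest` callable that returns the parsed apm.yml as a dict, and a made-up logger name; it is not the resolver's actual code.

```python
import logging

logger = logging.getLogger("apm.compile.sketch")  # hypothetical logger name


def read_config_target(load_manifest):
    """Return the manifest's target, or None (auto-detect) with a warning."""
    try:
        manifest = load_manifest() or {}
        return manifest.get("targets") or manifest.get("target")
    except Exception as exc:
        # Same fallback as a bare `pass`, but the user is told why.
        logger.warning(
            "apm.yml target parse failed: %s; falling back to auto-detect", exc
        )
        return None
```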

OSS Growth Hacker

  • [recommended] CHANGELOG entry buries the lead -- "no restart needed" should open the sentence, not appear mid-clause
    Power users scanning release notes make a split-second judgment on whether to upgrade.
    Suggested: Rewrite to: "apm compile --watch no longer requires a restart when you edit target: / targets: in apm.yml -- changes take effect on the next file event. Previously the startup value was cached for the whole session. (#1403)"

  • [nit] Docs bullet phrased as caveat, not capability; could use a Starlight tip admonition
    Suggested: Wrap in :::tip admonition: "Tip -- edit apm.yml mid-watch: target: / targets: changes take effect automatically on the next save."

  • [nit] --clean warning copy could seed a docs FAQ / error-message index
    Suggested: Add one FAQ bullet under a "Watch mode pitfalls" heading in compile.md.

Auth Expert -- inactive

No auth surface touched; changed files are scoped entirely to the compile --watch live-reload and --clean warning feature.

Doc Writer

  • [blocking] "a [!] warning is printed at startup" is ambiguous -- "startup" could mean process startup, not watch invocation at docs/src/content/docs/reference/cli/compile.md
    A first-time reader could interpret "startup" as application/process startup. (Note: CEO weighs this as a nit for ship-gating purposes; it is a fast single-word fix, not a code correctness issue.)
    Suggested: Change to "a [!] warning is printed when the watcher starts".

  • [recommended] The --target priority rule omits "over apm.yml" -- the override relationship is implicit at docs/src/content/docs/reference/cli/compile.md
    The word "still" implies the reader already knows the resolution order.
    Suggested: Drop "still" and write: "Passing --target to apm compile --watch takes precedence over apm.yml target:/targets:."

  • [recommended] No doc for what happens when apm.yml has a parse/syntax error mid-watch at docs/src/content/docs/reference/cli/compile.md
    The new bullet documents the happy path but omits the error path.
    Suggested: Add: "If apm.yml is unparseable at the time of the file event, the recompile is skipped and an error is printed; the watcher continues running."

  • [nit] CHANGELOG --clean entry is slightly longer than surrounding entries; trim remediation sentence at CHANGELOG.md
    Suggested: Trim to: "apm compile --watch --clean now prints a [!] warning that --clean is ignored in watch mode instead of silently dropping the flag. (#1403)"

Test Coverage Expert

  • [recommended] apm.yml syntax-error mid-watch is silently swallowed; no test asserts user sees error message rather than silence at tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py
    No test exercises the path where _resolve_effective_target raises. Probed tests/unit/commands/compile/ for YAMLError, ParseError, syntax, except -- zero hits.
    Proof (missing at unit): tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py::test_recompile_on_apm_yml_change_resolver_raises_logs_error_not_silent -- proves: If apm.yml becomes unparseable mid-watch, the user sees an error message rather than a silent no-op.

  • [recommended] Live-reload certified only at unit tier; integration-with-fixtures floor not met for CLI-visible watch-mode behavior change at tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py
    Tests 1-5 mock all module boundaries. Tests 6-7 use CliRunner with real apm.yml but mock _watch_mode entirely.
    Proof (passed at unit): tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py::test_recompile_on_apm_yml_change_reresolves_against_current_file -- proves: apm.yml change mid-watch causes resolver to be called and fresh target forwarded to CompilationConfig. assert mock_from_apm_yml.call_args.kwargs["target"] == fresh

  • [nit] cli_target as list[str] not exercised through the re-resolve path at tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py
    All five unit tests use cli_target=None or cli_target="claude". The list[str] branch goes through _resolve_effective_target untested.
    Proof (missing at unit): tests/unit/commands/compile/test_watch_live_reload_and_clean_warning.py::test_recompile_on_apm_yml_change_with_cli_target_list_keeps_cli_priority -- proves: apm compile --watch --target claude --target gemini re-resolves with list cli_target, not just single string.

This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.

Note

🔒 Integrity filter blocked 2 items

The following items were blocked because they don't meet the GitHub integrity level.

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Review Panel for issue #1403 · ● 3.3M ·

@github-actions github-actions Bot removed the panel-review Trigger the apm-review-panel gh-aw workflow label May 24, 2026
@danielmeppiel danielmeppiel disabled auto-merge May 26, 2026 18:57
@danielmeppiel danielmeppiel merged commit ec278d3 into microsoft:main May 26, 2026
31 checks passed
@danielmeppiel danielmeppiel mentioned this pull request May 26, 2026
danielmeppiel added a commit that referenced this pull request May 26, 2026
* chore: cut 0.15.0

Move Unreleased -> [0.15.0] - 2026-05-27 and bump pyproject + uv.lock.

Audit applied: every PR merged since v0.14.2 has exactly one
changelog entry; each entry leads with the user-visible impact.

Fixes during audit:
- Add missing entries for #1367, #1403, #1465, #1487, #1492, #1462,
  #1477, #1439, #1484, and the 131679f follow-up commit.
- Collapse the two #1473 lines into one.
- Merge the #1476 Security/GitCache-hardening entry into its Added
  entry (same PR, one logical change).
- Replace bogus #1243 PR ref with the actual merge PR #1308 for the
  persisted transport-flag config.
- Relocate the #1324-delivered marketplace CLI entries (apm pack
  --marketplace / --marketplace-path / --json, outputs map form)
  out of Unreleased and into [0.14.2], where they actually shipped.
  They were mis-attributed to #1317 and orphaned across the 0.14.2
  cut.

Verified locally: ruff check + ruff format --check both clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request May 27, 2026
…5 drift sweep) (#1511)

* docs: backfill apm-usage and consolidate registry guides (v0.14->v0.15 drift sweep)

Holistic docs-sync retrospective on the v0.14.0->v0.15.0 release window
flagged 23 of 39 user-impact PRs as docs-debt: 7 Rule 4 violations
(apm-usage/ skipped) plus 16 silent-drift PRs. This PR closes the
highest-priority gaps (P0/P1 from the retrospective) in one sweep.

Backfills (apm-usage/ training corpus):
- dependencies.md: registry-sourced APM dep object form (#1471)
- authentication.md: APM_REGISTRY_TOKEN_{NAME} precedence (#1471)
- governance.md: registry_source + allow_non_registry policy (#1471)
- package-authoring.md: apm publish workflow (#1471) and project-scope
  hook command path semantics (#1396)
- commands.md: apm publish entry (#1471), apm config transport keys
  (#1308), apm compile live-reload + --clean --watch warning (#1403),
  Claude Code instruction dedup (#1146), MCP env-var placeholder
  resolution (#1277), AppLocker/WDAC staged-install diagnostic (#1390)

Structural fix (per docs-impact-architect verdict):
- Merge guides/private-registries.md INTO guides/registries.md with
  progressive disclosure (public -> private -> per-dep routing ->
  enterprise link). Adds Starlight redirect for the old slug, patches
  5 cross-references across consumer/, reference/cli/.

Editorial fixes (per editorial-owner sweep):
- integrations/copilot-app.md (#1431): lead with user value before
  WS-IPC/SQLite mechanics; add 'restart the Copilot App once'
  troubleshooting hint
- producer/compile.md: remove the duplicated explanation of Claude
  Code instruction dedup (it was stated twice)
- enterprise/security.md: reframe defensive memo voice ('do not call
  this X') to user voice ('here is what we provide / here is what we
  don't')

Method: docs-sync skill end-to-end. 5-panelist fan-out plus CDO
synthesis. Every CLI claim in the apm-usage additions was verified
against the live 'apm <verb> --help' surface (S7 tool bridge).

Out of scope (tracked as P1 follow-up): backfilling docs for the 16
silent-drift PRs grouped by subsystem (MCP, install, compile, auth).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: full-corpus regrounding audit (55 pages, 14 surgical fixes)

Wave-batched grounding audit across 55 high-risk pages (CLI ref x27,
schemas/specs x10, consumer ramp x12, onboarding x6). Each page's
factual claims (flags, env vars, exit codes, schema fields, file
paths, code links) was extracted and verified against current
src/apm_cli/ and 'apm <verb> --help' output via S7 tool-bridge.

Fixes applied (14 files):

CLI reference:
- pack.md: add --check-versions, --check-clean flags + exit codes 3, 4
- targets.md: expand copilot detection signals (5, not 1)
- experimental.md: add copilot-app, marketplace-authoring, registries
- install.md: dedup duplicate '## Exit codes' + '## Notes' sections

Schemas / specs:
- lockfile-spec.md: expand package_type enum to full 6-value list
- manifest-schema.md: document plural 'targets:' alias (#1335)
- environment-variables.md: add APM_BROAD_FETCH_DEPTH, APM_COPILOT_APP_DB
- package-types.md: add 5th layout (hook_package, hooks/*.json only)

Consumer ramp:
- install-mcp-servers.md: fix stale code citation + 'Or' -> 'And'
- private-and-org-packages.md: drop nonexistent BITBUCKET_APM_PAT

Onboarding (6 broken navigation links, 4 files):
- quickstart.mdx, getting-started/installation.md,
  getting-started/first-package.md, getting-started/migration.md:
  repoint self-loops and dead routes to actual page paths

Process: dispatched as 6 parallel grounding-verifier agents (general-
purpose) across disjoint page scopes; each agent had edit authority
on its scope and applied surgical fixes inline. Reusable pattern via
the docs-corpus-audit sibling skill design (PANEL + WAVE EXECUTION
+ S7 verifier fan-out, see files/docs-corpus-audit-design.md).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: wave 3 corpus audit + IA-reshuffle dead-link cleanup (53 pages)

Second sweep of the regrounding audit. Covers the 57 pages deferred in
wave 2: producer/ (15), enterprise/ (15), concepts/ (6), integrations/
(7), troubleshooting/ (7), contributing/ (3), reference tail (3), 404.

Process: 6 parallel grounding-verifier agents on disjoint scopes; each
agent extracts factual claims, S7-verifies against current source
('apm <verb> --help' + grep src/apm_cli/), and applies surgical edits
inline. Same pattern as wave 2 (PANEL + WAVE EXECUTION + S7 verifier
fan-out). Orchestrator post-pass swept three cross-corpus broken-link
patterns the per-scope agents could not fix alone.

High-signal factual fixes:

enterprise/governance-guide.md:
- --output-file -> --output (real flag is --output / -o)
- 7+17 check count -> 8+17 (8 baseline checks, not 7)

enterprise/apm-policy.md:
- '16 of 22 checks' -> '17 of 25 checks' (phantom counts)
- separated --no-policy (install-only flag) from APM_POLICY_DISABLE
  (env var); the page had conflated them

enterprise/apm-policy-getting-started.md:
- dropped 'apm compile' from list of commands that run policy
  (compile enforces zero policy per governance-overview.md L57)

enterprise/policy-reference.md:
- compilation.target.allow: added copilot, gemini, vscode, windsurf,
  agent-skills (only 5 of 9 runtimes were listed)

enterprise/registry-proxy.md:
- 'apm marketplace add --branch main' -> '--ref main' (no --branch flag)

enterprise/security-and-supply-chain.md:
- 3 stale source line-number citations corrected

producer/author-primitives/index.md:
- legacy '.hook.md' extension -> '.json' (hook_integrator scans JSON)
- removed nonexistent '.apm/commands/' subdirectory from layout example

concepts/lifecycle.md:
- 4 reference-page links all pointed at install/ (copy-paste error)

Cross-corpus IA-reshuffle dead-link cleanup (orchestrator pass):
- introduction/* -> concepts/* (4 links across 2 files)
- guides/ci-policy-setup/ -> enterprise/enforce-in-ci/ (8 links, 4 files)
- guides/pack-distribute/ -> producer/pack-a-bundle/ (5 links, 4 files)
- guides/dependencies/ -> consumer/manage-dependencies/ (1 link)
- guides/agent-workflows/ -> contextual canonical (3 links, 3 files)
- guides/install-and-use/mcp-servers/ -> consumer/install-mcp-servers/ (3)
- guides/compilation/ -> producer/compile/ (1)
- guides/prompts/ -> producer/author-primitives/prompts/ (2)
- guides/drift-detection/ -> enterprise/drift-detection/ (1)

enterprise/security.md side-fix:
- 'apm unpack scheduled for removal in v0.14' -> drop version target
  (APM is 0.15.0 and unpack still ships marked DEPRECATED in --help).
  Upstream remediation (refresh deprecation timeline in source or
  remove the shim) tracked outside this PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: close deferred items from corpus regrounding audit

Closes the three items deferred from the v0.14->v0.15 docs-sync
retrospective and the full-corpus regrounding waves (commits
4f00c2b, 242bb9e, b80da69):

1. apm unpack source-side deprecation timeline
   - src/apm_cli/commands/pack.py: 'will be removed in v0.14'
     -> 'will be removed in a future release'. Current version
     is 0.15.0; the v0.14 target had already passed. Docs were
     softened in wave 3; this mirrors the choice in source.
   - CHANGELOG.md: [Unreleased] Fixed entry.

2. Bucket-C silent-drift backfills (20 PRs, parallel triage)
   - 3 grounding-verifier subagents reviewed 20 of the 21
     bucket-C PRs (#1477 excluded as test-flake fix, no doc
     surface). Verdicts: 17 ALREADY_COVERED or NO_DOC_SURFACE
     (verified honestly against wave 2-3 backfills, not
     manufactured), 3 BACKFILLED:
     - #1385 SSH dep user-from-URL: added supported-form row in
       docs/src/content/docs/consumer/manage-dependencies.md
       and bullet in apm-usage/dependencies.md.
     - #1434 Copilot App schema range [13,15] + warn-not-fail:
       rewrote the 'Schema compatibility' paragraph in
       docs/src/content/docs/integrations/copilot-app.md
       (was factually wrong, claimed [13,13] hard-fail).
     - #1440 Copilot file-based detection signals: added the
       four .github/{instructions,agents,prompts,hooks}/
       directories to the canonical-signals list in
       troubleshooting/compile-zero-output-warning.md and to
       the apm-usage commands.md + package-authoring.md
       auto-detect rules.

3. docs-corpus-audit skill extracted
   - .apm/skills/docs-corpus-audit/SKILL.md: first-class skill
     module emitted from the genesis design artifact used to
     drive waves 2 and 3. Pattern: PANEL + WAVE EXECUTION + S7
     verification. Wave-batched (scales as O(waves), not
     O(claims)), disjoint page ownership (no merge conflicts),
     orchestrator post-pass for cross-corpus drift patterns
     invisible to per-scope agents.
   - references/design-handoff.md: full design artifact preserved
     for future maintainers.
   - Sibling to docs-sync (per-PR), not a replacement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: fix dead links + address Copilot review findings

Two classes of fix on PR #1511:

1. Deploy Docs CI -- starlight-links-validator failure (2 dead links)
   - getting-started/first-package.md:18 and quickstart.mdx:40 used
     absolute /apm/getting-started/installation/ paths introduced in
     wave 2 (242bb9e). Converted to relative paths matching the
     surrounding link convention.
   - Verified with local 'npm run build' under docs/: 'All internal
     links are valid.'

2. Copilot PR review -- 7 inline factual accuracy comments, all
   verified against source and addressed:
   - apm-usage/package-authoring.md: hook path rewrite is performed
     by 'apm install' (hook integrator pass), not 'apm compile'.
   - apm-usage/dependencies.md + docs/guides/registries.md: registry
     resolver requires semver per apm_cli/deps/registry/semver.py
     (is_semver_range gate). Removed examples implying opaque labels
     (#stable, #v2.0.0, 'latest') route through a registry; updated
     selector tables to flag non-semver refs as rejected for registry
     sources.
   - apm-usage/dependencies.md + docs/guides/registries.md:
     lockfile_version: '2' promotion triggers on registry deps OR
     git-source semver resolution fields (constraint / resolved_tag /
     resolved_at per lockfile.py:_needs_v2, issue #1488), not just
     registry deps.
   - apm-usage/authentication.md: 'token:' in apm-policy.yml is not
     parse-rejected, only surfaces as an 'Unknown top-level policy
     key' warning per policy/parser.py. Still discouraged (leaks to
     repo), but the rejection mechanism is different from apm.yml.
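
The two resolver rules above (semver-only refs for registry sources, and the wider lockfile_version: '2' trigger) can be sketched as below. This is a minimal illustration, not the code in apm_cli: looks_like_semver_range, needs_lockfile_v2, the regex, and the "source" key are hypothetical stand-ins, and the real is_semver_range and _needs_v2 may accept or check more than this.

```python
import re

# Hypothetical, simplified semver gate. The real is_semver_range in
# apm_cli/deps/registry/semver.py may accept more forms than this regex.
_SEMVER_RANGE = re.compile(r"^(\^|~|>=|<=|>|<|=)?\s*\d+\.\d+\.\d+$")

# Git-source semver resolution fields named in the commit message.
V2_GIT_FIELDS = ("constraint", "resolved_tag", "resolved_at")


def looks_like_semver_range(ref: str) -> bool:
    """Return True only for refs such as '1.2.3' or '^1.2.0'."""
    return bool(_SEMVER_RANGE.match(ref.strip()))


def needs_lockfile_v2(deps: list[dict]) -> bool:
    """Promote to lockfile_version '2' on registry deps OR on any
    git-source dep that carries a semver resolution field."""
    for dep in deps:
        # Assumption: each dep dict exposes its origin under a "source" key.
        if dep.get("source") == "registry":
            return True
        if any(dep.get(field) is not None for field in V2_GIT_FIELDS):
            return True
    return False
```

Under these assumptions, opaque labels such as stable, v2.0.0, or latest fail the gate, and a git dep with a resolved_tag promotes the lockfile even when no registry dep is present.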

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* skill(docs-corpus-audit): refactor under genesis discipline + self-test

Round-trip assessment found the original SKILL.md draft violated
genesis SoC in 7 ways:

1. Invented inline 'grounding-verifier' persona instead of composing
   shared agent personas (python-architect for S7, doc-writer for
   edits). R3 EXTRACT in reverse.
2. Subagent prompt template inlined in SKILL body (~40 lines that
   belong in assets/).
3. IA-reshuffle grep patterns hard-coded in body as bash heredoc --
   the patterns rot per release and belong in scripts/ with --help
   and a versioned update cadence.
4. PHANTOM DEPENDENCY on docs-sync's substrate (.apm/docs-index.yml,
   personas, panelist-return-schema, the apm-usage Rule-4 corpus)
   never declared via tool-call probes -- A9 SUPERVISED EXECUTION
   violation per genesis Step 7b.
5. Missing A8 ALIGNMENT LOOP: wave agents edited inline and nothing
   re-verified the edits grounded.
6. DISPATCH COLLISION risk vs docs-sync: identical 'drift between
   docs and code' triggers; dispatcher LLM could misroute.
7. BUNDLE LEAKAGE: references/design-handoff.md was session-history
   (maintainer-scope), not runtime-loaded. Per genesis 3.5 it must
   NOT ship with the user-facing bundle.

Refactor:
- SKILL.md (218 lines, well under 500-line cap): adds explicit
  Sibling Contract table with docs-sync; declares roster as
  composition of existing personas via relative links;
  PROBE / RISK-TRIAGE / WAVE / POST-PASS / ALIGNMENT-LOOP /
  COMMIT / PR phases; sharpened trigger description naming
  whole-corpus scope.
- assets/subagent-prompt-template.md: extracted the per-scope
  prompt that composes python-architect + doc-writer.
- assets/panelist-return-schema.json: explicit JSON schema for
  agent returns; orchestrator validates and rejects malformed.
- scripts/scan-cross-corpus-drift.sh: deterministic cross-corpus
  drift sweep with 4 pattern groups (ia-links, stale-deprecation,
  absolute-base, ascii-leak). Non-interactive, --help-documented,
  stdout/stderr split per genesis script conventions.
- evals/{trigger,content}-evals.json + README.md: ship gate
  exercising 10+10 trigger queries (docs-sync boundary is the
  load-bearing distinction) and 3 seeded-drift scenarios with
  control baselines.
- Deleted references/design-handoff.md (bundle leak; design
  artifact stays in session state only).
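
For readers who want a feel for the deterministic sweep, here is a rough Python sketch of a pattern-group scanner. It is not the shipped scan-cross-corpus-drift.sh (a shell script whose patterns are not shown here); the regexes are hypothetical, modeled on the dead links and stale strings described in these commits, and the ascii-leak group is omitted because its semantics are not described.

```python
import re

# Hypothetical patterns, one per group; the real script's patterns may differ.
PATTERN_GROUPS = {
    # Links to pages that moved in the IA reshuffle.
    "ia-links": re.compile(r"guides/(ci-policy-setup|pack-distribute|compilation)/"),
    # Deprecation notices that name a version which has already shipped.
    "stale-deprecation": re.compile(r"remov(ed|al) in v0\.14"),
    # Markdown links using an absolute /apm/ base instead of a relative path.
    "absolute-base": re.compile(r"\]\(/apm/"),
}


def scan_text(path: str, text: str) -> list[tuple[str, int, str]]:
    """Return (group, line_number, path) for every pattern hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for group, pattern in PATTERN_GROUPS.items():
            if pattern.search(line):
                findings.append((group, lineno, path))
    return findings
```

Because the scan runs over the whole corpus rather than one agent's page scope, it can catch sibling strings that a per-scope agent never sees.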

Self-test (proves the refactor works end-to-end):
- Ran scan-cross-corpus-drift.sh against the live corpus; it
  immediately surfaced two genuine stale strings that wave 3 missed:
  - src/apm_cli/commands/pack.py:606: click help= string still
    said 'removed in v0.14' (the logger.warning at line 633 was
    fixed last commit; this is a sibling string the wave 3 agent
    didn't see because each agent only owned ~9 pages).
  - docs/src/content/docs/reference/cli/unpack.md:9: caution
    banner still said 'scheduled for removal in v0.14'.
- Both softened to 'in a future release' (consistent with the
  wording chosen in wave 3).
- Lint clean; docs build clean ('All internal links are valid').

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* skill(docs-grounding-verifier): claim-level grounding harness + 7 drift fixes

New sibling skill to docs-corpus-audit. Genesis-designed PIPELINE-of-PANELS
(RAGAS-faithfulness adapted from RAG to docs/code):
- Stage 1: per-page LLM claim extraction
- Stage 2: deterministic grep-based evidence retrieval (S7, no LLM)
- Stage 3: adversarial LLM grounding judge (A7, 4-verdict calibrated)

Empirical proof bundle (.apm/skills/docs-grounding-verifier/evals/runs/proof/):
- 5 high-stakes pages -> 75 atomic claims extracted
- Tally: 63 GROUNDED / 6 PARTIAL / 4 CONTRADICTED / 2 UNSUPPORTED (84%)
- Trigger eval: 20/20 dispatch classification correct
  (precision=1.0, recall=1.0, specificity=1.0, pass_gate=true)
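
The figures in the proof bundle can be re-derived with a few lines of arithmetic. The sketch below is only a check on the reported numbers; the 10 positive / 10 negative split of the 20 trigger queries is an assumption (the message reports 20/20 correct without giving the split), and classification_metrics is a hypothetical helper.

```python
# Verdict tally as reported in the commit message.
tally = {"GROUNDED": 63, "PARTIAL": 6, "CONTRADICTED": 4, "UNSUPPORTED": 2}

total_claims = sum(tally.values())
grounded_rate = tally["GROUNDED"] / total_claims


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, specificity) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return precision, recall, specificity


# Assumption: 10 should-trigger and 10 should-not-trigger queries, all correct.
precision, recall, specificity = classification_metrics(tp=10, fp=0, tn=10, fn=0)
```

The four verdicts sum to the 75 extracted claims, and 63 of 75 is the reported 84%. With no misclassified queries, all three dispatch metrics come out at 1.0 for any split that has at least one query in each class.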

High-confidence drift fixes applied:
- apm-policy.md: MCP transport defaults (was 'block sse/streamable-http
  by default' -> actually allow=None means all permitted; sample policy
  now correctly framed as restriction example)
- apm-policy.md: inheritance levels (was '5 levels including team policy'
  -> canonical chain is 3 semantic levels; 5 is MAX_CHAIN_DEPTH for
  intermediate extends: jumps)
- Plus 5 editorial fixes from prior pass (examples, registries x2,
  security, copilot-app)

Lower-confidence findings (judge retrieval gaps, vague reasoning) left
for follow-up rather than risk introducing new drift via speculative
edits.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

[BUG] apm compile --watch regenerates GEMINI.md when targets is [claude, cursor] — regression of #1019 in watch code path

4 participants