Enhancement: promote durable lessons onto enforced surfaces (close the capture→behavior gap)

## Summary

Compounding engineering only compounds if a captured lesson reliably becomes
*changed behavior*. Today the plugin mechanizes every step of the knowledge
pipeline **except** that last one:

- **Capture** is mechanized — `ce:compound` writes a structured lesson to `docs/solutions/`.
- **Maintenance** is mechanized — `ce:compound-refresh` keeps/updates/replaces/archives within `docs/solutions/`.
- **Retrieval** is mechanized — `learnings-researcher` feeds past lessons into `ce:plan` / `ce:review`.
- **Promotion to enforced behavior** is *operator-judgment-only.* Nothing decides whether a lesson is durable enough to stop being a doc you might re-read and start being a rule that fires whether or not anyone re-reads it.

The result is an asymmetry: the automation arc runs capture → maintain →
retrieve, then stops at the surface where lessons stay *retrieval-dependent*. A
lesson only changes behavior if an agent happens to retrieve it and chooses to
apply it. For lessons that should be non-optional, that is the wrong failure
mode.

The plugin already touches the memory surface in one direction: `ce:compound`'s
**Phase 0.5 auto-memory scan** reads `MEMORY.md` as *context* for the writeup.
This proposal extends that adjacency the other way — let `ce:compound` also route
durable lessons *out* toward memory and other enforcement rungs.

## The proposal: a promotion/escalation step

Add one step to the compound flow. After `ce:compound` writes the lesson, a
**promotion triage** judges how durably the behavior change must hold, then
routes the lesson to **one rung** of an enforcement ladder (the ladder below is
illustrative — the contribution is the *triage step*, not this exact ordering),
increasing in enforcement strength and decreasing in retrieval-dependence:

```
captured lesson (docs/solutions/)
        │  promotion triage: how durable must this behavior change be?
        ▼
  rung 1  memory / MEMORY.md pointer        — recalled, still optional
  rung 2  always-loaded rule (CLAUDE.md/AGENTS.md) — read every session
  rung 3  hook (pre/post tool-use)          — fires mechanically
  rung 4  automated/contract test           — fails the build if violated
  rung 5  skill                             — becomes a reusable capability
```

The default for most lessons stays "leave it in `docs/solutions/`" — promotion is
the exception, not the rule. The triage answers a single question: *is this
lesson durable/recurrent enough that retrieval-dependence is an unacceptable
failure mode?* If yes, it picks the lowest rung that removes that risk.

### The triage heuristic: recurrence is the strongest promote-signal

The hard part is deciding *what* to promote without promoting everything (which
would bloat the enforced surfaces and erode adherence). The signal we found most
reliable is **recurrence** — a lesson that has bitten more than once — with
blast-radius as a secondary signal. Recurrence is a better trigger than severity
or blast-radius alone because it is *observed*, not predicted: it distinguishes
"this will keep happening" from "this looked scary once."

This heuristic is earned by evidence, not asserted — see below.

## Evidence: worked instances (both halves of the heuristic)

From running this loop in a personal multi-repo fleet (CE across several Active
Projects, prototype living in shared always-loaded agent guidance):

1. **Promote (loop closed).** A recurring lesson — *"registered / imported /
   tested" proves a tool's wiring EXISTS, not that it adds VALUE; don't
   rubber-stamp wired-up things when culling* — kept biting across cull rounds.
   We promoted it to a memory + an always-loaded rule (rungs 1–2). The very next
   round, the agent's behavior changed *without anyone re-reading the doc*: a
   pass that had culled 0 tools on a load-bearing-only test re-vetted to 6 culls.
   Recurrence triggered promotion; promotion closed the loop.
1. **Withhold (loop correctly left open).** A different lesson — a
   symlink-assumption bug in a harness-retirement script (it treated a path as a
   real directory when it was a symlink) — was captured but had **not yet
   recurred**. The recurrence heuristic correctly *deferred* promotion: it sits
   at rung 1 (memory) on a wait-and-see basis. If it recurs it escalates to an
   enforced rung (e.g. a test); if it never does, it never costs enforcement.
   This is the heuristic correctly *withholding*.

Together these show the trigger working in both directions — promoting what has
proven recurrent, withholding what has not.

## A minimal forcing function (so promotion isn't silently skipped)

Promotion is only valuable if it actually happens and actually lands. The following additions
make that auditable without building a telemetry subsystem:

- A **minimal index** of promoted lessons — enough fields to answer "what's
  promoted, to which rung, and when was it last checked": roughly *captured-date,
  rung, status, last_reviewed.* (Exact schema intentionally left open.)
- A **periodic effectiveness retrospective** — a passive nudge to review whether
  each promoted change actually realized in behavior, and to escalate /
  de-escalate / cull the rung accordingly. This needs *some* forcing function;
  the right cadence depends on a project's lesson velocity, so it should be
  user-tunable rather than fixed. (Our fleet uses a hybrid "N-new-lessons OR
  M-days, whichever first" trigger as one data point — not a recommendation.)

This is a retrospective, not active monitoring — the loop is closed by a periodic
audit, not by watching every lesson continuously.

## Where it fits (recommendation + design space)

We recommend **a new phase in `ce:compound`, immediately after the lesson is
written** — context is freshest there, and the triage is a natural continuation
of capture. Alternatives in the design space:

- A step inside `ce:compound-refresh` (promotion as a maintenance action
  alongside keep/update/replace/archive).
- A dedicated sibling skill (`ce:promote-learning`) invoked after compound.

We lean toward the in-`ce:compound` phase because our own prototype lives as the
rung-2 outcome (an always-loaded rule), which is exactly "promote while the
lesson is fresh."

## Prototype / prior art (ours)

We run this today as a written workflow definition ("promotion triage → one rung
of {memory · always-loaded rule · hook · contract test · skill}") in our fleet's
always-loaded agent guidance — i.e. the design itself sits at rung 2 of its own
ladder. It is definition-first; the index + retrospective mechanization is in
progress. We're offering the shape as a working spec, not asking the plugin to
adopt our file layout.

## Relation to prior issues

- **#171 (Closed-Loop Self-Improvement, closed `NOT_PLANNED`)** — #171 proposed a
  broad self-improving *subsystem*: auto-capture, agent scoring, execution
  telemetry, prompt evolution, dashboards. **This proposal is deliberately not
  that.** No telemetry, no scoring, no dashboards, no auto-tuning. More
  importantly, every loop-closing mechanism in #171 (auto-capture + retrieval +
  review-agent prompt-tuning) leaves a lesson in the *retrieval-dependent* layer —
  it makes capture cheaper and retrieval likelier, but a durable lesson still
  only changes behavior if re-read. The promotion-to-enforcement rung is the one
  thing #171 did not cover, and it is a single node, not a subsystem — which is
  also why it may be a more tractable scope than #171 was.
- **#494 (Cross-project knowledge sharing)** and **#111 (docs/solutions schema
  scaffolding + agentic feedback)** — both improve the `docs/solutions/` layer
  (sharing it across repos; bootstrapping its schema). Orthogonal and
  complementary: this proposal is about lessons *graduating out of* that layer
  onto enforced surfaces, not about the layer itself.

## Scope / non-goals

- **One node, not a subsystem.** A triage step + a minimal index + a
  retrospective nudge. Explicitly not #171's telemetry/scoring/auto-tuning.
- **Promotion is the exception.** Most lessons stay in `docs/solutions/`; the
  default is not to promote.
- **The triage is a judgment node**, performed by the agent/operator at compound
  time — not a fully automated classifier. Recurrence is the headline signal, not
  a hard rule.
- **No fixed cadence or schema mandated.** The index fields and retrospective
  cadence are left to the implementation / user configuration.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhancement: promote durable lessons onto enforced surfaces (close the capture→behavior gap) #866

Summary

The proposal: a promotion/escalation step

The triage heuristic: recurrence is the strongest promote-signal

Evidence: worked instances (both halves of the heuristic)

A minimal forcing function (so promotion isn't silently skipped)

Where it fits (recommendation + design space)

Prototype / prior art (ours)

Relation to prior issues

Scope / non-goals

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Enhancement: promote durable lessons onto enforced surfaces (close the capture→behavior gap) #866

Description

Summary

The proposal: a promotion/escalation step

The triage heuristic: recurrence is the strongest promote-signal

Evidence: worked instances (both halves of the heuristic)

A minimal forcing function (so promotion isn't silently skipped)

Where it fits (recommendation + design space)

Prototype / prior art (ours)

Relation to prior issues

Scope / non-goals

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions