Consolidation writes the model's conversational replies into memory files as content, and destroys the source staging files
Plugin version: 0.7.3 (current main)
Files: pipeline/consolidate.py, scripts/run-consolidation.sh, prompts/consolidate-staging.prompt.txt
Summary
When the consolidation Haiku call returns anything other than the expected ===RECENT===/===ARCHIVE=== envelope — e.g. a refusal, a clarifying question, or a "here's what I'd compress…" preamble — the pipeline writes that conversational text verbatim into recent.md and then renames the source today-*.md staging files to .done.md. The result is a memory file that looks structurally valid (it gets a # Recent header) but contains chat prose instead of memory, and the original daily entries are consumed so a re-run can't recover them.
I hit this across ~24 files in a multi-project setup during a bulk operation that triggered many consolidation runs in a short window. Example of a corrupted recent.md:
# Recent
I cannot complete this compression task. The input is incomplete:
1. Missing recent.md content — ...
2. Missing archive.md content — ...
and a corrupted staging today-*.md:
I don't see a specific task or question in your message — you've provided ...
What would you like help with?
Root cause (two layers)
1. Parser accepts non-conforming output as content — pipeline/consolidate.py:
elif "===RECENT===" in text:
    recent = text.replace("===RECENT===", "").strip()
else:
    # Fallback: treat entire response as recent   <-- chatter becomes "recent"
    recent = text.strip()
if recent and not recent.startswith("# Recent"):
    recent = "# Recent\n\n" + recent   # <-- header makes it look valid
A reply with no ===RECENT=== delimiter is exactly what a refusal/question looks like, yet it is accepted and written.
2. Shell writes unconditionally and consumes the source — scripts/run-consolidation.sh:
cp "$RECENT_OUT" "$RECENT_FILE" # writes whatever the parser produced
cp "$ARCHIVE_OUT" "$ARCHIVE_FILE"
...
mv "$staging_path" "${staging_path%.md}.done.md" # source renamed = unrecoverable
There is no validation gate between "Haiku returned something" and "overwrite the long-lived memory file + retire the source."
Aggravating: pipeline/haiku.py already computes is_skip (text.strip().upper().startswith("SKIP")), and the save path honors a SKIP convention — but consolidate() ignores is_skip, and prompts/consolidate-staging.prompt.txt never tells the model it may emit SKIP for unusable input. So there is no defined way for the model to decline, and no guard if it declines anyway.
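To illustrate why honoring is_skip alone would not close the hole, here is a minimal sketch. It wraps the predicate quoted above in a standalone function (the wrapper is my stand-in for illustration, not the actual code in pipeline/haiku.py) and runs it over a literal SKIP plus two lines taken from the corrupted files shown earlier.

```python
def is_skip(text: str) -> bool:
    # Predicate as quoted in this issue; the function wrapper is hypothetical.
    return text.strip().upper().startswith("SKIP")


# One conforming SKIP reply and two chatty replies from the corrupted files.
replies = [
    "SKIP",
    "I cannot complete this compression task. The input is incomplete:",
    "What would you like help with?",
]

for reply in replies:
    # Only the literal SKIP reply is flagged; the refusals are not.
    print(is_skip(reply), "|", reply)
```

The refusals do not start with SKIP, so a guard built only on is_skip would still let them through. That is why the fix below pairs the SKIP convention with a structural check on the envelope.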
Impact
- Silent memory corruption: recent.md / archive.md / today-*.md filled with chat text.
- Data loss: the source staging files are renamed .done.md, so the daily entries that should have been compressed are gone; a re-run cannot recover them.
- Failure that looks like success: the run still logs done: N files consolidated.
- Higher risk during bursts of consolidation runs (bulk operations, many projects).
Suggested fix
Validate the model output before it is allowed to overwrite memory, and never retire the source unless the write was accepted. Specifically:
- Give the model an escape hatch in consolidate-staging.prompt.txt: "If the input is empty, malformed, or you cannot produce the exact envelope, output exactly SKIP and nothing else."
- Validate in consolidate(): reject output for which is_skip is true, or that lacks both the ===RECENT=== delimiter and any ## entry line. On rejection, signal "no usable result" instead of falling through to treat chatter as recent. Remove the else: recent = text.strip() fallback.
- Gate the shell write in run-consolidation.sh: only cp to the live files and only rename staging → .done.md when the pipeline reports a validated success. On skip/failure: log, leave memory files untouched, and leave staging files un-renamed so the next run retries.
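The validation and gating steps above could look roughly like the sketch below. It is not the patch itself: validate_response and commit are hypothetical names, the envelope parsing is simplified, the gate is written in Python rather than shell to keep it self-contained, and the sample entry text is made up.

```python
from pathlib import Path
from typing import Optional, Sequence, Tuple

RECENT_MARK = "===RECENT==="
ARCHIVE_MARK = "===ARCHIVE==="


def validate_response(text: str) -> Optional[Tuple[str, str]]:
    """Return (recent, archive) for a usable reply, or None to reject it."""
    stripped = text.strip()
    # Reject empty replies and the SKIP escape hatch.
    if not stripped or stripped.upper().startswith("SKIP"):
        return None
    has_entry = any(line.startswith("## ") for line in stripped.splitlines())
    # Reject replies with neither the delimiter nor any "## " entry line.
    if RECENT_MARK not in stripped and not has_entry:
        return None
    # Assumption: anything before ===RECENT=== is preamble and is dropped.
    _, found, after = stripped.partition(RECENT_MARK)
    body = after if found else stripped
    recent, _, archive = body.partition(ARCHIVE_MARK)
    return recent.strip(), archive.strip()


def commit(
    result: Optional[Tuple[str, str]],
    recent_file: Path,
    archive_file: Path,
    staging: Sequence[Path],
) -> bool:
    """Write memory files and retire staging only for a validated result."""
    if result is None:
        # Leave memory and staging untouched so the next run can retry.
        return False
    recent, archive = result
    recent_file.write_text(recent + "\n", encoding="utf-8")
    if archive:
        # Assumption: an empty archive section must not clobber archive.md.
        archive_file.write_text(archive + "\n", encoding="utf-8")
    for path in staging:
        # today-x.md -> today-x.done.md, mirroring the shell rename.
        path.rename(path.with_name(path.stem + ".done.md"))
    return True


chatter = "I don't see a task here. What would you like me to compress?"
envelope = "===RECENT===\n# Recent\n\n## entry\n- note\n===ARCHIVE===\n# Archive"
print(validate_response(chatter))   # None: rejected, nothing would be written
print(validate_response(envelope))  # ('# Recent\n\n## entry\n- note', '# Archive')
```

One limitation of the "## entry line" rule as sketched: a chatty reply that happens to contain a markdown `## ` heading would still pass, so requiring the delimiter outright may be the stricter choice.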
I have a patch + tests for all three and am happy to open a PR. Related: #65 (first-backup docs) and #69 (consolidation data-integrity) touch nearby ground but not this fallback path.
Repro sketch
from pipeline.consolidate import parse_consolidation_response
# Model declined / asked a question instead of emitting the envelope:
r, a = parse_consolidation_response("I don't see a task here. What would you like me to compress?")
print(repr(r))
# Today prints: "# Recent\n\nI don't see a task here. What would you like me to compress?"
# i.e. chatter becomes the new recent.md content. Expected: this should be
# rejected (no delimiter, no `## ` entry) rather than written as memory.