[DX] GAP-18: MemorySaver + --state-file accumulates state across runs causing silent data duplication #1507

## Summary

When the same `--state-file` is reused across runs (e.g. during development testing), `MemorySaver` restores the state saved by the previous run's checkpoint. A `results` list initialized with `state.get("results") or []` therefore carries over from the previous run, and each subsequent run appends to it, producing duplicate entries without any warning.

## Root Cause

`MemorySaver` + `--state-file` persists the full graph state between process invocations. If a LangGraph state field (like `results: list`) is initialized using `state.get("results") or []`, it inherits the accumulated list from the previous run's checkpoint rather than starting fresh.

The runtime does not inject a fresh `thread_id` per invocation, so all runs share the same checkpoint namespace when the same `--state-file` is used.
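The mechanism can be reproduced without the runtime. The sketch below is a stand-in, not the actual `MemorySaver` or `--state-file` implementation: it assumes the state file behaves like a JSON map from `thread_id` to saved state, and `run_batch`, `load_checkpoints`, and `DEFAULT_THREAD` are hypothetical names chosen for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical: stands in for a thread_id that is the same on every invocation.
DEFAULT_THREAD = "default"


def load_checkpoints(path: Path) -> dict:
    """Read the persisted checkpoints, or start empty if the file is missing."""
    return json.loads(path.read_text()) if path.exists() else {}


def run_batch(state_file: Path, invoices: list, thread_id: str = DEFAULT_THREAD) -> list:
    """Simulate one process invocation that restores, extends, and saves state."""
    checkpoints = load_checkpoints(state_file)
    state = checkpoints.get(thread_id, {})

    # The problematic pattern: this inherits the previous run's list.
    results = state.get("results") or []
    results = results + [f"processed:{inv}" for inv in invoices]

    checkpoints[thread_id] = {"results": results}
    state_file.write_text(json.dumps(checkpoints))
    return results


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "state.json"
    first = run_batch(state_file, ["A", "B", "C"])
    second = run_batch(state_file, ["D", "E", "F"])
    # A distinct thread_id does not see the earlier runs' state.
    isolated = run_batch(state_file, ["G", "H", "I"], thread_id="run-3")
    print(len(first), len(second), len(isolated))  # prints: 3 6 3
```

The second call sees six entries because it shares a `thread_id` with the first, while the third call starts from an empty state only because its `thread_id` differs.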

## Observed Behaviour

1. Run agent against 3 invoices → `results` list has 3 entries
2. Run agent again against 3 new invoices (same state file) → `results` list has 6 entries (3 old + 3 new)
3. Each `--resume` adds another copy

## Workaround

Explicitly reset batch-scoped fields at the start of each logical run instead of initializing them with `state.get("field") or []`:

```python
def init_batch(state: InvoiceState) -> dict:
    return {
        "results": [],          # Always start fresh, ignore checkpoint
        "pending_invoices": state["pending_invoices"],
    }
```

Also delete the state file between independent runs during local development.
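For local development, the state-file cleanup could be scripted rather than done by hand. This is a minimal sketch, assuming the state file is a regular file on disk; `reset_state_file` is a hypothetical helper, not part of the CLI, and the path used here is a temporary file created only for the demonstration.

```python
import tempfile
from pathlib import Path


def reset_state_file(path: str) -> bool:
    """Delete the state file if present; return True if a file was removed."""
    state_path = Path(path)
    existed = state_path.exists()
    # missing_ok avoids an error when there is nothing to delete (Python 3.8+).
    state_path.unlink(missing_ok=True)
    return existed


# Demonstration against a throwaway file standing in for the real state file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = str(Path(tmp) / "state.json")
    Path(demo_path).write_text("{}")

    removed_first = reset_state_file(demo_path)   # file existed and is deleted
    removed_second = reset_state_file(demo_path)  # nothing left to delete
    still_there = Path(demo_path).exists()
```

Calling such a helper from a test fixture or a dev script before each independent run keeps old checkpoints from leaking into the next one.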

## Suggested Fix

1. Document that `MemorySaver` + `--state-file` accumulates state across process invocations and that batch-scoped fields must be explicitly reset.
2. Consider injecting a fresh `thread_id` per invocation (unless `--resume` is explicitly passed), so checkpoints from different runs don't share state by default.
3. Add a CLI flag `--new-run` or `--reset-state` that starts a fresh `thread_id` even when `--state-file` is present.
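The second suggestion could look roughly like the following. This is only a sketch of the decision logic, not the runtime's actual code: `resolve_thread_id` and `stored_thread_id` are hypothetical names, and falling back to a fresh id when there is nothing to resume is one possible choice among several.

```python
import uuid
from typing import Optional


def resolve_thread_id(stored_thread_id: Optional[str], resume: bool = False) -> str:
    """Pick the thread_id for this invocation.

    stored_thread_id: the id recorded by an earlier run, if any (assumed input).
    resume: True when the user explicitly asked to continue the earlier run.
    """
    if resume and stored_thread_id is not None:
        # Explicit resume: keep the old checkpoint namespace.
        return stored_thread_id
    # Default: a fresh namespace, so earlier checkpoints are not picked up.
    return uuid.uuid4().hex


resumed = resolve_thread_id("abc123", resume=True)
fresh_a = resolve_thread_id("abc123")
fresh_b = resolve_thread_id("abc123")
nothing_to_resume = resolve_thread_id(None, resume=True)
```

With this default, a flag such as `--new-run` would only be needed if the fresh-by-default behaviour were not adopted.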

## Impact

- Severity: **Medium**
- Silent data duplication that is very hard to diagnose
- Particularly insidious during development when the same state file is reused across iterations

