Investigate converting ShapeStream to explicit state machine #3785

## Summary

Recent bugs in ShapeStream have been caused by implicit state machine complexity. Both [#3773](https://github.com/electric-sql/electric/pull/3773) (infinite loop in replay mode) and the stale cache offset update bug share a common root cause: the client maintains many interdependent state variables that are updated across scattered code paths, making it easy to forget to update related state when taking a specific branch.

## The Problem

ShapeStream currently maintains **20+ private state variables** that form an implicit state machine, including:

**Core sync state:**
- `#shapeHandle`, `#lastOffset`, `#liveCacheBuster`, `#schema`

**Connection state:**
- `#connected`, `#started`, `#state`, `#isUpToDate`, `#isMidStream`, `#lastSyncedAt`

**Stale cache handling:**
- `#staleCacheRetryCount`, `#staleCacheBuster`, `#lastSeenCursor`

**SSE handling:**
- `#sseFallbackToLongPolling`, `#consecutiveShortSseConnections`, `#lastSseConnectionStartTime`

These variables are updated across many methods (`#onInitialResponse`, `#onMessages`, `#requestShape`, etc.), creating an implicit state machine where:

1. **State transitions are scattered** - no single place defines what variables change together
2. **Valid state combinations are implicit** - easy to create invalid states
3. **Invariants are maintained manually** - easy to update one variable but forget another
4. **Testing requires mocking internals** - can't test state machine logic in isolation

## Bug Pattern

Both recent bugs followed the same pattern:

**PR #3773 (Replay mode infinite loop):**
- Code takes early `return` when cursor matches
- `#lastSeenCursor` wasn't cleared → replay mode never exits → infinite loop

**Stale cache offset bug:**
- Code logs warning and continues when stale response detected with existing handle
- `#lastOffset` was updated from stale response → handle/offset mismatch → server errors

In both cases, **an early return or branch left related state variables inconsistent.**
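
The replay-mode variant of this pattern can be shown in miniature. The sketch below is a simplified model, not the actual ShapeStream code: `ReplayTracker`, `onUpToDateBuggy`, and `onUpToDateFixed` are hypothetical names, and it assumes replay mode is represented by `lastSeenCursor` being set.

```typescript
// Hypothetical miniature of the bug pattern; not the real ShapeStream logic.
class ReplayTracker {
  lastSeenCursor: string | undefined
  upToDate = false

  constructor(cursor: string) {
    // Assumption: a defined cursor means "in replay mode"
    this.lastSeenCursor = cursor
  }

  get inReplayMode(): boolean {
    return this.lastSeenCursor !== undefined
  }

  // Buggy: the early return skips the cleanup below it.
  onUpToDateBuggy(cursor: string): void {
    if (this.inReplayMode && cursor === this.lastSeenCursor) {
      return // lastSeenCursor stays set, so replay mode never exits
    }
    this.lastSeenCursor = undefined
    this.upToDate = true
  }

  // Fixed: every branch updates the related variable.
  onUpToDateFixed(cursor: string): void {
    if (this.inReplayMode && cursor === this.lastSeenCursor) {
      this.lastSeenCursor = undefined
      return
    }
    this.lastSeenCursor = undefined
    this.upToDate = true
  }
}
```

Nothing in the type system distinguishes the two methods, which is why the missing assignment is easy to overlook in review.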

## Proposed Solution

Investigate converting the implicit state to an explicit state machine:

### Option A: Discriminated Union States

```typescript
type SyncState = 
  | { phase: 'initial'; offset: '-1' }
  | { phase: 'syncing'; handle: string; offset: Offset; schema: Schema }
  | { phase: 'live'; handle: string; offset: Offset; schema: Schema; cursor: string }
  | { phase: 'stale-retry'; handle: string; offset: Offset; retryCount: number; cacheBuster: string }
  | { phase: 'paused'; handle: string; offset: Offset }

// Transitions are explicit functions that return the next state
function handleResponse(state: SyncState, response: Response): SyncState {
  // All related state changes happen together.
  // With an exhaustive switch on state.phase, TypeScript flags unhandled phases.
  return state // placeholder: real transitions go here
}
```
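
To make the idea concrete, here is a minimal self-contained sketch covering three of the phases. Several parts are assumptions for illustration only: `Offset` and `Schema` are simplified stand-ins for the client's real types, `ResponseInfo` and `nextFromResponse` are hypothetical, and the rule that a response for a different handle is ignored is a simplification rather than the actual protocol behavior.

```typescript
// Assumption: simplified stand-ins for the client's real types.
type Offset = string
type Schema = Record<string, unknown>

type SyncState =
  | { phase: 'initial'; offset: '-1' }
  | { phase: 'syncing'; handle: string; offset: Offset; schema: Schema }
  | { phase: 'live'; handle: string; offset: Offset; schema: Schema; cursor: string }

// Hypothetical: the fields parsed out of a single response.
interface ResponseInfo {
  handle: string
  offset: Offset
  schema: Schema
  cursor?: string
  upToDate: boolean
}

// Builds the next state so handle, offset, and schema always change together.
function nextFromResponse(res: ResponseInfo): SyncState {
  const { handle, offset, schema, cursor } = res
  if (res.upToDate && cursor !== undefined) {
    return { phase: 'live', handle, offset, schema, cursor }
  }
  return { phase: 'syncing', handle, offset, schema }
}

function handleResponse(state: SyncState, res: ResponseInfo): SyncState {
  switch (state.phase) {
    case 'initial':
      return nextFromResponse(res)
    case 'syncing':
    case 'live':
      // Assumed rule: a response for another handle leaves state untouched,
      // so a stale offset can never be paired with the current handle.
      return res.handle === state.handle ? nextFromResponse(res) : state
    default: {
      // Compile-time exhaustiveness check: adding a phase breaks this line.
      const unreachable: never = state
      return unreachable
    }
  }
}
```

Because `handleResponse` is a pure function, it can be exercised without any network mocks by feeding it a state and a `ResponseInfo` and inspecting the result.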

### Option B: XState or Similar

Use a formal state machine library for complex transitions:
- Visual state charts for documentation
- Built-in guards and actions
- Automatic testing of valid transitions

### Option C: State Reducer Pattern

Centralize state updates through a reducer:

```typescript
type StateAction = 
  | { type: 'RESPONSE_RECEIVED'; handle: string; offset: Offset; ... }
  | { type: 'STALE_RESPONSE_IGNORED' }  // No state changes!
  | { type: 'ENTER_REPLAY_MODE'; cursor: string }
  | { type: 'EXIT_REPLAY_MODE' }

function reduce(state: State, action: StateAction): State {
  // Single source of truth for state transitions
}
```
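
A minimal runnable version of this reducer might look like the sketch below. The `State` shape, the `replayCursor` field, and the `Offset` alias are assumptions made for illustration, and only the `handle` and `offset` fields of `RESPONSE_RECEIVED` are modeled.

```typescript
// Assumption: simplified stand-in for the client's real Offset type.
type Offset = string

// Hypothetical state shape covering only handle/offset and replay mode.
interface State {
  handle: string | undefined
  offset: Offset
  replayCursor: string | undefined
}

type StateAction =
  | { type: 'RESPONSE_RECEIVED'; handle: string; offset: Offset }
  | { type: 'STALE_RESPONSE_IGNORED' }
  | { type: 'ENTER_REPLAY_MODE'; cursor: string }
  | { type: 'EXIT_REPLAY_MODE' }

function reduce(state: State, action: StateAction): State {
  switch (action.type) {
    case 'RESPONSE_RECEIVED':
      // Handle and offset are always replaced as a pair.
      return { ...state, handle: action.handle, offset: action.offset }
    case 'STALE_RESPONSE_IGNORED':
      // Returning the same object makes "no state changes" explicit.
      return state
    case 'ENTER_REPLAY_MODE':
      return { ...state, replayCursor: action.cursor }
    case 'EXIT_REPLAY_MODE':
      return { ...state, replayCursor: undefined }
    default: {
      // Compile-time exhaustiveness check for new action types.
      const unreachable: never = action
      return unreachable
    }
  }
}
```

With this shape, the two recent bugs would correspond to dispatching `STALE_RESPONSE_IGNORED` instead of `RESPONSE_RECEIVED`, and to always dispatching `EXIT_REPLAY_MODE` on the early-return path.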

## Benefits

1. **Bugs become obvious** - forgetting to handle state in a transition is a type error
2. **Testable in isolation** - state machine can be unit tested without network mocks
3. **Self-documenting** - state types and transitions document valid states
4. **Easier reviews** - PRs show explicit state transition changes

## Scope

Start with the most bug-prone areas:
1. Stale cache detection and retry logic
2. Replay mode (cursor tracking)
3. Handle/offset consistency

Later extend to:
- SSE fallback logic
- Pause/resume
- Error recovery

## Questions to Answer

1. Is the complexity worth it for a client library?
2. Which approach fits best with the existing codebase?
3. Should we use a library (XState) or roll our own?
4. Can we migrate incrementally or is it all-or-nothing?

## Related

- #3773 - Fix infinite loop in replay mode when CDN returns same cursor
- Current branch: `claude/investigate-caching-bug-Rcwin` - Fix stale cache offset update bug
