Summary
Recent bugs in ShapeStream have been caused by implicit state machine complexity. Both #3773 (infinite loop in replay mode) and the stale cache offset update bug share a common root cause: the client maintains many interdependent state variables that are updated across scattered code paths, making it easy to forget to update related state when taking a specific branch.
The Problem
ShapeStream currently maintains 20+ private state variables that form an implicit state machine:
Core sync state:
#shapeHandle, #lastOffset, #liveCacheBuster, #schema
Connection state:
#connected, #started, #state, #isUpToDate, #isMidStream, #lastSyncedAt
Stale cache handling:
#staleCacheRetryCount, #staleCacheBuster, #lastSeenCursor
SSE handling:
#sseFallbackToLongPolling, #consecutiveShortSseConnections, #lastSseConnectionStartTime
These variables are updated across many methods (#onInitialResponse, #onMessages, #requestShape, etc.), creating an implicit state machine where:
- State transitions are scattered - no single place defines what variables change together
- Valid state combinations are implicit - easy to create invalid states
- Invariants are maintained manually - easy to update one variable but forget another
- Testing requires mocking internals - can't test state machine logic in isolation
Bug Pattern
Both recent bugs followed the same pattern:
PR #3773 (Replay mode infinite loop):
- Code takes early return when cursor matches
- #lastSeenCursor wasn't cleared → replay mode never exits → infinite loop
Stale cache offset bug:
- Code logs warning and continues when stale response detected with existing handle
- #lastOffset was updated from stale response → handle/offset mismatch → server errors
In both bugs, an early return or branch did not account for all of the related state variables.
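The pattern can be shown with a deliberately simplified toy model. This is not the actual ShapeStream code: the `ReplayTracker` class and its method names are hypothetical, and the real replay logic involves more state. It only shows how an early return can skip a reset that another branch performs.

```typescript
// Toy model of the bug pattern; names and logic are illustrative only.
class ReplayTracker {
  private lastSeenCursor: string | undefined

  constructor(cachedCursor: string | undefined) {
    this.lastSeenCursor = cachedCursor
  }

  // Replay mode is implied by another variable being set.
  get inReplayMode(): boolean {
    return this.lastSeenCursor !== undefined
  }

  // Buggy variant: the early return skips the reset at the bottom.
  onUpToDateBuggy(cursor: string): boolean {
    if (cursor === this.lastSeenCursor) {
      return true // replayed response detected, but replay mode is never exited
    }
    this.lastSeenCursor = undefined
    return false
  }

  // Fixed variant: the reset happens on every path.
  onUpToDateFixed(cursor: string): boolean {
    const isReplayed = cursor === this.lastSeenCursor
    this.lastSeenCursor = undefined
    return isReplayed
  }
}
```

With the buggy variant, a tracker created with cursor `'c1'` stays in replay mode after seeing `'c1'` again; with the fixed variant it leaves replay mode.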
Proposed Solution
Investigate converting the implicit state to an explicit state machine:
Option A: Discriminated Union States
type SyncState =
  | { phase: 'initial'; offset: '-1' }
  | { phase: 'syncing'; handle: string; offset: Offset; schema: Schema }
  | { phase: 'live'; handle: string; offset: Offset; schema: Schema; cursor: string }
  | { phase: 'stale-retry'; handle: string; offset: Offset; retryCount: number; cacheBuster: string }
  | { phase: 'paused'; handle: string; offset: Offset }
// Transitions are explicit functions that return new state
function handleResponse(state: SyncState, response: Response): SyncState {
  // All related state changes happen together
  // TypeScript ensures we handle all cases
}
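A minimal sketch of what such a transition function might look like, limited to three phases for brevity. The `Offset` and `Schema` aliases are simplified stand-ins, and `SyncEvent`, `transition`, and `assertNever` are hypothetical names introduced for illustration; none of them reflect the actual client internals. The `never` check in the default branch is what turns an unhandled phase into a compile error.

```typescript
// Simplified stand-ins; the real Offset and Schema types may differ.
type Offset = string
type Schema = Record<string, unknown>

type SyncState =
  | { phase: 'initial'; offset: '-1' }
  | { phase: 'syncing'; handle: string; offset: Offset; schema: Schema }
  | { phase: 'live'; handle: string; offset: Offset; schema: Schema; cursor: string }

// Hypothetical event shape, for illustration only.
type SyncEvent =
  | { type: 'messages'; handle: string; offset: Offset; schema: Schema }
  | { type: 'up-to-date'; cursor: string }

// Compile-time exhaustiveness check: only callable when all cases are handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled case: ${JSON.stringify(value)}`)
}

function transition(state: SyncState, event: SyncEvent): SyncState {
  switch (state.phase) {
    case 'initial':
      // First data response: handle, offset and schema are set together.
      return event.type === 'messages'
        ? { phase: 'syncing', handle: event.handle, offset: event.offset, schema: event.schema }
        : state
    case 'syncing':
      if (event.type === 'messages') {
        return { ...state, handle: event.handle, offset: event.offset, schema: event.schema }
      }
      // Moving to 'live' requires a cursor; the type will not allow omitting it.
      return {
        phase: 'live',
        handle: state.handle,
        offset: state.offset,
        schema: state.schema,
        cursor: event.cursor,
      }
    case 'live':
      return event.type === 'messages'
        ? { ...state, handle: event.handle, offset: event.offset, schema: event.schema }
        : { ...state, cursor: event.cursor }
    default:
      return assertNever(state)
  }
}
```

Because each phase carries only the fields that are valid in it, a state such as "live without a cursor" cannot be constructed, and adding a new phase to `SyncState` fails to compile until `transition` handles it.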
Option B: XState or Similar
Use a formal state machine library for complex transitions:
- Visual state charts for documentation
- Built-in guards and actions
- Automatic testing of valid transitions
Option C: State Reducer Pattern
Centralize state updates through a reducer:
type StateAction =
  | { type: 'RESPONSE_RECEIVED'; handle: string; offset: Offset; ... }
  | { type: 'STALE_RESPONSE_IGNORED' } // No state changes!
  | { type: 'ENTER_REPLAY_MODE'; cursor: string }
  | { type: 'EXIT_REPLAY_MODE' }
function reduce(state: State, action: StateAction): State {
  // Single source of truth for state transitions
}
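A rough sketch of how the reducer could be filled in, assuming a trimmed-down `StreamState` with only three fields. The field names (`handle`, `offset`, `replayCursor`) and `initialState` are hypothetical; the real client tracks considerably more state.

```typescript
// Hypothetical, trimmed-down state for illustration.
interface StreamState {
  handle: string | undefined
  offset: string
  replayCursor: string | undefined // set only while in replay mode
}

type StateAction =
  | { type: 'RESPONSE_RECEIVED'; handle: string; offset: string }
  | { type: 'STALE_RESPONSE_IGNORED' }
  | { type: 'ENTER_REPLAY_MODE'; cursor: string }
  | { type: 'EXIT_REPLAY_MODE' }

const initialState: StreamState = {
  handle: undefined,
  offset: '-1',
  replayCursor: undefined,
}

function reduce(state: StreamState, action: StateAction): StreamState {
  switch (action.type) {
    case 'RESPONSE_RECEIVED':
      // Handle and offset always change together.
      return { ...state, handle: action.handle, offset: action.offset }
    case 'STALE_RESPONSE_IGNORED':
      // Deliberately a no-op: a stale response must not touch the offset.
      return state
    case 'ENTER_REPLAY_MODE':
      return { ...state, replayCursor: action.cursor }
    case 'EXIT_REPLAY_MODE':
      return { ...state, replayCursor: undefined }
    default: {
      // Fails to compile if a new action type is added but not handled.
      const unreachable: never = action
      return unreachable
    }
  }
}
```

Since `reduce` is a pure function, both recent bugs could be expressed as plain unit tests: dispatching `STALE_RESPONSE_IGNORED` must return the same state, and `EXIT_REPLAY_MODE` must clear `replayCursor`.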
Benefits
- Bugs become obvious - forgetting to handle state in a transition is a type error
- Testable in isolation - state machine can be unit tested without network mocks
- Self-documenting - state types and transitions document valid states
- Easier reviews - PRs show explicit state transition changes
Scope
Start with the most bug-prone areas:
- Stale cache detection and retry logic
- Replay mode (cursor tracking)
- Handle/offset consistency
Later extend to:
- SSE fallback logic
- Pause/resume
- Error recovery
Questions to Answer
- Is the complexity worth it for a client library?
- Which approach fits best with the existing codebase?
- Should we use a library (XState) or roll our own?
- Can we migrate incrementally or is it all-or-nothing?
Related
- claude/investigate-caching-bug-Rcwin - Fix stale cache offset update bug