feat(tic-tac-toe): UI ↔ state binding round-trip via the StateBus #198

Draft
91jaeminjo wants to merge 11 commits into main from feat/tic-tac-toe

Summary

Tic-tac-toe is the first concrete consumer of AG-UI's StateDeltaEvent channel for surface-driven UIs. It exercises the full bidirectional binding (tap → stateOverlay → server tool → StateDelta → bus → projection → widget rebuild) end-to-end and pins the protocol with an integration test.

Backend half lives on feat/tic-tac-toe of soliplex/soliplex (separate PR coming).

What's in the box

New module — lib/src/modules/tic_tac_toe/

  • State types: Cell, TurnPair, TicTacToeViewMode, TicTacToeError, TicTacToeClientState (immutable, value-equal, copyWith).
  • Server-state record + projection: TicTacToeServerState and TicTacToeProjection extends StateProjection<TicTacToeServerState?> — tolerant of malformed input (returns null).
  • BoardRenderState.compose(server, client): composite render record that maps the design doc's button-enablement table into canSend / canCancel / canUndo / canRedo / canNewGame flags. Cells expose both mark (rendered, including pending overlay) and serverMark (server-board occupancy, used for tap-enablement).
  • TicTacToeController: per-thread controller. Constructor wires bus.project(TicTacToeProjection()) → server signal, owns a Signal<TicTacToeClientState> for local overlay, exposes a boardRender computed that composes both. Implements every rule from the spec's click / button-enablement tables. _runWithOverlay builds an _inbox.tic_tac_toe.<intent> overlay and calls runtime.spawn(prompt: ..., stateOverlay: ...). Server-state observer auto-promotes viewMode hidden→inline on first game, never demotes fullscreen, and clears pending once the server reflects the user's move. Optional RunRegistry parameter lets the controller observe ALL sessions on the thread (including chat) for the in-fullscreen unread-banner.
  • TicTacToeRegistry + tictactoeRegistryProvider: per-thread controller registry, mirrors the project's existing RunRegistry pattern.
  • TicTacToeAppModule: contributes three overrides — the registry plus the two room slot providers populated with the board / toolbar widgets.
  • UI widgets: TicTacToeCell, TicTacToeControls (Send / Cancel / Undo / Redo / Autosend / Fullscreen + inline error chip with Retry — no SnackBars, per project convention), TicTacToeBoard (inline card with hide affordance + result banner + Play again), TicTacToeFullscreenPage (enlarged board, badge-decorated exit, transient banner on chat-during-fullscreen), TicTacToeToolbarButton (start-or-toggle, also handles the no-thread state by spawning a fresh thread + new game in one tap).

Room module changes

  • roomAboveChatInputBuildersProvider and roomChatInputToolbarBuildersProvider: two Provider<List<WidgetBuilder>> slots that the room screen renders. Default to const empty; modules override via ProviderScope to inject their widgets. ChatInput accepts toolbarExtras and spreads them between the existing attach icon and the text field.
  • roomActiveThreadProvider: publishes (threadKey, runtime) so other modules can attach per-thread controllers without coupling to room internals. Wired by _RoomScreenState._buildContent from _state.activeThreadView + widget.serverEntry.serverId + widget.roomId + _state.runtime.
  • roomSpawnNewThreadProvider: typedef'd callback for surfaces to start a thread from the no-thread state with a custom stateOverlay. Wired to RoomState.sendToNewThread.
  • runRegistryProvider: RunRegistry exposed for cross-module observers (the controller's chat-streaming subscription).

Shared package fix — packages/soliplex_agent

RunOrchestrator._buildConversation now strips wire-only _inbox from the cached baseState before merging the new overlay. The orchestrator was persisting the full wire payload (including _inbox) into ThreadHistory, then replaying it on the next run — so finishing a tic-tac-toe game and sending a chat message would re-issue the previous move's intent against the now-finished board, surfacing as InvalidIntent("game already finished"). Regression test added.

Tests

  • 46 tic-tac-toe unit + widget + integration tests, including the binding round-trip test that pumps a TicTacToeBoard against a fake runtime, taps a cell, taps Send, and asserts both the user's X and the agent's O render after the state delta lands.
  • 1207 frontend tests total, all green. Analyzer 0 issues. Format clean.

Test plan

  • flutter analyze → 0 issues
  • flutter test --reporter failures-only → 1207 passed
  • Backend smoke: soliplex-cli serve against the example installation, GET /api/v1/rooms returns 200 including tic_tac_toe
  • Manual round-trip: tap # icon (no-thread state) → game spawns + agent's pre-played opening move appears → tap a cell → tap Send → agent responds → undo / redo / new-game → chat about something unrelated → tic-tac-toe state untouched
  • Manual fullscreen: enter fullscreen → game playable → chat message arriving in another tab surfaces banner + bumps unread badge → exit fullscreen via icon button or hardware back / iOS swipe-back

Out of scope (per the design doc's Out-of-scope section)

Game persistence across reload, multi-player, score tracking, configurable AI difficulty, animations beyond static highlighting, replay viewer. Plus the four follow-up press-on-binding items: multi-surface coordination, high-frequency surface (slider), Surface.emit / bus.events channel, cross-thread bus isolation. Each gets its own design doc when picked up.

🤖 Generated with Claude Code

91jaeminjo and others added 11 commits April 29, 2026 17:11
Two new Provider<List<WidgetBuilder>>s in room_providers.dart, both
defaulting to const []: roomAboveChatInputBuildersProvider (renders
between the message list and the chat input) and
roomChatInputToolbarBuildersProvider (renders as extra icons in the
chat input toolbar). Modules contribute by overriding the providers.

ChatInput accepts toolbarExtras and spreads them between the existing
attach icon and the text field. Both ChatInput call sites in
RoomScreen (no-thread and thread-view) wrap the chat input in a
Consumer that watches the toolbar provider, and prepend a separate
Consumer above the existing banners that renders the above-chat
builders.

Existing room_screen widget tests now pump RoomScreen inside a
ProviderScope (the new Consumers require an ancestor scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three pure data files for the new module:

- tic_tac_toe_state.dart: client-only types — Cell, TurnPair, view-mode
  and error enums, and TicTacToeClientState (pending move, redo stack,
  view mode, autosend, in-flight, lastError, fullscreen-banner state).
  copyWith uses explicit clearPending / clearLastError flags so callers
  can distinguish "leave alone" from "set null".
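
The clear-flag convention described above can be sketched as follows. This is a minimal illustration only: `ClientState` is a hypothetical, cut-down stand-in for TicTacToeClientState with two fields, and the record type used for the pending cell is an assumption, not the module's actual `Cell` type.

```dart
// Hypothetical, simplified stand-in for TicTacToeClientState.
class ClientState {
  const ClientState({this.pending, this.autosend = false});

  /// Staged (row, col) move, or null when nothing is pending.
  /// Assumption: the real module uses its own Cell type here.
  final (int, int)? pending;
  final bool autosend;

  ClientState copyWith({
    (int, int)? pending,
    bool clearPending = false,
    bool? autosend,
  }) {
    return ClientState(
      // A null argument means "leave alone"; clearPending means "set null".
      pending: clearPending ? null : (pending ?? this.pending),
      autosend: autosend ?? this.autosend,
    );
  }
}

void main() {
  const staged = ClientState(pending: (1, 1));
  // An unrelated update keeps the staged move.
  print(staged.copyWith(autosend: true).pending);
  // The explicit flag is the only way to reset it to null.
  print(staged.copyWith(clearPending: true).pending);
}
```

Without the flag, `copyWith(pending: null)` would be indistinguishable from omitting the argument, which is why a plain nullable parameter is not enough.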

- tic_tac_toe_server_state.dart: TicTacToePlayer, TicTacToeOutcome,
  Move, TicTacToeServerState. All immutable with deep value equality
  (board cells, moves, winning line) so equal-value computeds
  short-circuit at the widget boundary.

- tic_tac_toe_projection.dart: TicTacToeProjection extends
  StateProjection<TicTacToeServerState?>. Tolerant of malformed input —
  returns null on missing/bogus 'game' rather than throwing.

The projection_test pins the rebuild-radius contract: an unrelated
agentState change leaves the projected slice equal (== and hashCode),
so downstream computeds don't fire spuriously.
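
A rough sketch of the two properties this commit pins (tolerant parsing and value equality of the projected slice). `GameSlice` and `projectGame` are hypothetical names, and the wire shape (a `game` map holding a flat nine-element `board`) is an assumption made for illustration; only the `'game'` key comes from the commit message.

```dart
// Hypothetical, simplified stand-in for TicTacToeServerState.
class GameSlice {
  const GameSlice(this.board);

  /// Nine cells, row-major; null means empty. Assumed shape.
  final List<String?> board;

  @override
  bool operator ==(Object other) {
    if (other is! GameSlice) return false;
    if (other.board.length != board.length) return false;
    for (var i = 0; i < board.length; i++) {
      if (other.board[i] != board[i]) return false;
    }
    return true;
  }

  @override
  int get hashCode => Object.hashAll(board);
}

/// Returns null instead of throwing when 'game' is missing or malformed.
GameSlice? projectGame(Map<String, dynamic> agentState) {
  final game = agentState['game'];
  if (game is! Map) return null;
  final board = game['board'];
  if (board is! List || board.length != 9) return null;
  final cells = <String?>[];
  for (final cell in board) {
    if (cell != null && cell != 'X' && cell != 'O') return null;
    cells.add(cell as String?);
  }
  return GameSlice(List.unmodifiable(cells));
}

void main() {
  final board = List<String?>.filled(9, null);
  final before = projectGame({'game': {'board': board}, 'chat': 1});
  final after = projectGame({'game': {'board': board}, 'chat': 2});
  // An unrelated change leaves the projected slice equal.
  print(before == after && before.hashCode == after.hashCode);
  print(projectGame({'game': 'bogus'}));
}
```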

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TicTacToeController owns one game per ThreadKey. Its constructor
projects bus.agentState through TicTacToeProjection into a server
signal, owns a TicTacToeClientState signal for the local pending /
redo / view-mode / autosend / inFlight overlay, and exposes a
boardRender computed that composes both into BoardRenderState — the
single signal the board widget watches.

User actions:
- clickCell: stage / replace / toggle a pending move (autosend fires
  send() immediately when the rule's net effect is a new pending).
- clickUndo: clears pending OR dispatches an undo intent for committed
  history (pops a TurnPair onto the redo stack — last-mover-agent pops
  two moves, last-mover-user pops one).
- clickRedo: pops a TurnPair, dispatches a redo intent with the moves
  payload. Disabled while pending is non-null.
- send: builds an _inbox.tic_tac_toe.play overlay with the pending
  cell, clears the redo stack, runs through _runWithOverlay.
- newGame, cancel, setViewMode, toggleAutoSend.

_runWithOverlay calls runtime.spawn(roomId, prompt: '', threadId,
stateOverlay) using ThreadKey's named-record destructuring, awaits the
session result, and sets lastError = network on failure. A _disposed
flag guards the finally-block writes so dispose() during an active
session doesn't race onto a torn-down signal.

The server-state subscription auto-promotes viewMode hidden→inline on
first game appearance (never demotes fullscreen) and clears pending
when the server reflects the user's move.
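
The observer rules in the paragraph above might reduce to two small pure functions. This is a sketch under assumptions: `nextViewMode` and `shouldClearPending` are hypothetical helpers, and treating "the server reflects the user's move" as "the staged cell is now occupied on the server board" is a guess at the actual check.

```dart
// Mirrors the three view modes named in the PR; the enum name is shortened here.
enum ViewMode { hidden, inline, fullscreen }

/// Promotes hidden to inline when a game first appears; never touches
/// inline or fullscreen, so fullscreen is never demoted.
ViewMode nextViewMode(
  ViewMode current, {
  required bool hadGame,
  required bool hasGame,
}) {
  final firstAppearance = !hadGame && hasGame;
  if (firstAppearance && current == ViewMode.hidden) return ViewMode.inline;
  return current;
}

/// Assumption: pending is cleared once the server board has a mark in the
/// staged cell.
bool shouldClearPending(List<List<String?>> board, (int, int)? pending) {
  if (pending == null) return false;
  final (row, col) = pending;
  return board[row][col] != null;
}

void main() {
  print(nextViewMode(ViewMode.hidden, hadGame: false, hasGame: true));
  print(nextViewMode(ViewMode.fullscreen, hadGame: false, hasGame: true));
}
```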

BoardRenderState.compose maps the button-enablement table into the
canSend / canCancel / canUndo / canRedo / canNewGame flags so the
widget layer is purely declarative.

TicTacToeIntent centralises the wire keys for the _inbox payload.

Adds meta to pubspec for @VisibleForTesting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TicTacToeRegistry maps ThreadKey → TicTacToeController with
controllerFor (lazy create via factory), disposeFor, and disposeAll.
Mirrors RunRegistry / messageExpansionsProvider's manual-registry
pattern.
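
For readers unfamiliar with the manual-registry pattern, a minimal sketch follows. `Controller` and `Registry` are hypothetical stand-ins, and the `ThreadKey` field names are assumed; the commit only states that ThreadKey is a named record. The three method names are the ones listed above.

```dart
// Assumed field names; the real ThreadKey is a named record in the project.
typedef ThreadKey = ({String serverId, String roomId, String threadId});

// Hypothetical stand-in for TicTacToeController.
class Controller {
  bool disposed = false;
  void dispose() => disposed = true;
}

class Registry {
  Registry(this._create);

  final Controller Function(ThreadKey key) _create;
  final Map<ThreadKey, Controller> _controllers = {};

  /// Lazily creates the controller on first access for a thread.
  Controller controllerFor(ThreadKey key) =>
      _controllers.putIfAbsent(key, () => _create(key));

  void disposeFor(ThreadKey key) => _controllers.remove(key)?.dispose();

  /// Disposes in reverse creation order, then forgets every controller.
  void disposeAll() {
    for (final controller in _controllers.values.toList().reversed) {
      controller.dispose();
    }
    _controllers.clear();
  }
}

void main() {
  final registry = Registry((_) => Controller());
  const key = (serverId: 's', roomId: 'r', threadId: 't1');
  final first = registry.controllerFor(key);
  // Records compare structurally, so an equal key returns the same instance.
  print(identical(first, registry.controllerFor(key)));
  registry.disposeAll();
  print(first.disposed);
}
```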

tictactoeRegistryProvider throws unless overridden — wiring is the
flavor's responsibility through TicTacToeAppModule.build(), which
contributes three overrides: the registry plus the two room slot
providers (above-chat-input and toolbar) populated with stub board /
toolbar widgets. The full UI lands in Phase 6; this phase just gets
the composition surface visible.

onDispose disposes all owned controllers in reverse, matching the
shell's reverse-registration teardown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TicTacToeCell renders mark + pending overlay (edit icon) + winning
highlight. TicTacToeControls maps BoardRenderState flags onto
Send/Cancel/Undo/Redo/Autosend/Fullscreen buttons plus an inline
error chip with Retry — replaces the SnackBar approach the codebase
generally avoids.

TicTacToeBoard is now a ConsumerStatefulWidget that resolves the
active thread + runtime + RunRegistry, gets its controller from
TicTacToeRegistry, and watches `boardRender` to render the inline
Card (cells keyed `cell-r-c`, result banner + "Play again" button on
game end, controls below). View-mode transitions push or pop a
MaterialPageRoute for TicTacToeFullscreenPage; the .then((_) {...})
resync handles hardware back / iOS swipe-back so viewMode never
gets stuck in `fullscreen`.

TicTacToeFullscreenPage shows the enlarged board with an exit button
that wears an unread badge, plus a transient banner for chat
messages received while the game is in fullscreen.

Controller now takes an optional RunRegistry. When provided, it
subscribes to activeKeys and pivots to whichever AgentSession is
active for this thread (game runs OR chat). On the transition into
RunningState(streaming: TextStreaming) while viewMode == fullscreen,
it bumps unreadChatWhileFullscreen and shows the banner; the banner
auto-dismisses after 3 seconds. setViewMode clears the unread
counter + banner on leaving fullscreen. dispose() releases the new
subscriptions and the banner timer.

Room module exposes a runRegistryProvider so cross-module observers
can subscribe; RoomAppModule.build() overrides it with its
constructor-injected registry, so the flavor doesn't change.

TicTacToeToolbarButton becomes the real start-or-toggle entry point.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pumps a TicTacToeBoard against a FakeAgentRuntime whose spawn
applies the user's move and a deterministic agent move via
bus.setAgentState. Taps cell (1,1), taps Send, asserts that both
'X' (user, 1,1) and 'O' (agent, 2,2) render in the board after the
state delta lands.

This is THE binding press: tap → controller.send() → stateOverlay →
runtime.spawn → server delta → bus.setAgentState → projection →
boardRender → widget rebuild. If this passes, the protocol seam is
wired correctly end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TicTacToeAppModule joins the standard flavor's module list (after
RoomAppModule + QuizAppModule so its overrides layer on top of the
room module's). The room screen now publishes the active thread
through roomActiveThreadProvider — wrapping _buildContent's tree in
a ProviderScope whose override is recomputed each rebuild from the
serverEntry / roomId / activeThreadView trio plus the room state's
AgentRuntime. Tic-tac-toe widgets (board + toolbar) read this to
attach their per-thread controller; when no thread is active they
render nothing.

Last piece needed for the manual smoke test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related UX improvements around the # icon:

1) No-thread state. The toolbar button used to early-return
SizedBox.shrink whenever roomActiveThreadProvider was null, which is
the default state of a freshly-entered room. The button only became
visible after the user had already typed something and spawned a
thread some other way. Now the no-thread branch reads a new
roomSpawnNewThreadProvider — wired by the room screen as a thunk
over RoomState.sendToNewThread — and uses it to spawn a fresh thread
with the new_game inbox payload in one click. The room module
publishes the provider; the room screen overrides it inside the
existing ProviderScope alongside roomActiveThreadProvider.

2) Meaningful prompts. Game intents used to spawn the run with
prompt: '' so each click produced an empty user-message bubble in
the transcript. The controller now passes a human-readable prompt
that mirrors the backend's _summarize_intent reply voice:
new_game → "Start a new game.", play → "Play (r, c).",
undo → "Undo.", redo → "Redo.". The backend's pre-LLM dispatcher
short-circuits on the inbox so the prompt content has no effect on
game logic — it just makes the chat transcript readable.

_runWithOverlay now takes a required named prompt parameter so
callers must opt into a string. Tests updated for the new send /
no-thread flows.
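
A sketch of how an intent could map to its prompt and overlay. The prompt strings are the ones listed above; `promptFor` and `buildOverlay` are hypothetical helpers, and the payload shape under `_inbox.tic_tac_toe.<intent>` (here `row` / `col` keys) is an assumption.

```dart
/// Human-readable prompt per intent, using the strings from this commit.
String promptFor(String intent, {int? row, int? col}) => switch (intent) {
      'new_game' => 'Start a new game.',
      'play' => 'Play ($row, $col).',
      'undo' => 'Undo.',
      'redo' => 'Redo.',
      _ => throw ArgumentError.value(intent, 'intent', 'unknown intent'),
    };

/// Wraps a payload at _inbox.tic_tac_toe.<intent>. Payload keys are assumed.
Map<String, dynamic> buildOverlay(
  String intent, [
  Map<String, dynamic> payload = const {},
]) =>
    {
      '_inbox': {
        'tic_tac_toe': {intent: payload},
      },
    };

void main() {
  print(promptFor('play', row: 1, col: 1));
  print(buildOverlay('play', {'row': 1, 'col': 1}));
}
```

Because the backend dispatcher short-circuits on the inbox, the prompt string only affects what the transcript shows.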

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… toggle, drop edit icon

Five small UX fixes following manual smoke-test feedback:

1) Retap-to-cancel pending. The cell widget's enabled rule was
keying off render.cells[r][c].mark, which becomes non-null when a
cell is staged (the pending overlay paints 'X'). That blocked the
spec's "pending == (r,c) → toggle off" rule from ever firing because
the tap couldn't reach the controller. CellRender now exposes a
separate serverMark field (server-board occupancy, ignoring pending
overlay) and the board / fullscreen widgets enable cells based on
that. The pending mark still renders dimmed.

2) Hide-board button on the card itself. The toolbar # icon already
toggled hidden ↔ inline (per spec) but it wasn't discoverable. Added
a small close icon at the top-right of the inline board card that
calls setViewMode(hidden).

3) Fullscreen toggle tooltip + icon now reflect state. TicTacToeControls
gains an isFullscreen flag — when true, the toggle renders
Icons.fullscreen_exit with tooltip 'Exit fullscreen' instead of the
static 'Fullscreen' / Icons.fullscreen pair that had been showing up
even inside the fullscreen page.

4) Unread badge consolidated. Dropped the duplicate exit IconButton
in the top-right corner of the fullscreen page; the badge now wraps
the controls' fullscreen-toggle as the single exit affordance.

5) Pending edit-icon overlay removed. The Icons.edit_outlined dot at
the bottom-right of staged cells was unpressable decoration that
confused users. The faded mark color already conveys "not yet
committed".

Tests updated for new CellRender.serverMark + TicTacToeControls'
new isFullscreen / unreadCount params.
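
Fix 1 above (retap-to-cancel) comes down to separating what a cell paints from what makes it tappable. A minimal sketch, assuming a hypothetical `composeCell` helper and a simplified `CellRender` that keeps only the two fields named in the commit:

```dart
// Simplified stand-in for CellRender with the two fields named above.
class CellRender {
  const CellRender({this.mark, this.serverMark});

  /// What the cell paints, including the pending overlay.
  final String? mark;

  /// Server-board occupancy only; ignores the pending overlay.
  final String? serverMark;

  /// Taps are allowed whenever the server board has no mark here, so a
  /// staged cell can still be tapped again to toggle the pending move off.
  bool get tappable => serverMark == null;
}

/// Hypothetical helper: overlays the user's pending 'X' on an empty cell.
CellRender composeCell({String? serverMark, required bool isPending}) =>
    CellRender(
      mark: serverMark ?? (isPending ? 'X' : null),
      serverMark: serverMark,
    );

void main() {
  final staged = composeCell(isPending: true);
  print('${staged.mark} ${staged.tappable}');
  final committed = composeCell(serverMark: 'O', isPending: false);
  print('${committed.mark} ${committed.tappable}');
}
```

Keying enablement off `mark` instead would make the staged cell untappable, which is the bug the commit describes.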

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… pattern

The previous Icons.close IconButton at the top-right of the board
read as "destroy" rather than "minimize". Replaced with the
GestureDetector + 'Hide' bodySmall text + Icons.expand_more chevron
already used by chat_input.dart's docs-chips collapse control. Same
visual language across the room module; the action is unchanged
(controller.setViewMode(hidden)).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… before replay

The orchestrator persists conversation.aguiState (wire payload) into
ThreadHistory and replays it as the next run's baseState. The wire
convention is that `_inbox` is per-run only: surfaces (e.g.
tic-tac-toe) put intent payloads there for the agent to consume; the
server never echoes them back via state deltas, so the bus never
carries them. But because the orchestrator was capturing the wire
payload as-is, a stale `_inbox` would replay on the next spawn.

Concrete bug surface: finishing a tic-tac-toe game then sending a
chat message triggered the agent's dispatcher to re-call play_move
with the previous play's coordinates. With `game.winner != null`,
play_move raises InvalidIntent("game already finished") and the chat
send appears to fail.

Fix: in _buildConversation, drop `_inbox` from the cached baseState
before merging the new stateOverlay. Cached history retains fidelity
to what was sent; replay just doesn't carry transient intents.
Regression test added.
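
The shape of the fix can be sketched as below. This is not the orchestrator's actual code: `buildRunState` is a hypothetical helper, and the shallow top-level merge is an assumption about how baseState and stateOverlay are combined.

```dart
/// Drops the per-run `_inbox` from the cached state, then layers the new
/// overlay on top. Assumption: a shallow, top-level merge.
Map<String, dynamic> buildRunState(
  Map<String, dynamic> cachedBaseState,
  Map<String, dynamic> stateOverlay,
) {
  // Copy first so the cached history itself is left untouched.
  final base = Map<String, dynamic>.of(cachedBaseState)..remove('_inbox');
  return {...base, ...stateOverlay};
}

void main() {
  final cached = <String, dynamic>{
    'game': {'winner': 'X'},
    '_inbox': {
      'tic_tac_toe': {
        'play': {'row': 1, 'col': 1},
      },
    },
  };

  // A plain chat message carries no overlay, so no stale intent is replayed.
  print(buildRunState(cached, {}).containsKey('_inbox'));

  // A fresh intent in the new overlay still goes through.
  final next = buildRunState(cached, {
    '_inbox': {
      'tic_tac_toe': {'new_game': <String, dynamic>{}},
    },
  });
  print(next['_inbox']);
}
```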

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@91jaeminjo 91jaeminjo marked this pull request as draft April 30, 2026 18:32