bubbuild · frostming · Apr 10, 2026 · Apr 9, 2026
diff --git a/docs/architecture.md b/docs/architecture.md
@@ -16,11 +16,14 @@
 2. Initialize state with `_runtime_workspace` from `BubFramework.workspace`.
 3. Merge all `load_state(message, session_id)` dicts.
 4. Build prompt via `build_prompt(message, session_id, state)` (fallback to inbound `content` if empty).
-5. Execute `run_model(prompt, session_id, state)`.
-6. Always execute `save_state(...)` in a `finally` block.
-7. Render outbound batches via `render_outbound(...)`, then flatten them.
-8. If no outbound exists, emit one fallback outbound.
-9. Dispatch each outbound via `dispatch_outbound(message)`.
+5. Execute `run_model_stream(prompt, session_id, state)`.
+6. For each stream event, call `OutboundChannelRouter.dispatch_event(...)`, which forwards to `channel.on_event(event, message)` when the target channel exists.
+7. Always execute `save_state(...)` in a `finally` block.
+8. Render outbound batches via `render_outbound(...)`, then flatten them.
+9. If no outbound exists, emit one fallback outbound.
+10. Dispatch each outbound via `dispatch_outbound(message)`.
+
+If no plugin implements `run_model_stream`, `HookRuntime` falls back to `run_model(prompt, session_id, state)` and adapts the returned text into a stream with a single text chunk.
 
 ## Hook Priority Semantics
 
@@ -47,12 +50,23 @@
 Builtin `BuiltinImpl` behavior includes:
 
 - `build_prompt`: supports comma command mode; non-command text may include `context_str`.
-- `run_model`: delegates to `Agent.run()`.
+- `run_model_stream`: delegates to `Agent.run()`.
 - `system_prompt`: combines a default prompt with workspace `AGENTS.md`.
 - `register_cli_commands`: installs `run`, `gateway`, `chat`, plus hidden diagnostic commands.
 - `provide_channels`: returns `telegram` and `cli` channel adapters.
 - `provide_tape_store`: returns a file-backed tape store under `~/.bub/tapes`.
 
+## Channel Event Streaming
+
+Channels have two different outbound surfaces:
+
+- `send(message)`: handles the final rendered outbound message.
+- `on_event(event, message)`: handles raw stream events while the model is still running.
+
+`on_event` is optional. Implement it when a channel can benefit from incremental rendering, typing indicators, progress updates, or partial text display. The `message` argument is the original inbound message, so channel implementations usually use it to recover routing metadata such as target channel, chat id, session id, or message kind.
+
+If a channel does not implement any special event behavior, it can ignore `on_event` and rely entirely on `send()`.
+
 ## Boundaries
 
 - `Envelope` stays intentionally weakly typed (`Any` + accessor helpers).
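
Review note on `docs/architecture.md`: the ordering in steps 5 to 7 (consume the stream, forward each event, save state in `finally`) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not Bub's actual implementation. `StreamEvent` is a local stand-in whose `kind`/`data` shape is inferred from the examples in this diff, and `run_turn`, `dispatch_event`, and `save_state` are hypothetical names used only for illustration.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StreamEvent:
    """Local stand-in; the assumed shape is a kind plus a data dict."""

    kind: str
    data: Dict[str, Any] = field(default_factory=dict)


async def run_turn(stream, dispatch_event, save_state, message):
    """Forward every stream event, then save state even if the stream fails."""
    chunks: List[str] = []
    try:
        async for event in stream:
            # Step 6: each event is forwarded while the model is still running.
            await dispatch_event(event, message)
            if event.kind == "text":
                # Assumption: text events carry their chunk under "delta".
                chunks.append(str(event.data.get("delta", "")))
    finally:
        # Step 7: state is saved whether or not the stream raised.
        await save_state()
    return "".join(chunks)


async def demo():
    seen: List[str] = []
    saves: List[int] = []

    async def fake_stream():
        yield StreamEvent("text", {"delta": "Hel"})
        yield StreamEvent("text", {"delta": "lo"})

    async def dispatch_event(event, message):
        seen.append(event.kind)

    async def save_state():
        saves.append(1)

    text = await run_turn(fake_stream(), dispatch_event, save_state, {"content": "hi"})
    return text, seen, len(saves)
```

Running `asyncio.run(demo())` forwards both events before the state is saved once. The `finally` block is what keeps `save_state` running when the stream raises partway through.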

diff --git a/docs/channels/index.md b/docs/channels/index.md
@@ -33,6 +33,17 @@ uv run bub gateway --enable-channel telegram
 - Telegram channel session id: `telegram:<chat_id>`
 - `chat` command default session id: `cli_session` (override with `--session-id`)
 
+## Outbound Delivery Surfaces
+
+Channel adapters can receive outbound data in two forms:
+
+- `send(message)`: the final rendered outbound message
+- `on_event(event, message)`: streaming events emitted while the model is still producing output
+
+Use `on_event` for incremental UX such as live text updates, typing indicators, progress bars, or chunk-level logging. Use `send` for the final durable outbound payload.
+
+`on_event` is optional. A channel that does not need streaming behavior can ignore it and only implement `send`.
+
 ## Debounce Behavior
 
 - `cli` does not debounce; each input is processed immediately.
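
Review note on `docs/channels/index.md`: a channel that uses both delivery surfaces might look roughly like the sketch below. It is illustrative only. `LiveTextChannel` is a hypothetical class, `StreamEvent` is a local stand-in with an assumed `kind`/`data` shape, both methods are assumed to be async, and the `chat_id` and `content` message keys are assumptions rather than confirmed fields.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class StreamEvent:
    """Local stand-in; the assumed shape is a kind plus a data dict."""

    kind: str
    data: Dict[str, Any] = field(default_factory=dict)


class LiveTextChannel:
    """Hypothetical channel: partial text via on_event, final text via send."""

    def __init__(self) -> None:
        self.partial: Dict[str, str] = {}
        self.delivered: List[Tuple[str, str]] = []

    async def on_event(self, event, message) -> None:
        # Only text events contribute to the live preview in this sketch.
        if event.kind != "text":
            return
        # The inbound message supplies routing metadata (assumed key).
        chat_id = str(message.get("chat_id", ""))
        delta = str(event.data.get("delta", ""))
        self.partial[chat_id] = self.partial.get(chat_id, "") + delta

    async def send(self, message) -> None:
        # The final outbound replaces whatever preview was shown so far.
        chat_id = str(message.get("chat_id", ""))
        self.partial.pop(chat_id, None)
        self.delivered.append((chat_id, str(message.get("content", ""))))


async def demo():
    channel = LiveTextChannel()
    inbound = {"chat_id": "42", "content": "hi"}
    await channel.on_event(StreamEvent("text", {"delta": "Hel"}), inbound)
    await channel.on_event(StreamEvent("text", {"delta": "lo"}), inbound)
    preview = dict(channel.partial)
    await channel.send({"chat_id": "42", "content": "Hello"})
    return preview, channel.partial, channel.delivered
```

The preview state lives only until `send` arrives, which matches the split described above: `on_event` for incremental UX, `send` for the durable payload.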

diff --git a/docs/extension-guide.md b/docs/extension-guide.md
@@ -100,11 +100,18 @@ Current `process_inbound()` hook usage:
 1. `resolve_session` (`call_first`)
 2. `load_state` (`call_many`, then merged by framework)
 3. `build_prompt` (`call_first`)
-4. `run_model` (`call_first`)
+4. `run_model_stream` (`call_first`)
 5. `save_state` (`call_many`, always executed in `finally`)
 6. `render_outbound` (`call_many`)
 7. `dispatch_outbound` (`call_many`, per outbound)
 
+Compatibility note:
+
+- `run_model_stream` is the primary model hook.
+- If no plugin implements `run_model_stream`, Bub falls back to `run_model`.
+- The `run_model` return value is wrapped into a stream with exactly one text chunk.
+- A plugin should implement one of these hooks, not both.
+
 Other hook consumers:
 
 - `register_cli_commands`: called by `call_many_sync`
@@ -150,6 +157,8 @@ class SessionPlugin:
 ```python
 from __future__ import annotations
 
+from republic import AsyncStreamEvents, StreamEvent
+
 from bub import hookimpl
 
 
@@ -159,8 +168,11 @@ class EchoPlugin:
         return f"[echo] {message['content']}"
 
     @hookimpl
-    async def run_model(self, prompt, session_id, state):
-        return prompt
+    async def run_model_stream(self, prompt, session_id, state):
+        async def iterator():
+            yield StreamEvent("text", {"delta": prompt})
+
+        return AsyncStreamEvents(iterator())
 ```
 
 Run and verify:
@@ -170,9 +182,56 @@ uv run bub hooks
 uv run bub run "hello"
 ```
 
-Check that your plugin is listed for `build_prompt` / `run_model`, and output reflects your override.
+Check that your plugin is listed for `build_prompt` / `run_model_stream`, and that the output reflects your override.
+If you intentionally use the legacy compatibility hook, check for `run_model` instead.
+
+## 10) Listen To Parent Stream
+
+If you want to observe or transform the parent stream instead of fully replacing it, implement `run_model_stream` and wrap the parent hook's async iterator.
+
+This pattern uses `subset_hook_caller(...)` to call the same hook chain without the current plugin, then returns a new `AsyncStreamEvents` wrapper.
+
+```python
+from __future__ import annotations
+
+from republic import AsyncStreamEvents
+
+from bub import hookimpl
+
+
+class StreamTapPlugin:
+    def __init__(self, framework) -> None:
+        self.framework = framework
+
+    @hookimpl
+    async def run_model_stream(self, prompt, session_id, state):
+        parent_hook = self.framework._plugin_manager.subset_hook_caller(
+            "run_model_stream",
+            remove_plugins=[self],
+        )
+        parent_stream = await parent_hook(
+            prompt=prompt,
+            session_id=session_id,
+            state=state,
+        )
+        if parent_stream is None:
+            raise RuntimeError("no parent run_model_stream implementation found")
+
+        async def iterator():
+            async for event in parent_stream:
+                if event.kind == "text":
+                    delta = str(event.data.get("delta", ""))
+                    print(delta, end="")
+                yield event
+
+        return AsyncStreamEvents(iterator(), state=parent_stream._state)
+```
+
+Use this when you need to log chunks, redact text, inject extra events, or measure stream timing without reimplementing the underlying model call.
+
+If you also need to support parents that only implement legacy `run_model`, add your own fallback path and wrap that text result into a one-chunk stream.
 
-## 10) Common Pitfalls
+## 11) Common Pitfalls
 
 - Defining `@tool` functions without importing the module from your plugin means the tools never register.
 - Returning awaitables from hooks invoked via sync paths (`call_many_sync` / `call_first_sync`) causes skip.
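
Review note on `docs/extension-guide.md`: the "add your own fallback path" advice in the new section 10 could be sketched as below. This is a minimal sketch under assumptions, not the `HookRuntime` fallback itself. `StreamEvent` is a local stand-in, `stream_or_fallback` and `text_to_stream` are hypothetical helpers, and the two callables stand in for whatever parent hook callers a plugin obtains. A real plugin would presumably wrap the resulting iterator in `AsyncStreamEvents`, as the guide's own examples do.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class StreamEvent:
    """Local stand-in; the assumed shape is a kind plus a data dict."""

    kind: str
    data: Dict[str, Any] = field(default_factory=dict)


async def text_to_stream(text):
    """Adapt a plain text result into a stream with exactly one text chunk."""
    yield StreamEvent("text", {"delta": text})


async def stream_or_fallback(run_stream, run_legacy, prompt):
    """Prefer the streaming parent; otherwise wrap the legacy text result."""
    stream = await run_stream(prompt) if run_stream is not None else None
    if stream is not None:
        return stream
    # Legacy path: the parent only returns text, so emit it as one chunk.
    text = await run_legacy(prompt)
    return text_to_stream(text)


async def collect(stream):
    """Drain a stream into a list so the result is easy to inspect."""
    return [event async for event in stream]


async def demo():
    async def legacy(prompt):
        return prompt.upper()

    stream = await stream_or_fallback(None, legacy, "hello")
    return await collect(stream)
```

Keeping the adapter as a separate async generator means the tap logic from section 10 can iterate over either kind of parent without branching.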

diff --git a/docs/features.md b/docs/features.md
@@ -5,7 +5,9 @@
 Every turn stage is a [pluggy](https://pluggy.readthedocs.io/) hook.
 Builtins are ordinary plugins — override any stage by registering your own.
 Both first-result hooks (override) and broadcast hooks (observer) are supported.
-Safe fallback to prompt text when `run_model` returns no value (with `on_error` notification).
+`run_model_stream` is the primary model hook.
+Legacy `run_model` hooks still work and are adapted into a single-chunk text stream.
+Safe fallback to prompt text when no model hook returns a value (with `on_error` notification).
 Automatic fallback outbound when `render_outbound` produces nothing.
 
 ## Tape-Based Context

diff --git a/docs/index.md b/docs/index.md
@@ -31,11 +31,13 @@ uv run bub gateway                      # channel listener mode
 
 Every inbound message goes through one turn pipeline. Each stage is a hook.
 
+```text
+resolve_session → load_state → build_prompt → run_model_stream
+                                                          ↓
+                     dispatch_outbound ← render_outbound ← save_state
 ```
-resolve_session → load_state → build_prompt → run_model
-                                                   ↓
-              dispatch_outbound ← render_outbound ← save_state
-```
+
+`run_model` remains supported as a compatibility hook and is adapted into a single-chunk stream when `run_model_stream` is absent.
 
 Builtins are plugins registered first. Later plugins override earlier ones. No special cases.