
0.8.0-dev1: Thalamus — NatsEventBus + cluster + CI plumbing#1

Merged
stevenca merged 1 commit into main from feat/0.8.0-dev1-thalamus-nats on Jun 1, 2026


stevenca commented on Jun 1, 2026

Summary

First sub-step of the brain-architecture refactor (see docs/architecture/brain.md). Stands up the typed event bus the rest of 0.8.0 will route every signal through.

  • netcortex/thalamus/ — new package with NatsEventBus, a real production implementation of the EventBus Protocol shipped in 0.7.1-dev2. NATS core pub/sub (no JetStream) because the Protocol promises at-least-once-within-session with no replay — which is exactly NATS core's semantics. JetStream is enabled at the server level so future durable consumers (0.9.0 episodic memory, stream bridge for external agents) can opt in via extension methods without redeploying NATS.
  • Helm chart — single-node JetStream-enabled NATS StatefulSet + headless Service + ConfigMap, matching the existing Redis/Neo4j pattern in the chart. PVC-backed /data/jetstream. Liveness probes the listener; readiness asserts JetStream subsystem is up. NATS_URL env threaded into web + worker pods, gated on nats.enabled.
  • Docker Compose: nats service with healthcheck + JetStream + volume; wired into the netcortex / worker services.
  • CI: contracts job gains a NATS service container; NATS_URL env exported. The same contract suite that already covered InMemoryEventBus now parametrizes over both backends and exercises the real production code path.
  • Dependency: nats-py>=2.6.
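
To make the "no replay" property and subject wildcards concrete, here is a minimal in-memory sketch. It is not the NatsEventBus code from this PR: `subject_matches` and `NoReplayBus` are hypothetical names, delivery is synchronous for brevity, and the matching rules follow NATS subject conventions (`*` matches exactly one token, `>` matches one or more trailing tokens).

```python
from typing import Any, Callable

# Hypothetical handler shape for this sketch: (subject, payload) -> None.
Handler = Callable[[str, dict[str, Any]], None]


def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if `subject` matches a NATS-style `pattern`."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, tok in enumerate(p_tokens):
        if tok == ">":
            # '>' is only valid as the last token and needs at least one
            # remaining subject token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if tok != "*" and tok != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


class NoReplayBus:
    """In-memory stand-in: messages reach only subscribers that exist
    at publish time; nothing is stored for later subscribers."""

    def __init__(self) -> None:
        self._subs: list[tuple[str, Handler]] = []

    def subscribe(self, pattern: str, handler: Handler) -> None:
        self._subs.append((pattern, handler))

    def publish(self, subject: str, payload: dict[str, Any]) -> None:
        for pattern, handler in list(self._subs):
            if subject_matches(pattern, subject):
                handler(subject, payload)
```

A subscriber registered after a publish never sees that earlier message, which is the behaviour a durable JetStream consumer would change if a later release opts in.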

What is NOT in this PR

  • No production code path uses the bus yet. Pollers still call the correlator and writeback directly. The cutover lands in 0.8.0-dev3 (first dual-write) and 0.8.0-dev5 (full cutover).
  • No reflex/ handlers yet — those land in 0.8.0-dev2.

This is intentional: each 0.8.0-devN step lands behind a green CI gate before the next one starts. Easier to bisect any breakage.

Test plan

CI auto-runs:

  • Lint (ruff) — same ratchet as 0.7.1-dev3
  • Unit tests
  • Golden snapshot tests (verifies the brain refactor hasn't changed any of the 78 pinned decisions yet)
  • Contract tests against both InMemoryEventBus AND real NatsEventBus ← the new thing this PR enables
  • Recorded integration replay (offline)
  • Type check (mypy --strict on contracts; best-effort elsewhere)
  • Security lints (no self-rewrite, no unsanitized cassettes)
  • SBOM + pip-audit

Manual verification once merged:

  • helm install in microk8s — NATS StatefulSet comes up, monitoring endpoint reachable at <release>-nats:8222/healthz
  • Web + worker pods see NATS_URL env and start cleanly (no actual NATS traffic yet)
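
The first manual check could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the `<release>-nats` hostname only resolves from inside the cluster, and the body check assumes the monitoring endpoint answers with a JSON object whose `status` field is `ok`.

```python
import json
import urllib.request


def healthz_url(release: str, port: int = 8222) -> str:
    """Build the monitoring URL from the Helm release name."""
    return f"http://{release}-nats:{port}/healthz"


def is_healthy(body: bytes) -> bool:
    """Interpret a /healthz response body (assumed shape: {"status": "ok"})."""
    try:
        doc = json.loads(body)
    except ValueError:
        return False
    return isinstance(doc, dict) and doc.get("status") == "ok"


def check_nats(release: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint is reachable and reports healthy."""
    try:
        with urllib.request.urlopen(healthz_url(release), timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False
```

Run from a pod in the same namespace, `check_nats("<release>")` would return a boolean suitable for a smoke-test script.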

Roadmap

This is sub-step 1 of 6 for 0.8.0. Per docs/architecture/brain.md:

| # | Branch | What lands | Status |
|---|--------|------------|--------|
| 1 | feat/0.8.0-dev1-thalamus-nats | NATS infra + NatsEventBus + contract parametrization | this PR |
| 2 | feat/0.8.0-dev2-reflex-skeleton | netcortex/reflex/ skeleton + 3 idle handlers | next |
| 3 | feat/0.8.0-dev3-first-publisher | First poller dual-writes to bus; reflex actually fires | |
| 4 | feat/0.8.0-dev4-module-renames | adapters/* -> sensory/poll/*, graph/correlate.py -> association/ | |
| 5 | feat/0.8.0-dev5-cutover | Pollers ONLY publish to bus; legacy direct path removed | |
| 6 | tag 0.8.0 | Smoke test on microk8s, tag | |

Made with Cursor

First sub-step of the brain-architecture refactor. Stands up the
typed event bus the rest of 0.8.0 will route every signal through.
Nothing USES the bus yet — that lands in 0.8.0-dev3 (first dual-write
poller) and 0.8.0-dev5 (full cutover). This commit puts the substrate
in place and verifies it against the same contract suite that already
covered InMemoryEventBus.

See docs/architecture/brain.md for the role of the thalamus and the
rationale for NATS specifically.

CODE

  * netcortex/thalamus/ — new package.
  * nats_bus.py — NatsEventBus implementing EventBus Protocol against
    a real NATS server. NATS core pub/sub (not JetStream) because the
    Protocol promises at-least-once-within-session, no replay — which
    is exactly NATS core. JetStream is enabled at the SERVER level so
    future durable consumers (0.9.0 episodic memory, stream bridge for
    external agents) can opt in via extension methods without redeploy.
  * Lifecycle: sync ctor (matches Callable[[], EventBus] factory
    shape); lazy connect on first publish/subscribe; idempotent close()
    that drains pending publishes before closing the socket.
  * Wire format: JSON-encoded UTF-8 payloads; NATS headers (server
    2.2+) for framing metadata. Malformed payloads surfaced as
    {"_raw": ...} with a warning rather than crashing the consumer.
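
The lifecycle and wire-format bullets above can be sketched as follows. This is an illustration, not the contents of nats_bus.py: `FakeConnection`, `LazyBus`, and `decode_payload` are hypothetical names, the connection is a stand-in rather than a nats-py client, and storing the undecoded bytes under `_raw` is an assumption about what the elided value holds.

```python
import asyncio
import json
import logging
from typing import Any, Optional

log = logging.getLogger("thalamus.sketch")


def decode_payload(data: bytes) -> dict[str, Any]:
    """Decode JSON/UTF-8; on failure return {"_raw": data} and warn."""
    try:
        value = json.loads(data.decode("utf-8"))
    except ValueError:  # covers UnicodeDecodeError and JSONDecodeError
        log.warning("malformed payload (%d bytes), passing through", len(data))
        return {"_raw": data}
    if not isinstance(value, dict):
        # Assumption: non-object JSON is treated as malformed too.
        return {"_raw": data}
    return value


class FakeConnection:
    """Stand-in for a broker connection (not the nats-py client)."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, bytes]] = []
        self.drain_calls = 0

    async def publish(self, subject: str, data: bytes) -> None:
        self.sent.append((subject, data))

    async def drain(self) -> None:
        self.drain_calls += 1


class LazyBus:
    def __init__(self, url: str) -> None:
        # Sync constructor with no I/O, so `lambda: LazyBus(url)` fits a
        # zero-argument factory.
        self._url = url
        self._conn: Optional[FakeConnection] = None
        self._closed = False
        self.connect_count = 0

    async def _ensure_connected(self) -> FakeConnection:
        if self._closed:
            raise RuntimeError("bus is closed")
        if self._conn is None:  # connect lazily, exactly once
            self._conn = FakeConnection()
            self.connect_count += 1
        return self._conn

    async def publish(self, subject: str, payload: dict[str, Any]) -> None:
        conn = await self._ensure_connected()
        await conn.publish(subject, json.dumps(payload).encode("utf-8"))

    async def close(self) -> None:
        if self._closed:  # idempotent: second call is a no-op
            return
        self._closed = True
        if self._conn is not None:
            await self._conn.drain()  # flush pending publishes first
```

Keeping the constructor free of I/O is what lets the same factory shape serve both the in-memory and the NATS-backed bus in the contract suite.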

TESTS

  * NatsEventBus registered in tests/contracts/conftest.py. The full
    contract suite (publish/subscribe roundtrip, wildcard filtering,
    no-replay, independent subscribers, invalid-subject rejection,
    invalid-payload rejection, idempotent close) now runs against the
    real NATS backend in addition to InMemoryEventBus.
  * NATS_URL env-gated: when unset the parametrized NATS cases skip
    (so devs without a local broker can still run the suite); when set
    the same cases exercise the production code path. CI always sets it.
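
The env gating described above might look roughly like this. The sketch is not the code in tests/contracts/conftest.py: `BackendCase` and `backend_cases` are hypothetical names, and it only computes which cases run or skip, leaving the pytest wiring out.

```python
import os
from typing import Mapping, NamedTuple, Optional


class BackendCase(NamedTuple):
    name: str
    skip_reason: Optional[str]  # None means the case should run


def backend_cases(environ: Mapping[str, str] = os.environ) -> list[BackendCase]:
    """List contract-suite backends; the NATS case skips without NATS_URL."""
    cases = [BackendCase("in_memory", None)]
    if environ.get("NATS_URL"):
        cases.append(BackendCase("nats", None))
    else:
        cases.append(BackendCase("nats", "NATS_URL not set"))
    return cases
```

In a pytest conftest, each case could become a `pytest.param(...)` carrying `pytest.mark.skip(reason=...)` when `skip_reason` is set, so the skipped NATS cases still show up in the report instead of silently disappearing.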

INFRASTRUCTURE

  * deploy/helm/templates/{statefulset,service,configmap}-nats.yaml —
    single-node JetStream-enabled NATS StatefulSet matching the
    existing Redis/Neo4j pattern. Headless ClusterIP Service for
    stable DNS; ConfigMap-driven nats.conf; PVC-backed
    /data/jetstream. Liveness probes the listener; readiness asserts
    JetStream subsystem is up.
  * values.yaml — nats: block (enabled by default, 2.11-alpine, 2Gi
    PVC, Redis-class resources). HA clustering explicitly deferred to
    a later 0.8.x patch.
  * _helpers.tpl — netcortex.natsUrl template consistent with
    netcortex.{redisUrl,neo4jUri}.
  * deployment-{web,worker}.yaml — NATS_URL env threaded into both
    pods, gated on nats.enabled.
  * Chart.yaml — version 0.1.0 -> 0.2.0, appVersion 0.6.0 ->
    0.8.0-dev1.

LOCAL DEV

  * docker-compose.yml — NATS service with JetStream enabled,
    monitoring port exposed, healthcheck against /healthz. NATS_URL
    wired into netcortex and worker containers.

CI

  * .github/workflows/ci.yaml — contracts job gains a NATS service
    container (nats:2.11-alpine, core pub/sub; JetStream not needed
    for Protocol-surface tests). NATS_URL=nats://localhost:4222
    exported so the gated NATS contract cases actually execute.

DEPENDENCIES

  * nats-py>=2.6 added to runtime deps (async-only client, no native
    code).

Co-authored-by: Cursor <cursoragent@cursor.com>
stevenca merged commit f03360e into main on Jun 1, 2026
8 checks passed
stevenca deleted the feat/0.8.0-dev1-thalamus-nats branch on June 1, 2026, 05:16