18 changes: 13 additions & 5 deletions apps/docs/content/guides/choose-queue.mdx
@@ -3,25 +3,33 @@ title: "Choosing a Message Queue on Zerops"
description: "**Use NATS** for most cases (simple, fast, JetStream persistence). Use **Kafka** only for enterprise event streaming with guaranteed ordering and unlimited retention."
---

**Use NATS** for most cases (simple, fast, JetStream persistence). Use **Kafka** only for enterprise event streaming with guaranteed ordering and unlimited retention.
**Use NATS** for most cases (simple, fast, optional JetStream persistence layer when durability is needed). Use **Kafka** only for enterprise event streaming with guaranteed ordering and unlimited retention.

## Decision Matrix

| Need | Choice | Why |
|------|--------|-----|
| **General messaging** | **NATS** (default) | Simple auth, JetStream built-in, fast |
| **General messaging** | **NATS** (default) | Simple auth, fast, JetStream available when needed |
| Enterprise event streaming | Kafka | SASL auth, 3-broker HA, unlimited retention |
| Lightweight pub/sub | NATS | Low overhead, 8MB default messages |
| Lightweight pub/sub | NATS — core | Low overhead, 8MB default messages, fire-and-forget |
| Durable queues, replay, at-least-once | NATS — JetStream | Persistent streams, durable consumers, ack/redeliver |
| Event sourcing / audit logs | Kafka | Indefinite topic retention, strong ordering |

## NATS (Default Choice)

NATS exposes **two distinct messaging shapes**. Pick ONE per recipe and write YAML comments and KB content that describe only that shape; mixing the two confuses porters about what the recipe actually does.

- **Core pub/sub + queue groups**: `nc.subscribe('subject', { queue: 'workers' })`. No persistence; queue groups load-balance delivery across replicas; lost messages stay lost. HA story: surviving cluster nodes keep delivering, no consumer position to restore. Use when fan-out + load balance + at-most-once is enough.
- **JetStream streams + durable consumers**: opens an explicit stream via `JetStreamManager`, subscribes durably via `js.subscribe(...)`. Persistent message store; replay on reconnect; ack/redeliver. HA story: cluster replicates stream state, acked-but-unprocessed messages survive node loss. Use when at-least-once + replay + persistence are required.

**Authoring rule**: a recipe's yaml comments and KB bullets should reflect the shape the code actually uses. If the worker only calls `nc.subscribe()` with a queue group and never opens a stream, do not invoke JetStream language at HA tiers — the recipe has no stream to replicate. If the worker opens a JetStream stream, the JetStream HA story is the relevant one.

- Ports: 4222 (client), 8222 (HTTP monitoring)
- Auth: user `zerops` + auto-generated password
- **Connection** (two supported patterns, pick ONE):
  - **Separate env vars** (recommended, works with every NATS client library): pass `servers: ${hostname}:${port}` plus `user: ${user}, pass: ${password}` as client-side connect options. The servers list stays credential-free.
  - **Opaque connection string**: pass `${connectionString}` directly as the servers option; the platform builds a correctly formatted URL with embedded auth that the NATS server expects.
- JetStream: Enabled by default (`JET_STREAM_ENABLED=1`)
- JetStream capability: enabled by default (`JET_STREAM_ENABLED=1`); recipes opt in by writing JetStream client code. Setting `JET_STREAM_ENABLED=0` hard-disables the capability across the project.
- Storage: Up to 40GB memory + 250GB file store
- Max message: 8MB default, 64MB max (`MAX_PAYLOAD`)
- Health check: `GET /healthz` on port 8222
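
The two connection patterns listed above can be sketched as plain option builders. This is a minimal illustration, not the platform's API: the env var names (`NATS_HOST`, `NATS_PORT`, `NATS_USER`, `NATS_PASSWORD`, `NATS_CONNECTION_STRING`) and the helper names are hypothetical, so map them to whatever references your own `zerops.yaml` exposes. The returned objects use the `servers` / `user` / `pass` option names quoted in the guide.

```typescript
// Subset of connect options named in the guide (servers, user, pass).
interface NatsConnectOptions {
  servers: string;
  user?: string;
  pass?: string;
}

// Hypothetical env var names; adjust to the variables your service actually receives.
type Env = Record<string, string | undefined>;

// Fail fast when a variable is missing instead of connecting with "undefined".
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`missing environment variable ${name}`);
  }
  return value;
}

// Pattern 1: credential-free servers list, credentials passed as separate options.
function separateVarOptions(env: Env): NatsConnectOptions {
  return {
    servers: `${requireVar(env, "NATS_HOST")}:${requireVar(env, "NATS_PORT")}`,
    user: requireVar(env, "NATS_USER"),
    pass: requireVar(env, "NATS_PASSWORD"),
  };
}

// Pattern 2: hand the platform-built connection string through untouched.
function connectionStringOptions(env: Env): NatsConnectOptions {
  return { servers: requireVar(env, "NATS_CONNECTION_STRING") };
}
```

Either result would then be passed to the client library's connect call. Keeping the two builders separate makes it harder to mix the patterns by accident.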
@@ -40,6 +48,6 @@ description: "**Use NATS** for most cases (simple, fast, JetStream persistence).
## Gotchas
1. **NATS config changes need restart**: No hot-reload — changing env vars requires service restart
2. **Kafka single-node has no replication**: 1 broker = 3 partitions but zero redundancy
3. **NATS JetStream HA sync interval**: 1-minute sync across nodes — brief data lag possible
3. **NATS JetStream HA sync interval**: 1-minute sync across nodes — brief data lag possible. Applies only to recipes that actually open JetStream streams; core pub/sub recipes are unaffected.
4. **Kafka SASL only**: No anonymous connections — always use the generated credentials
5. **NATS authorization violation from a hand-composed URL**: do not build a `nats://user:pass@host:4222` URL from the separate env vars. Most NATS client libraries will parse the embedded credentials AND separately attempt SASL with the same values, producing a double-auth that the server rejects with `Authorization Violation` on the first CONNECT frame (symptom: startup crash, no successful subscription). Use either the separate env vars passed as connect options (credential-free servers list) or the opaque `${connectionString}` the platform builds for you — both patterns in the Connection section above avoid the double-auth path.
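
One way to catch gotcha 5 early is a startup guard on the servers list. This is a rough sketch only: `hasEmbeddedCredentials` and `assertCredentialFree` are hypothetical helpers, not part of any NATS client. The guard is meant for the separate-env-vars pattern, since the opaque `${connectionString}` carries embedded auth by design.

```typescript
// True when a server entry looks like user:pass@host (with or without a scheme).
function hasEmbeddedCredentials(server: string): boolean {
  // Drop an optional scheme such as "nats://", then keep only the authority part.
  const withoutScheme = server.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  const [authority = ""] = withoutScheme.split("/");
  return authority.includes("@");
}

// Throws before connecting if any entry was hand-composed with credentials.
// The message reports a count only, so secrets never reach the logs.
function assertCredentialFree(servers: string[]): void {
  const offenders = servers.filter(hasEmbeddedCredentials).length;
  if (offenders > 0) {
    throw new Error(
      `${offenders} server entries contain embedded credentials; pass user/pass as connect options instead`
    );
  }
}
```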
59 changes: 59 additions & 0 deletions apps/docs/content/guides/verify-web-agent-protocol.mdx
@@ -0,0 +1,59 @@
---
title: Verify Web Agent Protocol
description: "Guide: Verify Web Agent Protocol"
---

Sub-agent dispatch protocol for end-to-end verification of a Zerops web
service. The main agent reads `develop-verify-matrix` (atom) to learn which
services need this protocol; the protocol body itself lives here so it
ships only when fetched, not in every per-turn payload.

Spawn one sub-agent per web-facing target. Substitute `{targetHostname}`
and `{runtime}` with that service's values when constructing the prompt.
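
The substitution step might look like the sketch below. `fillPrompt` is a hypothetical helper, not part of the protocol. It replaces only the placeholders it is given values for and leaves any others intact, on the assumption that placeholders such as `{url}` are meant to be resolved later by the sub-agent itself.

```typescript
// Replace {name} placeholders that have a supplied value; leave all others as written.
function fillPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) => {
    // Own-property check avoids picking up inherited keys such as "constructor".
    const value = Object.prototype.hasOwnProperty.call(values, key)
      ? values[key]
      : undefined;
    return value ?? match;
  });
}

// Example values are made up for illustration.
const filled = fillPrompt(
  'Verify Zerops service "{targetHostname}" ({runtime}); then open {url}',
  { targetHostname: "app", runtime: "nodejs@22" }
);
```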

---

## Sub-agent dispatch prompt

```
Agent(model="sonnet", prompt="""
Verify Zerops service "{targetHostname}" ({runtime}) works for end users.

## Protocol
1. `zerops_verify serviceHostname="{targetHostname}"` — infrastructure baseline
2. If NOT healthy → VERDICT: FAIL (cite failed checks from zerops_verify response)
3. `zerops_discover service="{targetHostname}"` — get subdomainUrl or connection info
4. Determine reachable URL:
   - subdomainUrl available → use it (public HTTPS)
   - no subdomain, no custom domain → VERDICT: UNCERTAIN (cannot reach from outside)
   - unreachable after timeout → VERDICT: UNCERTAIN
5. `agent-browser open {url}`
6. `agent-browser snapshot` — accessibility tree for AI analysis
7. Evaluate: does the page render meaningful content?
   - Interactive elements (buttons, links, forms)?
   - Text content (headings, paragraphs)?
   - Or empty/broken (empty root div, error page, blank screen)?
8. If concerns: `agent-browser eval "JSON.stringify(Array.from(document.querySelectorAll('script[src]')).map(s=>s.src))"` for loaded scripts
9. For SPAs: `agent-browser eval "window.__errors || []"` AND check if console has errors

## Rules
- zerops_verify unhealthy/degraded → always VERDICT: FAIL (never override infra checks)
- HTTP 401/403 with rendered content (login page, auth challenge) → VERDICT: PASS (auth is working correctly)
- HTTP 401/403 with empty body → VERDICT: UNCERTAIN (cannot determine if intentional)
- zerops_verify healthy + page empty/broken → VERDICT: FAIL (cite what you see)
- zerops_verify healthy + page renders real content → VERDICT: PASS
- agent-browser unavailable or URL unreachable → VERDICT: UNCERTAIN

## Output (mandatory format)
### Infrastructure
zerops_verify status and check summary

### Application
what you observed — DOM content, JS errors, visual state

### Evidence
accessibility tree excerpt or error details

### VERDICT: PASS or FAIL or UNCERTAIN — one-line justification
""")
```
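
The rules in the prompt form a small decision table, which can be sketched as a pure function for use when testing a harness around the sub-agent. This is an illustrative model only: the `Observation` fields and `decideVerdict` are hypothetical names, and `rendersContent: false` stands in for both "empty body" and "empty/broken page".

```typescript
type Verdict = "PASS" | "FAIL" | "UNCERTAIN";

// Hypothetical summary of what the sub-agent observed.
interface Observation {
  infra: "healthy" | "degraded" | "unhealthy"; // zerops_verify result
  browserAvailable: boolean; // agent-browser could be run
  urlReachable: boolean; // a URL was found and it responded
  httpStatus?: number; // status of the page load, if known
  rendersContent: boolean; // snapshot shows meaningful content
}

function decideVerdict(o: Observation): Verdict {
  // Infra checks are never overridden by what the page looks like.
  if (o.infra !== "healthy") return "FAIL";
  // Without a browser or a reachable URL nothing can be concluded.
  if (!o.browserAvailable || !o.urlReachable) return "UNCERTAIN";
  // 401/403: a rendered login page or challenge means auth works; an empty body is ambiguous.
  const authChallenge = o.httpStatus === 401 || o.httpStatus === 403;
  if (authChallenge) return o.rendersContent ? "PASS" : "UNCERTAIN";
  // Healthy infra: the verdict follows what the page actually renders.
  return o.rendersContent ? "PASS" : "FAIL";
}
```

Evaluating the infrastructure rule first mirrors the "never override infra checks" rule in the prompt.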