18 changes: 13 additions & 5 deletions apps/docs/content/guides/choose-queue.mdx
---
title: "Choosing a Message Queue on Zerops"
description: "**Use NATS** for most cases (simple, fast, JetStream persistence). Use **Kafka** only for enterprise event streaming with guaranteed ordering and unlimited retention."
---

**Use NATS** for most cases (simple, fast, optional JetStream persistence layer when durability is needed). Use **Kafka** only for enterprise event streaming with guaranteed ordering and unlimited retention.

## Decision Matrix

| Need | Choice | Why |
|------|--------|-----|
| **General messaging** | **NATS** (default) | Simple auth, fast, JetStream available when needed |
| Enterprise event streaming | Kafka | SASL auth, 3-broker HA, unlimited retention |
| Lightweight pub/sub | NATS — core | Low overhead, 8MB default messages, fire-and-forget |
| Durable queues, replay, at-least-once | NATS — JetStream | Persistent streams, durable consumers, ack/redeliver |
| Event sourcing / audit logs | Kafka | Indefinite topic retention, strong ordering |

## NATS (Default Choice)

NATS exposes **two distinct messaging shapes**. Pick ONE per recipe and write yaml comments / KB content describing only that shape — mixing them confuses porters about what the recipe actually does.

- **Core pub/sub + queue groups**: `nc.subscribe('subject', { queue: 'workers' })`. No persistence; queue groups load-balance delivery across replicas; lost messages stay lost. HA story: surviving cluster nodes keep delivering, no consumer position to restore. Use when fan-out + load balance + at-most-once is enough.
- **JetStream streams + durable consumers**: opens an explicit stream via `JetStreamManager`, subscribes durably via `js.subscribe(...)`. Persistent message store; replay on reconnect; ack/redeliver. HA story: cluster replicates stream state, acked-but-unprocessed messages survive node loss. Use when at-least-once + replay + persistence are required.
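Assuming a nats.js-style client, the two shapes look roughly like this. This is a sketch only: the subject names, the handler wiring, and the consumer-options shape are illustrative, not a verbatim client API.

```javascript
// Shape 1: core pub/sub with a queue group. At-most-once delivery, nothing
// persisted; the queue group "workers" load-balances across replicas.
function startCoreWorker(nc, handle) {
  return nc.subscribe("jobs.created", {
    queue: "workers",
    callback: (err, msg) => {
      if (!err) handle(msg); // lost messages stay lost: no ack, no replay
    },
  });
}

// Shape 2: JetStream with an explicit stream and a durable consumer.
// At-least-once delivery; the cluster replicates stream state at HA tiers,
// so acked-but-unprocessed messages survive node loss.
async function startJetStreamWorker(jsm, js, handle) {
  await jsm.streams.add({ name: "JOBS", subjects: ["jobs.*"] }); // persistent store
  const sub = await js.subscribe("jobs.created", {
    config: { durable_name: "workers", ack_policy: "explicit" }, // illustrative opts
  });
  for await (const msg of sub) {
    handle(msg);
    msg.ack(); // unacked messages are redelivered
  }
}
```

Note how only the second shape ever touches a stream; that is the signal the authoring rule below keys on.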

**Authoring rule**: a recipe's yaml comments and KB bullets should reflect the shape the code actually uses. If the worker only calls `nc.subscribe()` with a queue group and never opens a stream, do not invoke JetStream language at HA tiers — the recipe has no stream to replicate. If the worker opens a JetStream stream, the JetStream HA story is the relevant one.

- Ports: 4222 (client), 8222 (HTTP monitoring)
- Auth: user `zerops` + auto-generated password
- **Connection** — two supported patterns, pick ONE:
- **Separate env vars** (recommended, works with every NATS client library): pass `servers: ${hostname}:${port}` plus `user: ${user}, pass: ${password}` as client-side connect options. The servers list stays credential-free.
- **Opaque connection string**: pass `${connectionString}` directly as the servers option — the platform builds a correctly-formatted URL with embedded auth that the NATS server expects.
- JetStream capability: enabled by default (`JET_STREAM_ENABLED=1`); recipes opt in by writing JetStream client code. Setting `JET_STREAM_ENABLED=0` hard-disables the capability across the project.
- Storage: Up to 40GB memory + 250GB file store
- Max message: 8MB default, 64MB max (`MAX_PAYLOAD`)
- Health check: `GET /healthz` on port 8222
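The two supported connection patterns can be sketched as connect-option builders. The helper names are hypothetical, and the cross-service env var names (`queue_hostname`, `queue_connectionString`, etc.) assume a NATS service named `queue`.

```javascript
// Pattern 1: separate env vars. Credential-free servers list plus explicit
// user/pass connect options; works with every NATS client library.
function separateVarsOptions(env) {
  return {
    servers: `${env.queue_hostname}:${env.queue_port}`, // no credentials embedded
    user: env.queue_user,
    pass: env.queue_password,
  };
}

// Pattern 2: opaque connection string. The platform builds a correctly
// formatted URL with embedded auth; pass it through untouched.
function connectionStringOptions(env) {
  return { servers: env.queue_connectionString };
}

// DO NOT hand-compose `nats://user:pass@host:4222` from the separate vars:
// many clients parse the embedded credentials AND attempt auth separately,
// and the server rejects the double-auth with "Authorization Violation"
// (see Gotcha 5 below).
```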
## Gotchas
1. **NATS config changes need restart**: No hot-reload — changing env vars requires service restart
2. **Kafka single-node has no replication**: 1 broker = 3 partitions but zero redundancy
3. **NATS JetStream HA sync interval**: 1-minute sync across nodes — brief data lag possible. Applies only to recipes that actually open JetStream streams; core pub/sub recipes are unaffected.
4. **Kafka SASL only**: No anonymous connections — always use the generated credentials
5. **NATS authorization violation from a hand-composed URL**: do not build a `nats://user:pass@host:4222` URL from the separate env vars. Most NATS client libraries will parse the embedded credentials AND separately attempt SASL with the same values, producing a double-auth that the server rejects with `Authorization Violation` on the first CONNECT frame (symptom: startup crash, no successful subscription). Use either the separate env vars passed as connect options (credential-free servers list) or the opaque `${connectionString}` the platform builds for you — both patterns in the Connection section above avoid the double-auth path.
22 changes: 11 additions & 11 deletions apps/docs/content/guides/environment-variables.mdx
```yaml
run:
  envVariables:
    db_hostname: ${db_hostname}       # SELF-SHADOW — see next section
    db_password: ${db_password}       # SELF-SHADOW
    queue_hostname: ${queue_hostname} # SELF-SHADOW
    API_URL: ${API_URL}               # SELF-SHADOW (project-level variant)
```

The referenced variable does **not** need to exist at definition time — Zerops resolves it at container start.

At runtime, the worker tries to connect to `"${db_hostname}:5432"` and crashes. The fix is to **delete the entire block** — those vars are already in the container's env without any declaration.

This applies identically to project-level vars (`${API_URL}`, `${APP_SECRET}`) and cross-service vars (`${db_hostname}`, `${queue_user}`) — both auto-propagate, both self-shadow under the same rule.

**Hostname transformation**: dashes become underscores. Service `my-db` variable `port` is `${my_db_port}`.
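The transformation is mechanical, a one-liner:

```javascript
// Cross-service env var name: dashes in the service hostname become
// underscores, then the variable name is appended.
function crossServiceVarName(serviceHostname, varName) {
  return `${serviceHostname.replace(/-/g, "_")}_${varName}`;
}
// e.g. service "my-db", variable "port" -> "my_db_port"
```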

Project variables are **automatically available in every service, in both build and runtime containers**.
```yaml
build:
buildCommands:
- echo "building for $API_URL" # shell reads the OS env var
- VITE_API_URL=$API_URL npm run build # or pass it forward by shell prefix
```

**In `build.envVariables` YAML** (to compose a derived var that the bundler consumes), reference the project var directly without prefix:
```yaml
build:
envVariables:
VITE_API_URL: ${API_URL} # project var API_URL read as-is, NO RUNTIME_ prefix
```

**In `run.envVariables` YAML** (to forward a project var under a framework-conventional name without creating a shadow), reference directly without prefix:
```yaml
run:
envVariables:
CORS_ALLOWED_ORIGIN: ${FRONTEND_URL} # project var FRONTEND_URL forwarded under a different name
```

**DO NOT** re-reference an auto-injected variable under its SAME name — that's a self-shadow loop. Applies to BOTH project-level vars AND cross-service vars:

```yaml
envVariables:
PROJECT_NAME: ${PROJECT_NAME} # project-level self-shadow
API_URL: ${API_URL} # project-level self-shadow
db_hostname: ${db_hostname} # cross-service self-shadow
queue_user: ${queue_user} # cross-service self-shadow
```
Dual-runtime recipes (frontend SPA + backend API on the same platform) use project env variables to wire the two services together:
```yaml
project:
envVariables:
API_URL: https://apistage-${zeropsSubdomainHost}-3000.prg1.zerops.app
FRONTEND_URL: https://appstage-${zeropsSubdomainHost}.prg1.zerops.app
```

The platform resolves `${zeropsSubdomainHost}` when injecting the value into services at container start. The frontend consumes `API_URL` via plain `${API_URL}` in `build.envVariables` (baking it into the bundle at compile time) — **no `RUNTIME_` prefix**. The API consumes `FRONTEND_URL` via plain `${FRONTEND_URL}` in `run.envVariables` (for CORS allow-list). The same names must be set on the workspace project via `zerops_env project=true action=set` after provision, so workspace verification doesn't see literal `${FRONTEND_URL}` strings.
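The resolution the platform performs at container start can be modeled as plain placeholder substitution. This is a sketch of the behavior, not the platform's actual resolver:

```javascript
// Resolve ${name} placeholders against a map of known values, leaving
// unknown placeholders untouched. An unresolved literal "${...}" string
// is exactly what workspace verification flags.
function resolvePlaceholders(value, vars) {
  return value.replace(/\$\{([A-Za-z0-9_]+)\}/g, (match, name) =>
    name in vars ? vars[name] : match
  );
}
```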

## Secret Variables

Zerops auto-generates variables per service (e.g., `hostname`, `PATH`, DB connection variables).

## Common Mistakes

- **DO NOT** re-reference auto-injected vars under their own name — self-shadow loop. Applies to BOTH project-level (`API_URL: ${API_URL}`) AND cross-service (`db_hostname: ${db_hostname}`, `queue_user: ${queue_user}`).
- **DO NOT** declare cross-service vars you only want to READ — they are already in the container's OS env. Read via `process.env.db_hostname` / `getenv('db_hostname')` directly. Declare in `run.envVariables` only to RENAME (e.g. `DB_HOST: ${db_hostname}`) or to set mode flags.
- **DO NOT** forget to restart after GUI/API env changes — the process won't see new values
- **DO NOT** expect `envReplace` to recurse subdirectories — it does not
59 changes: 59 additions & 0 deletions apps/docs/content/guides/verify-web-agent-protocol.mdx
---
title: Verify Web Agent Protocol
description: "Guide: Verify Web Agent Protocol"
---

Sub-agent dispatch protocol for end-to-end verification of a Zerops web
service. The main agent reads `develop-verify-matrix` (atom) for which
services need this protocol; the protocol body itself lives here so it
ships only when fetched, not on every per-turn payload.

Spawn one sub-agent per web-facing target. Substitute `{targetHostname}`
and `{runtime}` with that service's values when constructing the prompt.

---

## Sub-agent dispatch prompt

```
Agent(model="sonnet", prompt="""
Verify Zerops service "{targetHostname}" ({runtime}) works for end users.

## Protocol
1. `zerops_verify serviceHostname="{targetHostname}"` — infrastructure baseline
2. If NOT healthy → VERDICT: FAIL (cite failed checks from zerops_verify response)
3. `zerops_discover service="{targetHostname}"` — get subdomainUrl or connection info
4. Determine reachable URL:
- subdomainUrl available → use it (public HTTPS)
- no subdomain, no custom domain → VERDICT: UNCERTAIN (cannot reach from outside)
- unreachable after timeout → VERDICT: UNCERTAIN
5. `agent-browser open {url}`
6. `agent-browser snapshot` — accessibility tree for AI analysis
7. Evaluate: does the page render meaningful content?
- Interactive elements (buttons, links, forms)?
- Text content (headings, paragraphs)?
- Or empty/broken (empty root div, error page, blank screen)?
8. If concerns: `agent-browser eval "JSON.stringify(Array.from(document.querySelectorAll('script[src]')).map(s=>s.src))"` for loaded scripts
9. For SPAs: `agent-browser eval "window.__errors || []"` AND check if console has errors

## Rules
- zerops_verify unhealthy/degraded → always VERDICT: FAIL (never override infra checks)
- HTTP 401/403 with rendered content (login page, auth challenge) → VERDICT: PASS (auth is working correctly)
- HTTP 401/403 with empty body → VERDICT: UNCERTAIN (cannot determine if intentional)
- zerops_verify healthy + page empty/broken → VERDICT: FAIL (cite what you see)
- zerops_verify healthy + page renders real content → VERDICT: PASS
- agent-browser unavailable or URL unreachable → VERDICT: UNCERTAIN

## Output (mandatory format)
### Infrastructure
zerops_verify status and check summary

### Application
what you observed — DOM content, JS errors, visual state

### Evidence
accessibility tree excerpt or error details

### VERDICT: PASS or FAIL or UNCERTAIN — one-line justification
""")
```
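The verdict rules in the prompt collapse to a small decision function. This is a hypothetical sketch for reasoning about the rules; the field names are illustrative and not part of the protocol:

```javascript
// Encode the Rules section: infra checks always win, then reachability,
// then whether the page actually rendered meaningful content.
function decideVerdict({ infraHealthy, reachable, httpStatus, renderedContent }) {
  if (!infraHealthy) return "FAIL";     // never override infra checks
  if (!reachable) return "UNCERTAIN";   // no URL, or unreachable after timeout
  if (httpStatus === 401 || httpStatus === 403) {
    // an auth challenge with a rendered page means auth is working
    return renderedContent ? "PASS" : "UNCERTAIN";
  }
  return renderedContent ? "PASS" : "FAIL"; // healthy infra + empty page = FAIL
}
```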