Merged
19 commits
5cdf1d2
chore: add engines field, @types/node, tsx; remove stale spec file
jpr5 Mar 21, 2026
900399b
feat: v1.6.0 — provider endpoints, chaos, metrics, record-and-replay
jpr5 Mar 21, 2026
2773d9b
test: 1250 tests — comprehensive coverage for all v1.6.0 features
jpr5 Mar 21, 2026
402c8fa
docs: v1.6.0 documentation — 6 new pages, update all existing pages
jpr5 Mar 21, 2026
6be3821
chore: bump version to 1.6.0, update Chart.yaml appVersion, add CHANG…
jpr5 Mar 21, 2026
8f14082
fix: type safety — RecordProviderKey, null journal body, exhaustive c…
jpr5 Mar 21, 2026
63e718d
fix: observability — metrics crash guard, Bedrock truncation warning …
jpr5 Mar 21, 2026
3d479ef
docs: correct --strict mode documentation in SKILL.md
jpr5 Mar 21, 2026
de8cfc3
test: cover metrics crash guard and Bedrock CRC truncation
jpr5 Mar 21, 2026
3657cf1
test: add unit tests for drift remediation scripts
jpr5 Mar 21, 2026
3fd1ec1
fix: chaos header validation, range clamping, and disconnect integrat…
jpr5 Mar 21, 2026
65a5b1c
fix: add error handling around metrics instrumentation in response fi…
jpr5 Mar 21, 2026
e454c12
refactor: tighten recorder pipeline typing with RecordProviderKey
jpr5 Mar 21, 2026
7807b1e
feat: validate StreamingProfile and ChaosConfig ranges at fixture loa…
jpr5 Mar 21, 2026
8014b70
docs: correct docker.html errors, add missing endpoints, fix CHANGELO…
jpr5 Mar 21, 2026
72eda7c
fix: structured logger for chaos/stream warnings; EventStream bounds;…
jpr5 Mar 21, 2026
cb09880
test: regression coverage for logger migration, EventStream bounds, b…
jpr5 Mar 21, 2026
8540122
fix: address review — recorder logging, strict fail-fast, chaos valid…
jpr5 Mar 22, 2026
c694c9b
docs: fix endpoint label (Groq not Azure) and metrics port (4010 not …
jpr5 Mar 22, 2026
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,14 @@
# @copilotkit/llmock

## 1.6.0

### Minor Changes

- Provider-specific endpoints: dedicated routes for Bedrock (`/model/{modelId}/invoke`), Ollama (`/api/chat`, `/api/generate`), Cohere (`/v2/chat`), and Azure OpenAI deployment-based routing (`/openai/deployments/{id}/chat/completions`)
- Chaos injection: `ChaosConfig` type with `drop`, `malformed`, and `disconnect` actions; supports per-fixture chaos via `chaos` config on each fixture and server-wide chaos via `--chaos-drop`, `--chaos-malformed`, and `--chaos-disconnect` CLI flags
- Metrics: `GET /metrics` endpoint exposing Prometheus text format with request counters and latency histograms per provider and route
- Record-and-replay: `--record` flag and `proxyAndRecord` helper that proxies requests to real LLM APIs, collapses streaming responses, and writes fixture JSON to disk for future playback

## 1.5.1

### Patch Changes
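The `GET /metrics` entry in the 1.6.0 notes says the endpoint exposes Prometheus text format. As a rough illustration of how a test could consume that output, the TypeScript sketch below parses text-format samples into plain objects. The function name `parsePrometheusText` and the `Sample` type are hypothetical helpers written for this example; they are not part of llmock, and the metric names llmock actually emits are not shown in this diff.

```typescript
// Hypothetical helper, not part of llmock: a minimal parser for the
// Prometheus text exposition format (one sample per line, "#" lines are
// HELP/TYPE comments).

interface Sample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Prometheus writes infinities as "+Inf" / "-Inf", which Number() does not accept.
function parseValue(raw: string): number {
  if (raw === "+Inf") return Infinity;
  if (raw === "-Inf") return -Infinity;
  return Number(raw);
}

function parsePrometheusText(text: string): Sample[] {
  const samples: Sample[] = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    // Skip blank lines and "# HELP" / "# TYPE" comment lines.
    if (line === "" || line.startsWith("#")) continue;

    // metric_name{label="value",...} 123 [optional timestamp]
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/.exec(line);
    if (!match) continue;

    // Label values are kept as written; escape sequences are not decoded.
    const labels: Record<string, string> = {};
    const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"/g;
    for (const m of (match[2] ?? "").matchAll(labelPattern)) {
      labels[m[1]] = m[2];
    }

    samples.push({ name: match[1], labels, value: parseValue(match[3]) });
  }
  return samples;
}
```

Assuming a server started with `--metrics` on the default port, a test might pass the body of `fetch("http://127.0.0.1:4010/metrics")` to `parsePrometheusText` and assert on the returned samples.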
44 changes: 30 additions & 14 deletions README.md
@@ -1,6 +1,6 @@
# @copilotkit/llmock [![Unit Tests](https://github.com/CopilotKit/llmock/actions/workflows/test-unit.yml/badge.svg)](https://github.com/CopilotKit/llmock/actions/workflows/test-unit.yml) [![Drift Tests](https://github.com/CopilotKit/llmock/actions/workflows/test-drift.yml/badge.svg)](https://github.com/CopilotKit/llmock/actions/workflows/test-drift.yml) [![npm version](https://img.shields.io/npm/v/@copilotkit/llmock)](https://www.npmjs.com/package/@copilotkit/llmock)

Deterministic mock LLM server for testing. A real HTTP server on a real port — not an in-process interceptor — so every process in your stack (Playwright, Next.js, agent workers, microservices) can point at it via `OPENAI_BASE_URL` / `ANTHROPIC_BASE_URL` and get reproducible, instant responses. Streams SSE in real OpenAI, Claude, Gemini, Bedrock, and Azure API formats, driven entirely by fixtures. Zero runtime dependencies.
Deterministic mock LLM server for testing. A real HTTP server on a real port — not an in-process interceptor — so every process in your stack (Playwright, Next.js, agent workers, microservices) can point at it via `OPENAI_BASE_URL` / `ANTHROPIC_BASE_URL` and get reproducible, instant responses. Streams SSE in real OpenAI, Claude, Gemini, Bedrock, Azure, Vertex AI, Ollama, and Cohere API formats, driven entirely by fixtures. Zero runtime dependencies.

## Quick Start

@@ -45,7 +45,7 @@ MSW can't intercept any of those calls. llmock can — it's a real server on a r
**Use llmock when:**

- Multiple processes need to hit the same mock (E2E tests, agent frameworks, microservices)
- You want multi-provider SSE format out of the box (OpenAI, Claude, Gemini)
- You want multi-provider SSE format out of the box (OpenAI, Claude, Gemini, Bedrock, Azure, Vertex AI, Ollama, Cohere)
- You prefer defining fixtures as JSON files rather than code
- You need a standalone CLI server

@@ -72,17 +72,20 @@ MSW can't intercept any of those calls. llmock can — it's a real server on a r

## Features

- **[Multi-provider support](https://llmock.copilotkit.dev/compatible-providers.html)** — [OpenAI Chat Completions](https://llmock.copilotkit.dev/chat-completions.html), [OpenAI Responses](https://llmock.copilotkit.dev/responses-api.html), [Anthropic Claude](https://llmock.copilotkit.dev/claude-messages.html), [Google Gemini](https://llmock.copilotkit.dev/gemini.html), [AWS Bedrock](https://llmock.copilotkit.dev/aws-bedrock.html), [Azure OpenAI](https://llmock.copilotkit.dev/azure-openai.html)
- **[Multi-provider support](https://llmock.copilotkit.dev/compatible-providers.html)** — [OpenAI Chat Completions](https://llmock.copilotkit.dev/chat-completions.html), [OpenAI Responses](https://llmock.copilotkit.dev/responses-api.html), [Anthropic Claude](https://llmock.copilotkit.dev/claude-messages.html), [Google Gemini](https://llmock.copilotkit.dev/gemini.html), [AWS Bedrock](https://llmock.copilotkit.dev/aws-bedrock.html) (streaming + Converse), [Azure OpenAI](https://llmock.copilotkit.dev/azure-openai.html), [Vertex AI](https://llmock.copilotkit.dev/vertex-ai.html), [Ollama](https://llmock.copilotkit.dev/ollama.html), [Cohere](https://llmock.copilotkit.dev/cohere.html)
- **[Embeddings API](https://llmock.copilotkit.dev/embeddings.html)** — OpenAI-compatible embedding responses with configurable dimensions
- **[Structured output / JSON mode](https://llmock.copilotkit.dev/structured-output.html)** — `response_format`, `json_schema`, and function calling
- **[Sequential responses](https://llmock.copilotkit.dev/sequential-responses.html)** — Stateful multi-turn fixtures that return different responses on each call
- **[Streaming physics](https://llmock.copilotkit.dev/streaming-physics.html)** — Configurable `ttft`, `tps`, and `jitter` for realistic timing
- **[WebSocket APIs](https://llmock.copilotkit.dev/websocket.html)** — OpenAI Responses WS, Realtime API, and Gemini Live
- **[Error injection](https://llmock.copilotkit.dev/error-injection.html)** — One-shot errors, rate limiting, and provider-specific error formats
- **[Chaos testing](https://llmock.copilotkit.dev/chaos-testing.html)** — Probabilistic failure injection: 500 errors, malformed JSON, mid-stream disconnects
- **[Prometheus metrics](https://llmock.copilotkit.dev/metrics.html)** — Request counts, latencies, and fixture match rates at `/metrics`
- **[Request journal](https://llmock.copilotkit.dev/docs.html)** — Record, inspect, and assert on every request
- **[Fixture validation](https://llmock.copilotkit.dev/fixtures.html)** — Schema validation at load time with `--validate-on-load`
- **CLI with hot-reload** — Standalone server with `--watch` for live fixture editing
- **[Docker + Helm](https://llmock.copilotkit.dev/docker.html)** — Container image and Helm chart for CI/CD pipelines
- **Record-and-replay** — VCR-style proxy-on-miss records real API responses as fixtures for deterministic replay
- **[Drift detection](https://llmock.copilotkit.dev/drift-detection.html)** — Daily CI runs against real APIs to catch response format changes
- **Claude Code integration** — `/write-fixtures` skill teaches your AI assistant how to write fixtures correctly

@@ -92,17 +95,24 @@ MSW can't intercept any of those calls. llmock can — it's a real server on a r
llmock [options]
```

| Option | Short | Default | Description |
| -------------------- | ----- | ------------ | ----------------------------------------- |
| `--port` | `-p` | `4010` | Port to listen on |
| `--host` | `-h` | `127.0.0.1` | Host to bind to |
| `--fixtures` | `-f` | `./fixtures` | Path to fixtures directory or file |
| `--latency` | `-l` | `0` | Latency between SSE chunks (ms) |
| `--chunk-size` | `-c` | `20` | Characters per SSE chunk |
| `--watch` | `-w` | | Watch fixture path for changes and reload |
| `--log-level` | | `info` | Log verbosity: `silent`, `info`, `debug` |
| `--validate-on-load` | | | Validate fixture schemas at startup |
| `--help` | | | Show help |
| Option | Short | Default | Description |
| -------------------- | ----- | ------------ | ------------------------------------------- |
| `--port` | `-p` | `4010` | Port to listen on |
| `--host` | `-h` | `127.0.0.1` | Host to bind to |
| `--fixtures` | `-f` | `./fixtures` | Path to fixtures directory or file |
| `--latency` | `-l` | `0` | Latency between SSE chunks (ms) |
| `--chunk-size` | `-c` | `20` | Characters per SSE chunk |
| `--watch` | `-w` | | Watch fixture path for changes and reload |
| `--log-level` | | `info` | Log verbosity: `silent`, `info`, `debug` |
| `--validate-on-load` | | | Validate fixture schemas at startup |
| `--chaos-drop` | | `0` | Chaos: probability of 500 errors (0-1) |
| `--chaos-malformed` | | `0` | Chaos: probability of malformed JSON (0-1) |
| `--chaos-disconnect` | | `0` | Chaos: probability of disconnect (0-1) |
| `--metrics` | | | Enable Prometheus metrics at /metrics |
| `--record` | | | Record mode: proxy unmatched to real APIs |
| `--strict` | | | Strict mode: fail on unmatched requests |
| `--provider-*` | | | Upstream URL per provider (with `--record`) |
| `--help` | | | Show help |

```bash
# Start with bundled example fixtures
@@ -113,6 +123,12 @@ llmock -p 8080 -f ./my-fixtures

# Simulate slow responses
llmock --latency 100 --chunk-size 5

# Record mode: proxy unmatched requests to real APIs and save as fixtures
llmock --record --provider-openai https://api.openai.com --provider-anthropic https://api.anthropic.com

# Strict mode in CI: fail if any request doesn't match a fixture
llmock --strict -f ./fixtures
```
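The `--latency 100 --chunk-size 5` example above combines the two streaming options from the table: characters per SSE chunk and milliseconds between chunks. A minimal sketch of that idea, assuming a plain fixed-width split, could look like the following. The names `splitIntoChunks` and `streamChunks` are hypothetical and this is not llmock's implementation; how llmock frames each chunk for a given provider is not shown in this diff.

```typescript
// Hypothetical sketch, not llmock's code: split a response into fixed-size
// pieces and emit them with a pause in between.

function splitIntoChunks(text: string, chunkSize: number): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: string[] = [];
  // Note: slice() counts UTF-16 code units, so a chunk boundary can fall
  // inside a surrogate pair; a real server may need to handle that.
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Yields each chunk after waiting latencyMs, mimicking a slow stream.
async function* streamChunks(
  text: string,
  chunkSize: number,
  latencyMs: number,
): AsyncGenerator<string> {
  for (const chunk of splitIntoChunks(text, chunkSize)) {
    if (latencyMs > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, latencyMs));
    }
    yield chunk;
  }
}
```

With a chunk size of 5, the 13-character string `"Hello, world!"` splits into three pieces, so at 100 ms per chunk the sketch would take roughly 300 ms to emit it.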

## Documentation
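The `--chaos-drop`, `--chaos-malformed`, and `--chaos-disconnect` options in the README table each take a probability between 0 and 1, and the commit list mentions range clamping. One way such a decision could be made per request is sketched below in TypeScript. The `ChaosRates` type, the `pickChaosAction` function, and the order in which the actions are rolled are assumptions made for illustration; the diff does not show how llmock's `ChaosConfig` is evaluated.

```typescript
// Hypothetical sketch, not llmock's code: decide which chaos action (if any)
// applies to a request, given one probability per action.

type ChaosAction = "drop" | "malformed" | "disconnect";

// Illustrative shape only; llmock's real ChaosConfig may differ.
interface ChaosRates {
  drop?: number;
  malformed?: number;
  disconnect?: number;
}

// Treat missing or NaN rates as 0 and clamp everything else into [0, 1].
function clamp01(value: number | undefined): number {
  if (value === undefined || Number.isNaN(value)) return 0;
  return Math.min(1, Math.max(0, value));
}

// Rolls each action independently in a fixed order (an assumption) and
// returns the first one that fires, or null when the request proceeds normally.
// The random source is injectable so tests can be deterministic.
function pickChaosAction(
  rates: ChaosRates,
  random: () => number = Math.random,
): ChaosAction | null {
  const order: ChaosAction[] = ["drop", "malformed", "disconnect"];
  for (const action of order) {
    if (random() < clamp01(rates[action])) return action;
  }
  return null;
}
```

Injecting `random` keeps the probabilistic behavior testable: a stub that always returns a fixed number makes each outcome reproducible.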
2 changes: 1 addition & 1 deletion charts/llmock/Chart.yaml
@@ -3,4 +3,4 @@ name: llmock
description: Deterministic mock LLM server for testing (OpenAI, Anthropic, Gemini)
type: application
version: 0.1.0
appVersion: "1.4.0"
appVersion: "1.6.0"