test: add otel-verify harness for OTLP backend smoke tests #525

tombeckenham wants to merge 6 commits into TanStack:feat/otel-middleware from …
Conversation
…against OTLP backends

Boots in-process aimock and runs three deterministic chat scenarios (basic-text, with-tool, error) through otelMiddleware, exporting via OTLP/HTTP. Backend is selected by the OTEL_BACKEND env var; presets cover Jaeger, Phoenix, Langfuse, PostHog, Sentry, Logfire, Traceloop, Datadog, and Helicone. docker-compose.yml provides locally runnable services for the self-hostable backends. The package is private: true — never published — and is intended as a one-shot manual verification tool, not CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
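A minimal sketch of how the `OTEL_BACKEND` preset selection plausibly works, for orientation. The `BackendPreset` shape is inferred from the `backends.ts` snippets quoted in the reviews below; the lookup itself is a hypothetical reconstruction, not the harness's actual code.

```ts
// Hypothetical sketch only; the real preset map lives in
// testing/otel-verify/src/backends.ts. The shape (name / endpoint() /
// headers()) is inferred from the review snippets below.
interface BackendPreset {
  name: string
  endpoint: () => string
  headers: () => Record<string, string>
}

declare const backends: Record<string, BackendPreset>

const key = process.env.OTEL_BACKEND ?? 'jaeger'
const backend = backends[key]
if (!backend) throw new Error(`Unknown OTEL_BACKEND preset: ${key}`)
```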
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

Adds a new OpenTelemetry middleware and related docs, tests, E2E capture, and a manual OTLP verification harness. The middleware instruments chat(), iteration, and tool lifecycles with spans and histograms, supports optional content capture/redaction and extension hooks, and is exported at …

Changes

OpenTelemetry Middleware Feature
Manual OpenTelemetry Verification Harness
Sequence Diagram

```mermaid
sequenceDiagram
participant Client
participant ChatRuntime
participant OTelMiddleware
participant SpanMetricAPI
participant UserCallbacks
Client->>ChatRuntime: chat(messages)
ChatRuntime->>OTelMiddleware: onStart(ctx)
OTelMiddleware->>UserCallbacks: spanNameFormatter(kind="chat")
OTelMiddleware->>SpanMetricAPI: tracer.startSpan("chat")
OTelMiddleware->>UserCallbacks: onBeforeSpanStart(chat span)
loop per iteration
ChatRuntime->>OTelMiddleware: onConfig(beforeModel)
OTelMiddleware->>SpanMetricAPI: tracer.startSpan("iteration", {parent: chat})
OTelMiddleware->>UserCallbacks: attributeEnricher(kind="iteration")
ChatRuntime->>OTelMiddleware: onChunk(delta)
OTelMiddleware->>OTelMiddleware: buffer assistant text / emit events
ChatRuntime->>OTelMiddleware: onUsage(usage)
OTelMiddleware->>SpanMetricAPI: meter.record("gen_ai.client.token.usage")
loop per tool call
ChatRuntime->>OTelMiddleware: onBeforeToolCall(toolCall)
OTelMiddleware->>SpanMetricAPI: tracer.startSpan("tool", {parent: iteration})
OTelMiddleware->>UserCallbacks: attributeEnricher(kind="tool")
ChatRuntime->>OTelMiddleware: onAfterToolCall(result)
OTelMiddleware->>SpanMetricAPI: span.recordException() / span.setStatus()
OTelMiddleware->>SpanMetricAPI: span.end()
end
end
ChatRuntime->>OTelMiddleware: onFinish
OTelMiddleware->>SpanMetricAPI: meter.record("gen_ai.client.operation.duration")
OTelMiddleware->>UserCallbacks: onSpanEnd(chat span)
OTelMiddleware->>SpanMetricAPI: chat span.end()
```
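To make the lifecycle above concrete, here is a hedged usage sketch. `chat` and the hook names (`spanNameFormatter`, `attributeEnricher`, `onSpanEnd`) come from this PR's diagram; the `otelMiddleware` option names and signatures are assumptions, not the package's confirmed API.

```ts
// Sketch under assumptions; not the confirmed otelMiddleware API surface.
import { metrics, trace } from '@opentelemetry/api'
// chat and otelMiddleware are the exports this PR adds to @tanstack/ai.
import { chat, otelMiddleware } from '@tanstack/ai'

// Provider adapter (e.g. from @tanstack/ai-openai); out of scope here.
declare const adapter: any

const middleware = otelMiddleware({
  tracer: trace.getTracer('my-app'),
  meter: metrics.getMeter('my-app'),
  // Hook names are taken from the sequence diagram; signatures are guesses.
  spanNameFormatter: ({ kind }: { kind: 'chat' | 'iteration' | 'tool' }) =>
    `genai.${kind}`,
  attributeEnricher: ({ kind }: { kind: string }) => ({
    'app.component': kind,
  }),
  onSpanEnd: () => {},
})

const stream = chat({
  adapter,
  messages: [{ role: 'user', content: 'hello' }],
  middleware: [middleware],
})
```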
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.
View your CI Pipeline Execution ↗ for commit 91e4a87
☁️ Nx Cloud last updated this comment at
@tanstack/ai
@tanstack/ai-anthropic
@tanstack/ai-client
@tanstack/ai-code-mode
@tanstack/ai-code-mode-skills
@tanstack/ai-devtools-core
@tanstack/ai-elevenlabs
@tanstack/ai-event-client
@tanstack/ai-fal
@tanstack/ai-gemini
@tanstack/ai-grok
@tanstack/ai-groq
@tanstack/ai-isolate-cloudflare
@tanstack/ai-isolate-node
@tanstack/ai-isolate-quickjs
@tanstack/ai-ollama
@tanstack/ai-openai
@tanstack/ai-openrouter
@tanstack/ai-preact
@tanstack/ai-react
@tanstack/ai-react-ui
@tanstack/ai-solid
@tanstack/ai-solid-ui
@tanstack/ai-svelte
@tanstack/ai-vue
@tanstack/ai-vue-ui
@tanstack/preact-ai-devtools
@tanstack/react-ai-devtools
@tanstack/solid-ai-devtools
commit:
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/typescript/ai/src/middlewares/otel.ts`:
- Around line 221-226: The redactContent function currently logs the full thrown
error object (console.warn('[otelMiddleware] otel.redact failed', err)) which
may leak sensitive text; change this to avoid printing the error instance —
e.g., log a constant message or at most the error name only — then return
REDACTION_FAILED_SENTINEL; update the console.warn call in redactContent (the
otel.redact usage) to remove the error object from the log.
In `@testing/e2e/src/routes/api.middleware-test.ts`:
- Around line 11-22: The `@opentelemetry/api` type import block (types like
AttributeValue, Attributes, Context, Histogram, Meter, MetricOptions, Span,
SpanContext, SpanStatus, Tracer) is out of order and triggers import/order;
relocate this entire import statement so it sits in the third‑party/external
imports group that eslint expects (i.e., alongside other node_modules imports)
rather than its current position, preserving the same named type list and
maintaining it as a type-only import.
- Around line 81-97: The fake tracer's startSpan currently ignores the passed
parent Context (_ctx) and recordOtelSpan/CapturedSpan have no parent field, so
span hierarchy can't be tested; add an optional parentSpanId?: string to the
CapturedSpan interface, extract the parent span ID from the incoming _ctx inside
startSpan (use the OpenTelemetry context/span helpers to get the active span and
its spanContext().spanId), and include that parentSpanId when calling
recordOtelSpan so tests can observe span nesting; update any consumers/mocks
expecting CapturedSpan accordingly.
In `@testing/otel-verify/docker-compose.yml`:
- Around line 10-12: Comments claim images are pinned but several services use
floating tags (e.g., image tags "16-alpine", "3", "latest"); change those
floating tags in docker-compose.yml to explicit, immutable tags (replace
"16-alpine", "3", "latest" with concrete versioned tags like "16.5-alpine",
"3.12.8", "1.2.3" or whatever exact upstream tag you tested), update the top
comment to reflect the exact pinned-tag policy, and ensure the same fixes are
applied at the other occurrences called out in the review (around the other
image lines mentioned).
In `@testing/otel-verify/src/backends.ts`:
- Around line 154-160: The Datadog backend config (the datadog object: endpoint
and headers functions) must use the OTLP-compatible URL and headers; change
endpoint() to use "https://otlp-trace.{DD_SITE}/v1/traces" and update headers()
to return 'dd-api-key' and 'dd-otlp-source' (use envOrThrow('DD_API_KEY') and
envOrThrow('DD_OTLP_SOURCE') to ensure both are provided). Ensure you keep using
process.env.DD_SITE fallback as before (e.g., 'datadoghq.com') when constructing
the new OTLP endpoint.
In `@testing/otel-verify/src/index.ts`:
- Line 16: The import specifiers for the OpenTelemetry API are out of order and
trigger eslint's sort-imports rule; reorder the named imports in the import
statement so the specifiers are alphabetically sorted (e.g., place 'metrics'
before 'trace') in the import from '@opentelemetry/api' to satisfy linting.
- Around line 91-103: The loop currently treats every successful run as "ok"
even for the special scenario with id === 'error' which is expected to throw;
update the logic in the for-loop over scenarios (the code that calls
scenario.run(tracer, meter) and writes "ok") so that when scenario.id ===
'error' and the call does not throw you mark it as a failure: print a clear
failure message and set exitCode = 1 (same as in the catch block), otherwise
keep printing "ok" for true successes; adjust only the success path around
scenario.run to check scenario.id and handle the expected-error case.
- Around line 65-87: The README claims the harness "exports spans + metrics over
OTLP/HTTP" but the runtime only wires a trace exporter (NodeSDK with
OTLPTraceExporter) and uses a no-op meter (metrics.getMeter) for middleware;
either update documentation to remove/clarify metrics export from the overview
and verification checklist and change the final output text to "traces sent"
only, or implement a proper metrics pipeline by creating and wiring a metrics
exporter (e.g., OTLP metrics exporter / MeterProvider) and configuring the SDK
(or a separate MeterProvider) so the middleware receives a real meter instead of
the no-op meter; locate NodeSDK, OTLPTraceExporter, meter (metrics.getMeter) and
the README verification checklist to apply the chosen fix.
In `@testing/otel-verify/src/scenarios.ts`:
- Around line 131-142: The empty catch swallows all failures; change it to only
swallow the expected synthetic error by catching the thrown error object (from
the chat() stream/drain flow using makeAdapter(), makeOtelMiddleware(tracer,
meter), explode) and if its message (or a distinguishing property) does not
include or equal "synthetic verify-otel error" rethrow it; otherwise
return/ignore to allow the harness to continue. Ensure this check is applied in
the catch block that surrounds the chat(...)/await drain(stream) sequence so
transport/setup/provider exceptions are propagated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f18b3590-82ba-4a43-b751-b75538ab329c
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (22)
.changeset/otel-middleware.md
docs/advanced/otel.md
docs/config.json
knip.json
packages/typescript/ai/package.json
packages/typescript/ai/src/middlewares/index.ts
packages/typescript/ai/src/middlewares/otel.ts
packages/typescript/ai/tests/middlewares/fake-otel.ts
packages/typescript/ai/tests/middlewares/otel.test.ts
packages/typescript/ai/vite.config.ts
testing/e2e/package.json
testing/e2e/src/lib/otel-capture.ts
testing/e2e/src/routes/api.middleware-test.ts
testing/e2e/src/routes/middleware-test.tsx
testing/e2e/tests/middleware.spec.ts
testing/otel-verify/README.md
testing/otel-verify/docker-compose.yml
testing/otel-verify/package.json
testing/otel-verify/src/backends.ts
testing/otel-verify/src/index.ts
testing/otel-verify/src/scenarios.ts
testing/otel-verify/tsconfig.json
```ts
const redactContent = (text: string): string => {
  try {
    return redact(text)
  } catch (err) {
    console.warn('[otelMiddleware] otel.redact failed', err)
    return REDACTION_FAILED_SENTINEL
```
Avoid logging the thrown redactor error object.
If a redactor throws with the original prompt in its message or stack, console.warn(..., err) leaks exactly the content this fail-closed path is supposed to protect. Log a constant warning, or at most the error name, before returning the sentinel.
Suggested fix
```diff
 const redactContent = (text: string): string => {
   try {
     return redact(text)
   } catch (err) {
-    console.warn('[otelMiddleware] otel.redact failed', err)
+    console.warn(
+      '[otelMiddleware] otel.redact failed',
+      err instanceof Error ? err.name : 'unknown',
+    )
     return REDACTION_FAILED_SENTINEL
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/typescript/ai/src/middlewares/otel.ts` around lines 221 - 226, The
redactContent function currently logs the full thrown error object
(console.warn('[otelMiddleware] otel.redact failed', err)) which may leak
sensitive text; change this to avoid printing the error instance — e.g., log a
constant message or at most the error name only — then return
REDACTION_FAILED_SENTINEL; update the console.warn call in redactContent (the
otel.redact usage) to remove the error object from the log.
```ts
import type {
  AttributeValue,
  Attributes,
  Context,
  Histogram,
  Meter,
  MetricOptions,
  Span,
  SpanContext,
  SpanStatus,
  Tracer,
} from '@opentelemetry/api'
```
Reorder the OTel type import to satisfy import/order.
This import block is currently lint-red; move the @opentelemetry/api type import to the position eslint expects.
🧰 Tools
🪛 ESLint
[error] 11-22: @opentelemetry/api type import should occur after import of zod
(import/order)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/e2e/src/routes/api.middleware-test.ts` around lines 11 - 22, The
`@opentelemetry/api` type import block (types like AttributeValue, Attributes,
Context, Histogram, Meter, MetricOptions, Span, SpanContext, SpanStatus, Tracer)
is out of order and triggers import/order; relocate this entire import statement
so it sits in the third‑party/external imports group that eslint expects (i.e.,
alongside other node_modules imports) rather than its current position,
preserving the same named type list and maintaining it as a type-only import.
```ts
startSpan(name, options = {}, _ctx?: Context): Span {
  const id = `span-${spanSeq++}`
  const attrs: Record<string, AttributeValue> = {}
  for (const [k, v] of Object.entries(options.attributes ?? {})) {
    if (v !== undefined) attrs[k] = v as AttributeValue
  }
  recordOtelSpan(captureId, {
    id,
    name,
    kind: options.kind,
    attributes: attrs,
    status: SpanStatusCode.UNSET,
    events: [],
    exceptions: [],
    ended: false,
  })
  const status: SpanStatus = { code: SpanStatusCode.UNSET }
```
🧩 Analysis chain

🏁 Script executed:

```shell
# First, let's check the repository structure and locate the files
git ls-files | grep -E "(api.middleware-test|otel.ts)" | head -20
```

Repository: TanStack/ai
Length of output: 202

🏁 Script executed:

```shell
# Examine the test file in question
cat -n testing/e2e/src/routes/api.middleware-test.ts | sed -n '70,110p'
```

Repository: TanStack/ai
Length of output: 1787

🏁 Script executed:

```shell
# Look at the full startSpan implementation to understand context usage
cat -n testing/e2e/src/routes/api.middleware-test.ts | grep -A 30 "startSpan"
```

Repository: TanStack/ai
Length of output: 2286

🏁 Script executed:

```shell
# Check the otelMiddleware implementation at the referenced lines
cat -n packages/typescript/ai/src/middlewares/otel.ts | sed -n '330,370p'
```

Repository: TanStack/ai
Length of output: 1955

🏁 Script executed:

```shell
# Check recordOtelSpan function signature
cat -n testing/e2e/src/routes/api.middleware-test.ts | grep -B 5 -A 20 "function recordOtelSpan"
```

Repository: TanStack/ai
Length of output: 37

🏁 Script executed:

```shell
# Look at the beginning of the file for imports and types
cat -n testing/e2e/src/routes/api.middleware-test.ts | head -80
```

Repository: TanStack/ai
Length of output: 2869

🏁 Script executed:

```shell
# Find the otel-capture file
fd -name "otel-capture*" testing/
```

Repository: TanStack/ai
Length of output: 286

🏁 Script executed:

```shell
# Check the recordOtelSpan function signature and type
cat -n testing/e2e/src/lib/otel-capture.ts | head -100
```

Repository: TanStack/ai
Length of output: 3287
Capture parent span context to enable hierarchy testing.
The fake tracer accepts a parent Context parameter in startSpan() (line 81) but discards it with the underscore prefix. The recordOtelSpan() call has no way to store parent linkage since CapturedSpan lacks a parent span ID field. This means tests written to verify span nesting cannot detect regressions, even though otelMiddleware (otel.ts:362) explicitly passes parent context to enable proper hierarchy.
Add a parentSpanId?: string field to the CapturedSpan interface and extract it from the _ctx parameter when calling recordOtelSpan().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/e2e/src/routes/api.middleware-test.ts` around lines 81 - 97, The fake
tracer's startSpan currently ignores the passed parent Context (_ctx) and
recordOtelSpan/CapturedSpan have no parent field, so span hierarchy can't be
tested; add an optional parentSpanId?: string to the CapturedSpan interface,
extract the parent span ID from the incoming _ctx inside startSpan (use the
OpenTelemetry context/span helpers to get the active span and its
spanContext().spanId), and include that parentSpanId when calling recordOtelSpan
so tests can observe span nesting; update any consumers/mocks expecting
CapturedSpan accordingly.
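A minimal sketch of the extraction step this comment asks for, assuming `CapturedSpan` gains an optional `parentSpanId` field. `trace.getSpan()` and `spanContext()` are real `@opentelemetry/api` calls; `extractParentSpanId` is a hypothetical helper name.

```ts
import { trace } from '@opentelemetry/api'
import type { Context } from '@opentelemetry/api'

// Inside the fake tracer's startSpan: resolve the span already active in
// the incoming context (if any) and record its spanId as parent linkage.
function extractParentSpanId(ctx?: Context): string | undefined {
  if (!ctx) return undefined
  return trace.getSpan(ctx)?.spanContext().spanId
}
```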
```yaml
# Versions are pinned to specific tags; bump intentionally when you re-run
# the verification matrix so the screenshots in the docs match the version
# users will actually deploy.
```
Pinning policy in comments and image tags are inconsistent.
The file says versions are pinned, but several services use floating tags (16-alpine, 3, latest). That weakens reproducibility of your backend verification matrix.
Suggested update

```diff
-# Versions are pinned to specific tags; bump intentionally when you re-run
+# Images should use immutable tags/digests; bump intentionally when you re-run
 # the verification matrix so the screenshots in the docs match the version
 # users will actually deploy.

-    image: postgres:16-alpine
+    image: postgres:16.4-alpine
 ...
-    image: langfuse/langfuse:3
+    image: langfuse/langfuse:3.78.0
 ...
-    image: helicone/helicone:latest
+    image: helicone/helicone:2026.04.0
```

Also applies to: 46-47, 57-57, 77-77
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/otel-verify/docker-compose.yml` around lines 10 - 12, Comments claim
images are pinned but several services use floating tags (e.g., image tags
"16-alpine", "3", "latest"); change those floating tags in docker-compose.yml to
explicit, immutable tags (replace "16-alpine", "3", "latest" with concrete
versioned tags like "16.5-alpine", "3.12.8", "1.2.3" or whatever exact upstream
tag you tested), update the top comment to reflect the exact pinned-tag policy,
and ensure the same fixes are applied at the other occurrences called out in the
review (around the other image lines mentioned).
```ts
datadog: {
  name: 'Datadog',
  endpoint: () =>
    `https://trace.agent.${process.env.DD_SITE ?? 'datadoghq.com'}/api/v0.2/traces`,
  headers: () => ({
    'DD-API-KEY': envOrThrow('DD_API_KEY'),
  }),
```
🧩 Analysis chain
🌐 Web query:
For Datadog (as of 2026), what is the correct OTLP/HTTP traces ingestion endpoint and required headers for direct OpenTelemetry SDK export (no collector/agent), and does https://trace.agent.<DD_SITE>/api/v0.2/traces accept OTLP protobuf payloads?
💡 Result:
As of 2026, Datadog's direct OTLP/HTTP traces ingestion endpoint for OpenTelemetry SDK (agentless) export is in Preview and requires allowlisting via your Customer Success Manager. The endpoint is region-specific (e.g., https://otlp-trace.{DD_SITE}/v1/traces, where {DD_SITE} is datadoghq.com, us3.datadoghq.com, etc.; the exact value comes from the "otlp_trace_endpoint" region param in the docs).

Required headers:
- dd-api-key: your Datadog API key
- dd-otlp-source: organization-specific value provided by Datadog after allowlisting
- Optional: dd-otel-span-mapping (JSON config for span mapping, e.g., {"span_name_as_resource_name": false})

Protocol: HTTP/protobuf only (no JSON or gRPC). Example env vars:

OTEL_EXPORTER_OTLP_TRACES_PROTOCOL="http/protobuf"
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="https://otlp-trace.{DD_SITE}/v1/traces"
OTEL_EXPORTER_OTLP_TRACES_HEADERS="dd-api-key=${DD_API_KEY},dd-otlp-source=${YOUR_SITE}"

The legacy endpoint https://trace.agent.{DD_SITE}/api/v0.2/traces does NOT accept OTLP protobuf payloads; it is for Datadog's native trace format (v0.2/v0.3 msgpack), not OTLP. OTLP ingestion uses dedicated intake endpoints. For agent-mediated OTLP, use http(s)://agent:4318 (HTTP) or 4317 (gRPC), but the query specifies direct SDK export (no collector/agent).
Citations:
- 1: https://github.com/DataDog/documentation/blob/94fc8314/content/en/opentelemetry/setup/otlp_ingest/traces.md
- 2: https://docs.datadoghq.com/opentelemetry/setup/otlp_ingest/
- 3: https://docs.datadoghq.com/opentelemetry/setup/otlp_ingest/metrics/
- 4: https://docs.datadoghq.com/opentelemetry/setup/intake_endpoint/
Fix Datadog OTLP endpoint and headers for OTLPTraceExporter compatibility.
The current endpoint https://trace.agent.{DD_SITE}/api/v0.2/traces does not accept OTLP protobuf payloads—it is Datadog's legacy format (v0.2/v0.3 msgpack). For direct OpenTelemetry SDK export, use the OTLP-specific endpoint https://otlp-trace.{DD_SITE}/v1/traces with required headers dd-api-key and dd-otlp-source (the latter must be provided by Datadog after allowlisting). The current code is missing dd-otlp-source and will fail with OTLPTraceExporter.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/otel-verify/src/backends.ts` around lines 154 - 160, The Datadog
backend config (the datadog object: endpoint and headers functions) must use the
OTLP-compatible URL and headers; change endpoint() to use
"https://otlp-trace.{DD_SITE}/v1/traces" and update headers() to return
'dd-api-key' and 'dd-otlp-source' (use envOrThrow('DD_API_KEY') and
envOrThrow('DD_OTLP_SOURCE') to ensure both are provided). Ensure you keep using
process.env.DD_SITE fallback as before (e.g., 'datadoghq.com') when constructing
the new OTLP endpoint.
```ts
 * `SCENARIO=basic-text,error` to skip the tool round-trip while smoke-testing.
 */

import { trace, metrics } from '@opentelemetry/api'
```
Fix the named import order to keep eslint green.
Line 16 already trips sort-imports; reorder the specifiers so this file doesn't stay lint-red.
🧰 Tools
🪛 ESLint
[error] 16-16: Member 'metrics' of the import declaration should be sorted alphabetically.
(sort-imports)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/otel-verify/src/index.ts` at line 16, The import specifiers for the
OpenTelemetry API are out of order and trigger eslint's sort-imports rule;
reorder the named imports in the import statement so the specifiers are
alphabetically sorted (e.g., place 'metrics' before 'trace') in the import from
'@opentelemetry/api' to satisfy linting.
```ts
const sdk = new NodeSDK({
  resource: new Resource({
    [ATTR_SERVICE_NAME]: SERVICE_NAME,
    [ATTR_SERVICE_VERSION]: SERVICE_VERSION,
    // Resource-level marker so backends with multi-service UIs (Datadog,
    // Sentry) bucket these traces under a recognisable name.
    'deployment.environment': 'verify',
  }),
  traceExporter: new OTLPTraceExporter({
    url: backend.endpoint,
    headers: backend.headers,
  }),
})

sdk.start()

const stopAimock = await startAimock()
const tracer = trace.getTracer(TRACER_NAME)
// Backends like Phoenix/Jaeger don't ingest metrics over OTLP/HTTP traces
// endpoint — pass the no-op meter so the middleware stays happy without
// doubling the network surface area. The middleware's own histograms remain
// exercised, just not exported to the backend.
const meter = metrics.getMeter(TRACER_NAME)
```
🧩 Analysis chain

🏁 Script executed:

```shell
# First, let's examine the full context of the file to understand the SDK setup
cat -n testing/otel-verify/src/index.ts
```

Repository: TanStack/ai
Length of output: 5252

🏁 Script executed:

```shell
# Check if there's a README describing the tool's scope
find . -name "README*" -type f | head -20
```

Repository: TanStack/ai
Length of output: 808

🏁 Script executed:

```shell
# Search for metrics exporter or metrics reader configuration
rg -i "metricsreader|metricexporter|metrics.*exporter" testing/otel-verify/
```

Repository: TanStack/ai
Length of output: 37

🏁 Script executed:

```shell
# Look for README in the otel-verify directory
find testing/otel-verify -name "README*" -type f
```

Repository: TanStack/ai
Length of output: 85

🏁 Script executed:

```shell
# Check the full testing/otel-verify directory structure
ls -la testing/otel-verify/
```

Repository: TanStack/ai
Length of output: 472

🏁 Script executed:

```shell
# Search for any metrics exporter imports or configs in the entire testing/otel-verify directory
rg -i "metricreader|metricexporter" testing/otel-verify/ --type ts
```

Repository: TanStack/ai
Length of output: 37

🏁 Script executed:

```shell
# Check NodeSDK constructor usage - look at the instantiation and see all parameters
ast-grep --pattern 'new NodeSDK($_)'
```

Repository: TanStack/ai
Length of output: 975

🏁 Script executed:

```shell
# Read the otel-verify README to understand stated scope
cat -n testing/otel-verify/README.md
```

Repository: TanStack/ai
Length of output: 6330

🏁 Script executed:

```shell
# Check the scenarios to understand what metrics are being used
cat -n testing/otel-verify/src/scenarios.ts | head -100
```

Repository: TanStack/ai
Length of output: 3846
Update README to clarify metrics scope, or add metrics exporter.
The README states the harness "exports spans + metrics over OTLP/HTTP," but the SDK config wires only a trace exporter. The no-op meter passed to middleware is intentional per the code comment (lines 83–86: "Backends like Phoenix/Jaeger don't ingest metrics over OTLP/HTTP traces endpoint"). However, the README verification checklist (lines 64–73) makes no mention of metrics, and the final output says "traces sent" — this misleads users about the harness's actual capability. Either remove "metrics" from the README overview and adjust expectations, or add a real OTLP metrics pipeline (e.g., separate metrics exporter to a metrics-capable endpoint).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/otel-verify/src/index.ts` around lines 65 - 87, The README claims the
harness "exports spans + metrics over OTLP/HTTP" but the runtime only wires a
trace exporter (NodeSDK with OTLPTraceExporter) and uses a no-op meter
(metrics.getMeter) for middleware; either update documentation to remove/clarify
metrics export from the overview and verification checklist and change the final
output text to "traces sent" only, or implement a proper metrics pipeline by
creating and wiring a metrics exporter (e.g., OTLP metrics exporter /
MeterProvider) and configuring the SDK (or a separate MeterProvider) so the
middleware receives a real meter instead of the no-op meter; locate NodeSDK,
OTLPTraceExporter, meter (metrics.getMeter) and the README verification
checklist to apply the chosen fix.
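If the "real metrics pipeline" option is chosen, one plausible wiring is sketched below. The exporter packages are the standard OTel JS ones; the `metricReader` option name and the `/v1/metrics` path substitution are assumptions that depend on the SDK version and on whether the selected backend ingests metrics at all.

```ts
// Sketch only; not the harness's current code.
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-proto'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-proto'
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics'
import { NodeSDK } from '@opentelemetry/sdk-node'

// Assumed to come from backends.ts; shape simplified for the sketch.
declare const backend: { endpoint: string; headers: Record<string, string> }

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: backend.endpoint,
    headers: backend.headers,
  }),
  // `metricReader` is the NodeSDK option in recent SDK releases; other
  // versions may spell it differently (e.g. `metricReaders`).
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      // Many OTLP/HTTP backends serve /v1/metrics next to /v1/traces, but
      // several presets here (Jaeger, Phoenix) do not ingest metrics at all.
      url: backend.endpoint.replace('/v1/traces', '/v1/metrics'),
      headers: backend.headers,
    }),
  }),
})
```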
```ts
for (const scenario of scenarios) {
  process.stdout.write(`▶ ${scenario.id}: ${scenario.label}... `)
  const t0 = Date.now()
  try {
    await scenario.run(tracer, meter)
    console.log(`ok (${Date.now() - t0}ms)`)
  } catch (err) {
    // Scenario `error` is expected to throw; others shouldn't. Surface
    // the failure but don't abort the run — partial trace data is still
    // useful for diagnosing other backends.
    console.log(`failed: ${(err as Error).message}`)
    if (scenario.id !== 'error') exitCode = 1
  }
```
Fail the run when the expected-error scenario unexpectedly succeeds.
If scenario.id === 'error' stops throwing, this loop still prints ok and exits 0. That turns the error-path smoke check into a false pass.
Suggested fix

```diff
   try {
     await scenario.run(tracer, meter)
-    console.log(`ok (${Date.now() - t0}ms)`)
+    if (scenario.id === 'error') {
+      console.log(`failed: expected ${scenario.id} to throw`)
+      exitCode = 1
+    } else {
+      console.log(`ok (${Date.now() - t0}ms)`)
+    }
   } catch (err) {
-    // Scenario `error` is expected to throw; others shouldn't. Surface
-    // the failure but don't abort the run — partial trace data is still
-    // useful for diagnosing other backends.
-    console.log(`failed: ${(err as Error).message}`)
-    if (scenario.id !== 'error') exitCode = 1
+    if (scenario.id === 'error') {
+      console.log(`ok (${Date.now() - t0}ms, expected failure)`)
+    } else {
+      console.log(`failed: ${(err as Error).message}`)
+      exitCode = 1
+    }
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/otel-verify/src/index.ts` around lines 91 - 103, The loop currently
treats every successful run as "ok" even for the special scenario with id ===
'error' which is expected to throw; update the logic in the for-loop over
scenarios (the code that calls scenario.run(tracer, meter) and writes "ok") so
that when scenario.id === 'error' and the call does not throw you mark it as a
failure: print a clear failure message and set exitCode = 1 (same as in the
catch block), otherwise keep printing "ok" for true successes; adjust only the
success path around scenario.run to check scenario.id and handle the
expected-error case.
```ts
try {
  const stream = chat({
    adapter: makeAdapter(),
    messages: [{ role: 'user', content: '[error] run test' }],
    middleware: [makeOtelMiddleware(tracer, meter), explode],
    agentLoopStrategy: maxIterations(1),
  })
  await drain(stream)
} catch {
  // The whole point of this scenario is to land an error span — swallow
  // the rethrow so the harness continues to the next scenario.
}
```
Don't swallow unexpected failures in the error scenario.
This catch {} turns setup/transport/provider failures into a "passing" error run, so the harness can look healthy even when it never hit the intended synthetic error. Catch only the expected synthetic verify-otel error and rethrow anything else.
Suggested fix

```diff
-  } catch {
+  } catch (error) {
+    if (
+      !(error instanceof Error) ||
+      error.message !== 'synthetic verify-otel error'
+    ) {
+      throw error
+    }
     // The whole point of this scenario is to land an error span — swallow
     // the rethrow so the harness continues to the next scenario.
   }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
try {
  const stream = chat({
    adapter: makeAdapter(),
    messages: [{ role: 'user', content: '[error] run test' }],
    middleware: [makeOtelMiddleware(tracer, meter), explode],
    agentLoopStrategy: maxIterations(1),
  })
  await drain(stream)
} catch (error) {
  if (
    !(error instanceof Error) ||
    error.message !== 'synthetic verify-otel error'
  ) {
    throw error
  }
  // The whole point of this scenario is to land an error span — swallow
  // the rethrow so the harness continues to the next scenario.
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/otel-verify/src/scenarios.ts` around lines 131 - 142, The empty catch
swallows all failures; change it to only swallow the expected synthetic error by
catching the thrown error object (from the chat() stream/drain flow using
makeAdapter(), makeOtelMiddleware(tracer, meter), explode) and if its message
(or a distinguishing property) does not include or equal "synthetic verify-otel
error" rethrow it; otherwise return/ignore to allow the harness to continue.
Ensure this check is applied in the catch block that surrounds the
chat(...)/await drain(stream) sequence so transport/setup/provider exceptions
are propagated.
Phoenix's OTLP/HTTP endpoint rejects application/json with HTTP 415
("Unsupported content type"). Switch to
@opentelemetry/exporter-trace-otlp-proto so the harness sends
application/x-protobuf, which is the canonical OTLP-over-HTTP encoding
and is accepted by every backend preset in backends.ts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
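The swap the commit describes is essentially a one-import change; a sketch follows, where the URL is illustrative rather than taken from this PR.

```ts
// The proto exporter sends application/x-protobuf, which Phoenix accepts.
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-proto'

const exporter = new OTLPTraceExporter({
  // Illustrative endpoint only; the harness builds the real URL from the
  // OTEL_BACKEND preset in backends.ts.
  url: 'http://localhost:6006/v1/traces',
})
```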
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@testing/otel-verify/src/index.ts`:
- Around line 82-84: The call to sdk.start() can leak the BatchSpanProcessor if
startAimock() throws because startAimock() is awaited outside the try/finally
that calls sdk.shutdown(); fix this by ensuring startAimock() is awaited inside
the same try that guarantees sdk.shutdown() runs (i.e., call sdk.start(), then
enter try { await startAimock(); ... } finally { await sdk.shutdown(); }) so
sdk.shutdown() always executes even on startAimock() rejection; adjust
surrounding control flow where sdk.start(), startAimock(), and sdk.shutdown()
are referenced to preserve existing behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f23b56f8-bbf4-43c4-873d-f5b6bdd92ce2
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (2)
testing/otel-verify/package.json
testing/otel-verify/src/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- testing/otel-verify/package.json
```ts
sdk.start()

const stopAimock = await startAimock()
```
sdk.shutdown() is skipped if startAimock() throws.
sdk.start() is called at line 82, but startAimock() is awaited at line 84 — outside the try/finally that calls sdk.shutdown(). If startAimock() rejects, the SDK's BatchSpanProcessor is abandoned without flushing (no spans were recorded yet, but the exporter's connection/timer resources are still leaked and process.exit may race the internal batch timer).
🛠️ Proposed fix

```diff
 sdk.start()
-const stopAimock = await startAimock()
 const tracer = trace.getTracer(TRACER_NAME)
 const meter = metrics.getMeter(TRACER_NAME)
+let stopAimock: Awaited<ReturnType<typeof startAimock>> | undefined
 let exitCode = 0
 try {
+  stopAimock = await startAimock()
   for (const scenario of scenarios) {
     process.stdout.write(`▶ ${scenario.id}: ${scenario.label}... `)
     const t0 = Date.now()
     try {
       await scenario.run(tracer, meter)
       console.log(`ok (${Date.now() - t0}ms)`)
     } catch (err) {
       console.log(`failed: ${(err as Error).message}`)
       if (scenario.id !== 'error') exitCode = 1
     }
   }
 } finally {
-  await stopAimock()
+  await stopAimock?.()
   await sdk.shutdown()
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@testing/otel-verify/src/index.ts` around lines 82 - 84, The call to
sdk.start() can leak the BatchSpanProcessor if startAimock() throws because
startAimock() is awaited outside the try/finally that calls sdk.shutdown(); fix
this by ensuring startAimock() is awaited inside the same try that guarantees
sdk.shutdown() runs (i.e., call sdk.start(), then enter try { await
startAimock(); ... } finally { await sdk.shutdown(); }) so sdk.shutdown() always
executes even on startAimock() rejection; adjust surrounding control flow where
sdk.start(), startAimock(), and sdk.shutdown() are referenced to preserve
existing behavior.
Adds langfuse.observation.input/output to iteration and tool spans, plus langfuse.trace.input/output on the root span so the Langfuse trace card and chat-level observation populate Input/Output panels. The verify harness now also accepts LANGFUSE_BASE_URL (matches the Langfuse JS SDK env var) and the langfuse-cloud preset's notes call out region mismatch as the typical cause of a 401 on this endpoint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
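A sketch of what those attribute writes plausibly look like inside the middleware. The attribute keys come from the commit message; the span and payload variable names are hypothetical.

```ts
import type { Span } from '@opentelemetry/api'

// Hypothetical names; the real call sites live in otel.ts.
declare const rootSpan: Span
declare const iterationSpan: Span
declare const inputMessages: Array<{ role: string; content: string }>
declare const assistantText: string

// Trace-level panels on the Langfuse trace card.
rootSpan.setAttribute('langfuse.trace.input', JSON.stringify(inputMessages))
rootSpan.setAttribute('langfuse.trace.output', assistantText)

// Observation-level panels on iteration and tool spans.
iterationSpan.setAttribute(
  'langfuse.observation.input',
  JSON.stringify(inputMessages),
)
iterationSpan.setAttribute('langfuse.observation.output', assistantText)
```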
♻️ Duplicate comments (1)
testing/otel-verify/src/backends.ts (1)
167-173: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Datadog still points at the legacy traces intake path.

This OTLP/HTTP exporter needs the Datadog OTLP endpoint and required OTLP headers; the current `trace.agent.*/api/v0.2/traces` URL will not accept the payloads this harness emits. This is still the same integration bug previously flagged.

🔧 Suggested fix

```diff
 datadog: {
   name: 'Datadog',
   endpoint: () =>
-    `https://trace.agent.${process.env.DD_SITE ?? 'datadoghq.com'}/api/v0.2/traces`,
+    `https://otlp-trace.${process.env.DD_SITE ?? 'datadoghq.com'}/v1/traces`,
   headers: () => ({
-    'DD-API-KEY': envOrThrow('DD_API_KEY'),
+    'dd-api-key': envOrThrow('DD_API_KEY'),
+    'dd-otlp-source': envOrThrow('DD_OTLP_SOURCE'),
   }),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@testing/otel-verify/src/backends.ts` around lines 167 - 173: the datadog backend currently points at the legacy trace intake (the "trace.agent.../api/v0.2/traces" URL), which won't accept OTLP/HTTP payloads; update the datadog object by changing the endpoint() to Datadog's OTLP HTTP ingest path (use the OTLP base host for your DD_SITE, e.g. the OTLP domain with a /v1/traces or OTLP v0/v1 ingestion path as required) and adjust headers() to include the OTLP-required headers (keep envOrThrow('DD_API_KEY') for the API key but ensure it's sent as the Datadog API key header expected by OTLP ingestion and add the appropriate Content-Type like application/json or the OTLP content type your exporter uses); modify the endpoint() and headers() functions in the datadog object to reflect these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@testing/otel-verify/src/backends.ts`:
- Around line 167-173: The datadog backend currently points at the legacy trace
intake (the datadog "trace.agent.../api/v0.2/traces") which won't accept
OTLP/HTTP payloads; update the datadog object by changing the endpoint() to
Datadog's OTLP HTTP ingest path (use the OTLP base host for your DD_SITE, e.g.
the OTLP domain with a /v1/traces or OTLP v0/v1 ingestion path as required) and
adjust headers() to include the OTLP-required headers (keep
envOrThrow('DD_API_KEY') for the API key but ensure it's sent as the Datadog API
key header expected by OTLP ingestion and add the appropriate Content-Type like
application/json or the OTLP content type your exporter uses); modify the
endpoint() and headers() functions in the datadog object to reflect these
changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fd672296-bda9-4934-ad6f-d01cbfbf7e05
📒 Files selected for processing (2)
packages/typescript/ai/src/middlewares/otel.ts
testing/otel-verify/src/backends.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/typescript/ai/src/middlewares/otel.ts
Resolves sherif multiple-dependency-versions error introduced when otel-verify was added with ^4.20.6 against a workspace standardized on ^4.21.0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `testing/otel-verify` package: a manual smoke harness that boots in-process aimock and runs three chat scenarios (basic-text, with-tool, error) through `otelMiddleware`
- Backend is selected by the `OTEL_BACKEND` env var
- `docker-compose.yml` ships local services for the self-hostable backends; SaaS backends use env vars (API keys / DSN)
- `private: true` — never published, not part of CI

Test plan
- `pnpm build:all` succeeds (Nx topo order)
- `pnpm --filter @tanstack/ai-otel-verify exec tsc --noEmit` clean
- `OTEL_BACKEND=jaeger pnpm verify` against `docker compose up jaeger` → three scenarios produce expected spans in the Jaeger UI

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests
Chores