OpenTelemetry: HTTP server request metrics#757
OpenTelemetry: HTTP server request metrics#757dahlia wants to merge 5 commits intofedify-dev:mainfrom
Conversation
Sampled traces alone cannot reliably answer aggregate operational
questions like request rate, p95 latency, or status-code error
rate. Add two always-on instruments around Federation.fetch():
- fedify.http.server.request.count (Counter, {request})
- fedify.http.server.request.duration (Histogram, ms)
Both carry low-cardinality attributes: http.request.method,
fedify.endpoint, optional http.response.status_code, and optional
fedify.route.template. Raw URL paths, identifier values, query
strings, and full inbox URLs are deliberately excluded so that
operators can rely on the metrics even when traces are sampled.
The fedify.endpoint attribute is drawn from a fixed enumeration
covering the routes Fedify dispatches (webfinger, nodeinfo, actor,
inbox, shared_inbox, outbox, object, the built-in collections,
generic collection for user-defined dispatchers, plus
not_found, not_acceptable, and error). When a handler throws after
the route was already classified, the metric retains the matched
endpoint; error is reserved for the pre-classification failure
mode so fault-attribution stays per endpoint.
http.request.method is normalized to the OpenTelemetry-standard set
(CONNECT, DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT, TRACE) and
falls back to _OTHER for any other value, so an arbitrary client
cannot inflate metric cardinality with custom methods.
The metric is implemented on the existing FederationMetrics class
in metrics.ts and wired into the existing fetch()/#fetch() pair
via a small mutable HttpMetricState carrier. Tests cover the
acceptance criteria from the issue (success, 404, 406, thrown
error) plus shared-inbox routing, route-template emission, the
endpoint enum extensions, the global-MeterProvider fallback, and
the method-normalization regression.
fedify-dev#316
fedify-dev#736
Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
|
@codex review |
|
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR adds OpenTelemetry HTTP server request metrics to Fedify's Federation.fetch(). It introduces a counter and histogram, records method/endpoint (with normalized method), optional status code and route template, classifies endpoints from routes, and updates tests, docs, and the changelog. ChangesHTTP Server Request Metrics for Federation.fetch()
Sequence Diagram(s)sequenceDiagram
participant Client
participant fetch as Federation.fetch()
participant router as `#fetch`()
participant metrics as FederationMetrics.recordHttpServerRequest
Client->>fetch: HTTP request
fetch->>fetch: initialize metricState, capture start time
fetch->>router: route request (mutates metricState)
router-->>fetch: returns response or throws
fetch->>metrics: recordHttpServerRequest(metricState, status, duration)
metrics-->>fetch: metrics emitted
fetch-->>Client: response or error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces OpenTelemetry HTTP server metrics for inbound requests handled by Federation.fetch(), adding fedify.http.server.request.count and fedify.http.server.request.duration instruments. These metrics track request volume and latency with attributes for normalized HTTP methods, endpoint categories, status codes, and URI route templates to ensure bounded cardinality. The changes include core metrics implementation, middleware integration, updated documentation, and new test cases. I have no feedback to provide as no review comments were submitted.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b46c62741b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/fedify/src/federation/metrics.ts`:
- Around line 60-66: The histogram created by meter.createHistogram for
instrument httpServerRequestDuration needs explicit bucket boundaries so P95 can
be accurately computed; update the call that creates
"fedify.http.server.request.duration" (the meter.createHistogram invocation that
assigns httpServerRequestDuration) to include an options.advice object with
explicitBucketBoundaries tuned to millisecond latencies (e.g. a sequence
covering low ms to high ms values around expected P95 — e.g.
[1,2,5,10,20,50,100,200,500,1000,2000]) so the SDK aggregates correctly; keep
description and unit as-is and only add the advice.explicitBucketBoundaries
array.
In `@packages/fedify/src/federation/middleware.ts`:
- Around line 1765-1768: Change HttpMetricState.endpoint from an unconstrained
string to a string-literal union that matches the bounded set of values produced
by getEndpointCategory (e.g. include "error", "not_found", "not_acceptable" and
whatever other categories getEndpointCategory returns), and update
getEndpointCategory's return type to a shared FedifyEndpoint union so both the
function and the interface use the same literal set; this ensures
HttpMetricState.endpoint is typed to the exact allowed values and will catch
mismatches at compile time.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9b203e40-737a-4842-a31b-ff72e00ce77a
📒 Files selected for processing (5)
CHANGES.mddocs/manual/opentelemetry.mdpackages/fedify/src/federation/metrics.tspackages/fedify/src/federation/middleware.test.tspackages/fedify/src/federation/middleware.ts
Codecov Report❌ Patch coverage is
... and 2 files with indirect coverage changes 🚀 New features to boost your workflow:
|
OpenTelemetry's HTTP semantic conventions list QUERY as part of the default known-method set for http.request.method. Without this, valid QUERY requests collapse into the _OTHER bucket and become indistinguishable from truly custom methods, skewing method-level request counts and latency. Add QUERY to KNOWN_HTTP_METHODS, document it in the normalized- method enumeration, and add a regression test asserting that a QUERY request is recorded with http.request.method=QUERY. fedify-dev#757 (comment) Assisted-by: Claude Code:claude-opus-4-7 Assisted-by: Codex:gpt-5.5
Without advice.explicitBucketBoundaries, the SDK falls back to its default boundary set, which doesn't align with the latency distribution Fedify expects from inbound HTTP requests and makes P95 less precise than it could be. Mirror the OpenTelemetry HTTP server semantic-conventions recommended buckets, converted from seconds to milliseconds to match the histogram's unit: 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000 The activitypub.delivery.duration and activitypub.inbox.processing_duration histograms are intentionally left untouched: they have different latency profiles from inbound HTTP requests and are out of scope for this PR. fedify-dev#757 (comment) Assisted-by: Claude Code:claude-opus-4-7 Assisted-by: Codex:gpt-5.5
HttpMetricState.endpoint and getEndpointCategory() previously used the unconstrained string type, so the type system gave no guidance when the metric attribute enum drifted from the documented set. Introduce a FedifyEndpoint string-literal union covering the 16 documented values (webfinger, nodeinfo, actor, inbox, shared_inbox, outbox, object, following, followers, liked, featured, featured_tags, collection, not_found, not_acceptable, error) and use it for both the metric-state field and the helper's return type. Future drift between the documented enum and the recorded attribute will now surface at compile time. fedify-dev#757 (comment) Assisted-by: Claude Code:claude-opus-4-7 Assisted-by: Codex:gpt-5.5
|
@codex review |
|
/gemini review |
|
Codex Review: Didn't find any major issues. 👍 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
Code Review
This pull request introduces OpenTelemetry HTTP server metrics for inbound requests handled by Federation.fetch(), specifically adding fedify.http.server.request.count and fedify.http.server.request.duration. These metrics track request methods, endpoint categories, status codes, and route templates while maintaining bounded cardinality through method normalization and fixed endpoint enumerations. The changes include the core implementation in the metrics and middleware modules, extensive test coverage, and updated documentation in the manual and changelog. I have no feedback to provide as no review comments were submitted.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/fedify/src/federation/middleware.test.ts`:
- Around line 1459-1496: Add a user-defined collection dispatcher in
createTestContext to exercise the collection collapse behavior and assert its
metadata maps to the stable "collection" endpoint: call
federation.setCollectionDispatcher("/users/{identifier}/likes", () => ({ items:
[] })) (or the project's collection-registration API) inside createTestContext
alongside the existing setActorDispatcher, setNodeInfoDispatcher,
setFollowersDispatcher and setInboxListeners, then add a test step that invokes
the dispatcher and verifies the resulting fedify.endpoint value equals
"collection".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 251ff727-d326-4879-936c-f02268288d36
📒 Files selected for processing (4)
docs/manual/opentelemetry.mdpackages/fedify/src/federation/metrics.tspackages/fedify/src/federation/middleware.test.tspackages/fedify/src/federation/middleware.ts
The PR documents that user-defined collection dispatchers
(setCollectionDispatcher) collapse to fedify.endpoint=collection
regardless of the dispatcher's name, but createTestContext() never
registered one, so a regression that emitted the raw dispatcher
name instead of "collection" would slip through.
Register a custom collection dispatcher at
/users/{identifier}/custom/{id} in createTestContext() and add a
test step asserting that a GET to that path records
fedify.endpoint=collection and the parameterized
fedify.route.template.
fedify-dev#757 (comment)
Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
|
@codex review |
|
/gemini review |
|
Codex Review: Didn't find any major issues. 👍 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
Code Review
This pull request introduces OpenTelemetry HTTP server metrics for inbound requests handled by Federation.fetch(). It adds fedify.http.server.request.count and fedify.http.server.request.duration metrics, updates the documentation, and includes comprehensive tests to verify the metric recording across various request scenarios. I have no feedback to provide.
This adds OpenTelemetry metrics for inbound HTTP requests handled by
Federation.fetch(), giving operators request counts and latency histograms that still work when tracing is sampled.Closes #736.
Changes
Added two metric instruments in @fedify/fedify:
fedify.http.server.request.count(Counter,{request})fedify.http.server.request.duration(Histogram,ms)Both instruments use bounded attributes:
http.request.method,fedify.endpoint, optionalhttp.response.status_code, and optionalfedify.route.template. Raw URL paths, identifier values, query strings, and full inbox URLs are deliberately excluded.fedify.endpointis a bounded enum:webfinger,nodeinfo,actor,inbox,shared_inbox,outbox,object,following,followers,liked,featured,featured_tags,collection,not_found,not_acceptable, anderror. User-defined collection dispatchers (collection:*,orderedCollection:*) collapse tocollection, since their names are application-defined.When a handler throws after route classification, the metric keeps the matched endpoint, such as
actor.erroris reserved for the case where classification itself fails, so per-endpoint fault attribution stays meaningful.http.request.methodis normalized to the OpenTelemetry-standard set (CONNECT,DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT,TRACE) with_OTHERfor anything else, so an arbitrary client cannot inflate metric cardinality with custom methods.Documented the new instruments, attributes, and endpoint enum in docs/manual/opentelemetry.md, and added
fedify.endpointandfedify.route.templateto the semantic-attribute reference table.Updated CHANGES.md.
Notes
The instruments are Fedify-prefixed (
fedify.http.server.*) rather than reusing the OpenTelemetry HTTP server semantic-convention name (http.server.request.*). Many apps already emithttp.server.*metrics for their own traffic. Reusing that name would mix Fedify endpoint labels with application routing labels on the same instrument.Issue #736 listed
actor_keyas a candidate endpoint value, but Fedify dispatches key pairs through callbacks rather than URL routes, so there is no separateactor_keyendpoint to classify.Verification
mise check && mise test:deno && mise test:nodemise check:md && mise docs:build