Conversation
There was a problem hiding this comment.
LGTM — clean, well-tested telemetry enrichment; the duplicate constant nit is already noted inline.
Extended reasoning...
Overview
This PR enriches MCP OpenTelemetry spans with standardized RPC attributes (rpc.system, rpc.method, rpc.service, mcp.resource.uri, mcp.session.id). Changes span six files: a new shared _otel.py helper, updates to session.py and server/lowlevel/server.py to use those helpers, a thin _SessionAwareWriteStream wrapper in the client transport, and two test files covering the new attributes.
Security risks
None. This is purely observability plumbing — no auth, crypto, or permission logic is touched. The session ID is read from existing HTTP request headers and placed into OTel span attributes, which is standard practice.
Level of scrutiny
Low-to-medium. The changes are additive and do not alter any existing request-handling, error-handling, or state-management logic. The refactor in _handle_request (moving metadata extraction before the span context manager) is a straightforward reordering with no behavioral difference. The _SessionAwareWriteStream wrapper correctly delegates all protocol methods to the inner stream.
Other factors
Tests cover the new rpc.* and mcp.resource.uri attributes for both in-process and streamable-HTTP transports. The one nit — a duplicate MCP_SESSION_ID_HEADER constant in server/lowlevel/server.py when the identical value already exists in streamable_http.py — is a minor maintenance concern flagged as an inline comment, not a functional issue.
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🟡
src/mcp/shared/_otel.py:58-84— Forcompletion/completerequests where the completion references a resource template,mcp.resource.uriis never set because the URI is nested atparams.ref.uri, notparams.uri. Bothbuild_server_span_attributes(usinggetattr(params, 'uri', None)) andbuild_client_span_attributes(usingparams.get('uri')) silently returnNoneforCompleteRequestParams, leaving resource completion spans without URI enrichment. The fix is to add a fallback: on the server side,getattr(getattr(params, 'ref', None), 'uri', None); on the client side, checkparams.get('ref', {}).get('uri')when the top-levelurikey is absent.Extended reasoning...
What the bug is and how it manifests
The PR adds
mcp.resource.urienrichment to OTel spans for MCP requests that deal with resources. This works correctly forresources/read,resources/subscribe, andresources/unsubscribebecause those params types (ReadResourceRequestParams,SubscribeRequestParams,UnsubscribeRequestParams) all expose a top-levelurifield. However, thecompletion/completemethod also references a resource URI when the completion target is a resource template — and that case is silently missed.The specific code path that triggers it
In
build_server_span_attributes(src/mcp/shared/_otel.py, line 80):resource_uri = getattr(params, 'uri', None)
For a
completion/completerequest,paramsis aCompleteRequestParamsobject. Itsreffield is typedResourceTemplateReference | PromptReference. When the caller is completing a resource URI template,refis aResourceTemplateReferencewhoseuri: strfield holds the template URI.CompleteRequestParamshas no top-leveluriattribute, sogetattr(params, 'uri', None)returnsNoneandmcp.resource.uriis never added to the span attributes.On the client side (
src/mcp/shared/_otel.py, line 55),build_client_span_attributesreceives the serialized params dict and does:if params is not None and (resource_uri := params.get('uri')) is not None:
The serialized
CompleteRequestParamsdict contains{'ref': {'uri': '...', 'type': 'ref/resource'}, 'argument': {...}}. There is no top-level'uri'key, soparams.get('uri')also returnsNone.Why existing code doesn't prevent it
Python's
getattranddict.getsilently returnNonefor missing attributes/keys. There are no type guards checking the method name before the URI lookup, no test forcompletion/completespans, and the PR description frames this as "where available" — giving the impression the omission is intentional, when in fact the URI is available, just one level deeper.What the impact would be
Any observability pipeline that relies on
mcp.resource.urito understand which resource template a completion request is targeting will find that attribute absent for allcompletion/completespans that reference resource templates. This makes it impossible to correlate completion activity with resource usage via spans alone.How to fix it
Server side — add a fallback after the existing lookup:
resource_uri = getattr(params, 'uri', None) if resource_uri is None: ref = getattr(params, 'ref', None) resource_uri = getattr(ref, 'uri', None)
Client side — fall back to the nested
ref.uriin the serialized dict:resource_uri = params.get('uri') if resource_uri is None: ref = params.get('ref') or {} resource_uri = ref.get('uri')
Step-by-step proof
- A client calls
client.complete(ResourceTemplateReference(uri='file:///src/{name}.py'), ...). send_requestserializes params to{'ref': {'type': 'ref/resource', 'uri': 'file:///src/{name}.py'}, 'argument': {...}}.build_client_span_attributesreceives this dict asparamsand evaluatesparams.get('uri')→None.mcp.resource.uriis not set on the client span.- The server deserializes the message into a
CompleteRequestParamsobject where.refis aResourceTemplateReference(uri='file:///src/{name}.py')and there is no.uriattribute on the top-level object. build_server_span_attributescallsgetattr(params, 'uri', None)→None.mcp.resource.uriis not set on the server span either.- Both spans are emitted with no
mcp.resource.uriattribute despite the URI being present in the request.
- A client calls
-
🟡
src/mcp/shared/_otel.py:38-55— Client spans produced bybuild_client_span_attributesomitrpc.service, while the server-side counterpart always includes it; per OpenTelemetry RPC semantic conventionsrpc.serviceis Recommended on client spans to identify the remote service. Without it, client spans cannot be filtered or grouped by target service in observability tooling. The fix is straightforward: add an optionalservice_name: str | None = Noneparameter tobuild_client_span_attributesand emitrpc.servicewhen provided;ClientSession(which has access toinitialize_result.server_info.namepost-handshake) can then pass the value in.Extended reasoning...
What the bug is and how it manifests
build_client_span_attributesinsrc/mcp/shared/_otel.py(lines 38–55) builds the attribute dict for every client request span. It setsrpc.system,rpc.method,mcp.method.name, andjsonrpc.request.id, but has noservice_nameparameter and therefore never emitsrpc.service. In contrast,build_server_span_attributes(lines 57–84) always includes"rpc.service": service_name. The OTel RPC semantic conventions (https://opentelemetry.io/docs/specs/semconv/rpc/rpc-spans/) markrpc.serviceas Recommended for both client and server spans because it identifies the remote service being called.The specific code path that triggers it
BaseSession.send_requestinsrc/mcp/shared/session.py(lines 276–282) callsbuild_client_span_attributes(method=..., request_id=..., params=...). There is noservice_nameargument available at that layer becauseBaseSessionis transport-agnostic and does not hold server identity state. As a result, every MCP client span is emitted withoutrpc.service.Why existing code doesn't prevent it
The refuting verifier correctly identifies the architectural constraint:
BaseSessionhas no access toserver_info.name, which only becomes available inClientSession._initialize_result.server_info.nameafter theinitializehandshake. This is a real constraint, but it does not make the omission intentional in a final-design sense — it is an addressable gap. An optionalservice_name: str | None = Noneparameter added tobuild_client_span_attributeswould be fully backward-compatible;BaseSessioncontinues to pass nothing, whileClientSessioncan overridesend_requestor supply the value through another mechanism after initialization.Impact
The functional behavior of the SDK is unaffected — spans are created and trace context propagates correctly. The gap is purely in observability quality: client spans cannot be filtered, grouped, or aggregated by remote service name in any APM or log-analytics tool that relies on
rpc.service. Given that the PR's stated purpose is to add richer OTel attributes, this asymmetry is a meaningful gap relative to its own goals.How to fix it
- Add
service_name: str | None = Nonetobuild_client_span_attributes. - Inside the function, emit
attributes["rpc.service"] = service_namewhenservice_name is not None. - In
ClientSession, afterinitializecompletes, storeserver_info.nameand pass it when callingbuild_client_span_attributes(e.g., by overridingsend_requestor injecting it via a stored instance variable).
Step-by-step proof
- Open
src/mcp/shared/_otel.py:build_client_span_attributes(line 38) has signature(*, method, request_id, params)— noservice_name. - Open
src/mcp/shared/session.py, line 279:build_client_span_attributes(method=request.method, request_id=request_id, params=request_data.get("params"))—rpc.serviceis therefore absent from every emitted client span. - Open
src/mcp/shared/_otel.py, line 63:build_server_span_attributesrequiresservice_nameand always sets"rpc.service": service_name. - Check
tests/shared/test_otel.py(lines 40–47): the test assertsclient_span["attributes"]["rpc.system"] == "mcp"andrpc.method, but has no assertion forclient_span["attributes"]["rpc.service"], confirming the attribute is absent and untested on the client side.
- Add
| def build_client_span_attributes( | ||
| *, | ||
| method: str, | ||
| request_id: str | int, | ||
| params: dict[str, Any] | None = None, | ||
| ) -> dict[str, Any]: | ||
| """Build OTel attributes for an MCP client request span.""" | ||
| attributes: dict[str, Any] = { | ||
| "rpc.system": MCP_RPC_SYSTEM, | ||
| "rpc.method": method, | ||
| "mcp.method.name": method, | ||
| "jsonrpc.request.id": request_id, | ||
| } | ||
|
|
||
| resource_uri = None | ||
| if params is not None: | ||
| resource_uri = params.get("uri") | ||
| if resource_uri is None: |
There was a problem hiding this comment.
🔴 build_client_span_attributes has no session_id parameter, so client spans can never carry the mcp.session.id attribute while server spans for the same session can — creating an observability asymmetry. The PR description's claim that 'mcp.session.id is surfaced on the client side through the streamable HTTP write stream wrapper' is factually incorrect: commit 67ca7fe in this same branch explicitly removed that wrapper without adding a replacement. To fix this, add a session_id parameter to build_client_span_attributes and thread the session ID from ClientStreamableHTTPTransport.session_id through BaseSession.send_request.
Extended reasoning...
What the bug is and how it manifests
build_client_span_attributes (src/mcp/shared/_otel.py lines 40-57) accepts only method, request_id, and params — there is no session_id parameter. This means that no matter what the calling code does, client OTel spans can never carry the mcp.session.id attribute. By contrast, build_server_span_attributes (lines 60-84) accepts an explicit session_id parameter and conditionally sets attributes["mcp.session.id"] when it is non-None. The two functions are structurally asymmetric for session ID support.
The specific code path that triggers it
The streamable HTTP client transport (src/mcp/client/streamable_http.py, ClientStreamableHTTPTransport) maintains self.session_id and includes it in every outgoing request via the mcp-session-id HTTP header. However, this value is never passed to BaseSession.send_request (src/mcp/shared/session.py:276), which calls build_client_span_attributes with only method, request_id, and the request params dict. There is no pathway to supply a session_id to the client span builder.
Why existing code does not prevent it
Commit 67ca7fe ('Remove client OTel session-id wrapper') in this same PR branch deliberately removed the _SessionAwareWriteStream wrapper class and the _get_transport_session_id() mechanism that previously solved this problem. Commit 2c4b51a ('Restore OTel span attribute helpers') then re-added build_client_span_attributes without the session_id parameter. The TODO comment left by the author in client/streamable_http.py acknowledges the removal was intentional ('I have dropped the get_session_id callback because it breaks the Transport protocol') but offers no replacement. The PR description was not updated to reflect this change, and instead continues to claim the old mechanism still works.
What the impact would be
In distributed tracing across real deployments, client and server processes are separate. An operator querying their tracing backend (e.g. Jaeger, Honeycomb) by mcp.session.id to correlate all activity for a given session would find server spans but miss every client span, making the trace incomplete and the session timeline unreconstructable. This is the core use case for session ID on spans.
How to fix it
Add a session_id: str | None = None parameter to build_client_span_attributes with the same conditional-set logic as build_server_span_attributes. Then thread the session ID from the transport into BaseSession.send_request — either by passing it as a parameter to send_request, or by making BaseSession aware of the transport's session ID via a callback or property on the write stream (the original approach, now removed).
Step-by-step proof
- src/mcp/shared/_otel.py lines 40-57: build_client_span_attributes signature is (*, method: str, request_id: str | int, params: dict[str, Any] | None = None) — no session_id parameter.
- src/mcp/shared/_otel.py lines 60-84: build_server_span_attributes has session_id: str | None = None and sets attributes["mcp.session.id"] = session_id when non-None.
- src/mcp/shared/session.py line 276: send_request calls build_client_span_attributes(method=..., request_id=..., params=...) with no session_id.
- Commit 67ca7fe removed _SessionAwareWriteStream (the previous client-side session ID mechanism). Commit 2c4b51a restored build_client_span_attributes without restoring the session_id parameter.
- The PR description's Testing section references 'resource_read_spans_include_session_id or streamable_http_client_resource_read', but no such test exists in tests/shared/test_streamable_http.py — confirming the client-side session ID surfacing was never validated after the wrapper removal.
Summary
Testing
Notes