fix(openai): align reasoning and cache token attributes with OTel GenAI spec #4130
Conversation

CLA check: Dvir Rezenman does not appear to be a GitHub user. A GitHub account is required to sign the CLA; if you already have a GitHub account, please add the email address used for this commit to your account.
📝 Walkthrough

This PR enhances token tracking across OpenTelemetry instrumentation by renaming a semantic convention constant for clarity, adding cache-read and reasoning-token extraction to the OpenAI Agents instrumentor, and aligning attribute references across OpenAI and Traceloop SDK code.

Changes: Token Tracking & Semantic Convention Alignment
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

⚠️ Review ran into problems. 🔥 Problems: Git failed to clone the repository.
Actionable comments posted: 5
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (1)
Lines 554-557: ⚡ Quick win: select the target span by attribute instead of positional index.

Using `response_spans[0]` is brittle when multiple `openai.responses` spans are emitted; pick the span that actually has the asserted attribute.

More deterministic span selection:

```diff
- response_span = response_spans[0]
+ response_span = next(
+     (s for s in response_spans if GenAIAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS in s.attributes),
+     None,
+ )
+ assert response_span is not None
```

```diff
- response_span = response_spans[0]
+ response_span = next(
+     (s for s in response_spans if SpanAttributes.GEN_AI_USAGE_REASONING_OUTPUT_TOKENS in s.attributes),
+     None,
+ )
+ assert response_span is not None
```

Also applies to: 574-577
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py` around lines 554 - 557, The test currently picks a span by position (response_spans[0]) which is brittle; replace that with a deterministic selection that finds the span by the attribute you assert later (use response_spans and the specific attribute key used in the assertions). For example, replace uses of response_spans[0] in the test with code that selects the span via next(s for s in response_spans if '<ATTRIBUTE_KEY>' in s.attributes or s.attributes.get('<ATTRIBUTE_KEY>') == '<EXPECTED_VALUE>'), where <ATTRIBUTE_KEY> is the attribute name asserted later in the test; do this for both occurrences (around the response_spans usage at the two ranges noted) so the test picks the span by attribute, not index.
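To make the suggestion concrete, here is a minimal runnable sketch of attribute-based span selection. `FakeSpan`, the span names, and the token counts are hypothetical stand-ins for the exported `ReadableSpan` objects in the real test; only the `next()`-with-default pattern itself is the point.

```python
# Hypothetical sketch: select the span carrying the asserted attribute
# rather than relying on emission order. FakeSpan stands in for an
# exported ReadableSpan; the attribute key is the spec-aligned name.
ATTR = "gen_ai.usage.cache_read.input_tokens"


class FakeSpan:
    def __init__(self, name, attributes):
        self.name = name
        self.attributes = attributes


response_spans = [
    FakeSpan("openai.response", {"gen_ai.usage.input_tokens": 12}),
    FakeSpan("openai.response", {ATTR: 64, "gen_ai.usage.input_tokens": 80}),
]

# next() with a None default avoids an IndexError and makes a missing
# attribute fail on the explicit assertion instead.
response_span = next(
    (s for s in response_spans if ATTR in s.attributes),
    None,
)
assert response_span is not None
print(response_span.attributes[ATTR])  # → 64
```

The same pattern applies to the reasoning-token assertion; only the attribute key changes.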
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 21e8e29b-0fa9-4aff-8aaf-b153db91fba1
⛔ Files ignored due to path filters (28), all excluded by `!**/*.lock`:

- `packages/opentelemetry-instrumentation-agno/uv.lock`
- `packages/opentelemetry-instrumentation-anthropic/uv.lock`
- `packages/opentelemetry-instrumentation-bedrock/uv.lock`
- `packages/opentelemetry-instrumentation-chromadb/uv.lock`
- `packages/opentelemetry-instrumentation-cohere/uv.lock`
- `packages/opentelemetry-instrumentation-crewai/uv.lock`
- `packages/opentelemetry-instrumentation-groq/uv.lock`
- `packages/opentelemetry-instrumentation-haystack/uv.lock`
- `packages/opentelemetry-instrumentation-lancedb/uv.lock`
- `packages/opentelemetry-instrumentation-marqo/uv.lock`
- `packages/opentelemetry-instrumentation-mcp/uv.lock`
- `packages/opentelemetry-instrumentation-milvus/uv.lock`
- `packages/opentelemetry-instrumentation-mistralai/uv.lock`
- `packages/opentelemetry-instrumentation-ollama/uv.lock`
- `packages/opentelemetry-instrumentation-openai-agents/uv.lock`
- `packages/opentelemetry-instrumentation-openai/uv.lock`
- `packages/opentelemetry-instrumentation-pinecone/uv.lock`
- `packages/opentelemetry-instrumentation-qdrant/uv.lock`
- `packages/opentelemetry-instrumentation-replicate/uv.lock`
- `packages/opentelemetry-instrumentation-sagemaker/uv.lock`
- `packages/opentelemetry-instrumentation-together/uv.lock`
- `packages/opentelemetry-instrumentation-transformers/uv.lock`
- `packages/opentelemetry-instrumentation-vertexai/uv.lock`
- `packages/opentelemetry-instrumentation-voyageai/uv.lock`
- `packages/opentelemetry-instrumentation-watsonx/uv.lock`
- `packages/opentelemetry-instrumentation-writer/uv.lock`
- `packages/sample-app/uv.lock`
- `packages/traceloop-sdk/uv.lock`
📒 Files selected for processing (13)
- `packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py`
- `packages/opentelemetry-instrumentation-openai-agents/pyproject.toml`
- `packages/opentelemetry-instrumentation-openai-agents/tests/cassettes/test_openai_agents/test_cache_read_input_tokens_recorded.yaml`
- `packages/opentelemetry-instrumentation-openai-agents/tests/cassettes/test_openai_agents/test_reasoning_output_tokens_recorded.yaml`
- `packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py`
- `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py`
- `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py`
- `packages/opentelemetry-instrumentation-openai/tests/traces/test_azure.py`
- `packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py`
- `packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py`
- `packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/_testing.py`
- `packages/traceloop-sdk/tests/test_manual.py`
- `packages/traceloop-sdk/traceloop/sdk/tracing/manual.py`
```yaml
set-cookie:
  - __cf_bm=GqvMTTwxqvHlYO_zMGr15vhngiHBKowc4W9cytloB1U-1778513729.5012145-1.0.1.1-TuWmsWma1J3OWFenDizlhbf_5qCYt89CjxNLLDDm.zLzVxhm2ep.2mEgG6rExy5xHOM0Quwrvd09cIn.yYTv.TO2jCa5bit4sQJ6ZIeHevWLiejQyIXZYqEUyXQItPzY;
    HttpOnly; SameSite=None; Secure; Path=/; Domain=api.openai.com; Expires=Mon,
    11 May 2026 16:05:32 GMT
```
Remove cookie material from test cassette.
This cassette stores request Cookie and response set-cookie values. Please redact/filter these headers and re-record to avoid leaking sensitive session artifacts and to improve replay stability.
Also applies to: 165-168, 190-191
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@packages/opentelemetry-instrumentation-openai-agents/tests/cassettes/test_openai_agents/test_cache_read_input_tokens_recorded.yaml`
around lines 90 - 93, Remove all Cookie and set-cookie header values from the
cassette test_openai_agents/test_cache_read_input_tokens_recorded.yaml (and the
other occurrences noted) by redacting those header lines and re-recording the
cassette; ensure the test harness that generates cassettes filters or normalizes
"Cookie" request headers and "set-cookie" response headers before saving (update
the cassette recording/filtering logic to strip or replace these headers with a
stable placeholder) so future recordings do not include session cookies and
re-run the tests to commit the sanitized cassette.
```yaml
Cookie:
  - __cf_bm=GqvMTTwxqvHlYO_zMGr15vhngiHBKowc4W9cytloB1U-1778513729.5012145-1.0.1.1-TuWmsWma1J3OWFenDizlhbf_5qCYt89CjxNLLDDm.zLzVxhm2ep.2mEgG6rExy5xHOM0Quwrvd09cIn.yYTv.TO2jCa5bit4sQJ6ZIeHevWLiejQyIXZYqEUyXQItPzY
```
Scrub sensitive/volatile headers from cassette recordings.
Cookie and request/infra identifiers should not be committed in cassettes; they are sensitive/ephemeral and make fixtures less stable. Please filter these headers during VCR recording and regenerate this cassette.
Also applies to: 60-64, 104-105
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@packages/opentelemetry-instrumentation-openai-agents/tests/cassettes/test_openai_agents/test_reasoning_output_tokens_recorded.yaml`
around lines 16 - 17, The cassette
test_openai_agents/test_reasoning_output_tokens_recorded.yaml contains
sensitive/volatile headers (e.g., Cookie and request/infra identifiers) that
must be filtered from VCR recordings; update the test recording configuration
used for this suite to strip or replace Cookie and any request/infra ID headers
(e.g., CF identifiers, X-Request-Id) during capture, re-record the cassette, and
commit the regenerated YAML so the cassette no longer contains those sensitive
values.
```python
@pytest.mark.vcr
def test_cache_read_input_tokens_recorded(exporter, test_agent):
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
# Locate the new cassettes referenced by these tests
fd -a 'test_(cache_read_input_tokens_recorded|reasoning_output_tokens_recorded)\.yaml' \
  packages/opentelemetry-instrumentation-openai-agents/tests/cassettes
# Scan cassettes for common secret/PII indicators (review any hits; placeholders/masked values are acceptable)
rg -n --iglob '*.yaml' -e 'Authorization|Bearer\s+[A-Za-z0-9\-_\.]+|api[_-]?key|sk-[A-Za-z0-9]|@|email' \
  packages/opentelemetry-instrumentation-openai-agents/tests/cassettes
```

Repository: traceloop/openllmetry
Length of output: 365
🏁 Script executed:
```shell
#!/bin/bash
# Check the cassette files for any sensitive content
cd packages/opentelemetry-instrumentation-openai-agents/tests/cassettes/test_openai_agents
# Show file sizes and examine key sections of the cassettes
for cassette in test_cache_read_input_tokens_recorded.yaml test_reasoning_output_tokens_recorded.yaml; do
  if [ -f "$cassette" ]; then
    echo "=== $cassette ==="
    wc -l "$cassette"
    echo "--- First 50 lines ---"
    head -50 "$cassette"
    echo "--- Looking for potential secrets (detailed check) ---"
    rg -n 'sk_|Bearer|Authorization|api.?key|token|secret|password' "$cassette" || echo "No matches found"
  fi
done
```

Repository: traceloop/openllmetry
Length of output: 6564
🏁 Script executed:
```shell
#!/bin/bash
# Check the test file to see if VCR filters are configured
cd packages/opentelemetry-instrumentation-openai-agents/tests
# Search for VCR decorator and filter configuration around the test methods
rg -B5 -A10 'def test_(cache_read_input_tokens_recorded|reasoning_output_tokens_recorded)' test_openai_agents.py
```

Repository: traceloop/openllmetry
Length of output: 1274
🏁 Script executed:
```shell
#!/bin/bash
# Look for VCR configuration in conftest files
fd conftest.py packages/opentelemetry-instrumentation-openai-agents/tests | head -5
# Check the conftest for VCR filter setup
if [ -f packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py ]; then
  echo "=== conftest.py content ==="
  cat -n packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py
fi
```

Repository: traceloop/openllmetry
Length of output: 11337
🏁 Script executed:
```shell
#!/bin/bash
# Also check for pytest.ini or setup.cfg with VCR config
find packages/opentelemetry-instrumentation-openai-agents/tests -maxdepth 1 \
  \( -name "pytest.ini" -o -name "setup.cfg" -o -name "pyproject.toml" \) \
  -exec echo "=== {} ===" \; -exec cat {} \;
```

Repository: traceloop/openllmetry
Length of output: 47
VCR cassettes need improved filtering configuration and re-recording.
The conftest.py VCR config (lines 284–286) only filters ["authorization", "api-key"] headers, but the cassettes contain unfiltered CloudFlare bot management cookies (e.g., __cf_bm=... in test_reasoning_output_tokens_recorded.yaml). Add "cookie" to the filter_headers list and re-record the cassettes to ensure all session/tracking data is properly scrubbed.
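A hedged sketch of the suggested conftest.py change. In the real test suite this would be the `vcr_config` pytest fixture consumed by pytest-recording; it is shown here as a plain function so the sketch is runnable standalone, and the exact existing filter list is an assumption based on the review findings.

```python
# Sketch of the suggested conftest.py change: extend the VCR header
# filters so session cookies never reach committed cassettes. In
# conftest.py this function would carry an @pytest.fixture decorator.
def vcr_config():
    return {
        "filter_headers": [
            "authorization",
            "api-key",
            "cookie",      # scrub CloudFlare __cf_bm request cookies
            "set-cookie",  # scrub session cookies from responses
        ],
    }


config = vcr_config()
print(config["filter_headers"])
```

After adding the new entries, the affected cassettes still need re-recording, since filtering only applies at capture time.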
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py`
around lines 544 - 545, The VCR configuration in conftest.py only filters
["authorization", "api-key"] and misses CloudFlare cookies, which are present in
cassettes like test_reasoning_output_tokens_recorded.yaml; update the VCR
config's filter_headers list to include "cookie" (i.e., modify the
filter_headers variable/setting in the VCR setup in conftest.py to add "cookie")
and then re-run the test suite to re-record the affected cassettes so
session/tracking cookies are scrubbed from the YAML files.
```python
    """Cache read tokens should be > 0 on the second call — OpenAI writes to cache
    on the first call and serves from it on the second."""
```
Cache-hit assertion is weaker than the test’s stated intent.
Line 559 uses >= 0, but the docstring (Lines 546-547) explicitly says the second call should be a cache read (> 0). This can let a non-cache-hit pass.
Proposed tightening:

```diff
- assert response_span.attributes[GenAIAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS] >= 0
+ assert response_span.attributes[GenAIAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS] > 0
```

Also applies to: 559-559
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py`
around lines 546 - 547, The test's cache-hit assertion currently uses the
expression ">= 0" which is too weak; locate the assertion that checks for a
cache hit (the line asserting ">= 0") in tests/test_openai_agents.py and change
it to use "> 0" so the second call is enforced as a cache read per the test
docstring; update any variable names involved (e.g., cache_hit_count or similar)
accordingly and run the test to confirm it now fails when there is no cache hit.
```diff
 GEN_AI_CONTENT_COMPLETION_CHUNK = "gen_ai.content.completion.chunk"
 GEN_AI_REQUEST_REASONING_EFFORT = "gen_ai.request.reasoning_effort"
-GEN_AI_USAGE_REASONING_TOKENS = "gen_ai.usage.reasoning_tokens"
+GEN_AI_USAGE_REASONING_OUTPUT_TOKENS = "gen_ai.usage.reasoning.output_tokens"
```
Preserve backward compatibility for the renamed reasoning constant.
Dropping SpanAttributes.GEN_AI_USAGE_REASONING_TOKENS can break consumers importing that symbol. Keep a deprecated alias pointing to the new key to avoid runtime import breaks.
Proposed compatibility patch:

```diff
 class SpanAttributes:
 @@
     GEN_AI_USAGE_REASONING_OUTPUT_TOKENS = "gen_ai.usage.reasoning.output_tokens"
+    # Backward-compatible alias (deprecated): remove in next major version
+    GEN_AI_USAGE_REASONING_TOKENS = GEN_AI_USAGE_REASONING_OUTPUT_TOKENS
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
GEN_AI_USAGE_REASONING_OUTPUT_TOKENS = "gen_ai.usage.reasoning.output_tokens"
# Backward-compatible alias (deprecated): remove in next major version
GEN_AI_USAGE_REASONING_TOKENS = GEN_AI_USAGE_REASONING_OUTPUT_TOKENS
```
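A runnable sketch of the alias approach suggested above. The class-level alias covers the plain backward-compatibility case; the optional DeprecationWarning on access would need a module-level `__getattr__` (PEP 562) or a metaclass, which is omitted here.

```python
class SpanAttributes:
    GEN_AI_USAGE_REASONING_OUTPUT_TOKENS = "gen_ai.usage.reasoning.output_tokens"
    # Backward-compatible alias (deprecated): remove in the next major version.
    GEN_AI_USAGE_REASONING_TOKENS = GEN_AI_USAGE_REASONING_OUTPUT_TOKENS


# Old references keep working and resolve to the spec-aligned string.
print(SpanAttributes.GEN_AI_USAGE_REASONING_TOKENS)
# → gen_ai.usage.reasoning.output_tokens
```

Since both names are bound to the same string object, code comparing attribute keys by value behaves identically whichever constant a consumer imports.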
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py`
at line 82, Add a deprecated alias for the renamed constant so existing imports
still work: in opentelemetry.semconv_ai.__init__.py (next to
GEN_AI_USAGE_REASONING_OUTPUT_TOKENS) define the old symbol
SpanAttributes.GEN_AI_USAGE_REASONING_TOKENS (or simply
GEN_AI_USAGE_REASONING_TOKENS if constants are module-level) and assign it the
same string value as GEN_AI_USAGE_REASONING_OUTPUT_TOKENS, include a brief
comment noting it is deprecated and optionally emit a DeprecationWarning when
accessed to guide users to the new constant name.
Reviewed the change — direction is right (aligning attribute names with the OTel GenAI spec). A few things worth addressing:

1. Two sources of truth for the cache constants: the same constants exist in both places, resolving to the same string today. The PR mixes both; I'd lean toward standardizing on the upstream OTel-native […]

2. […]
Fixes #4060
Problem:
`gen_ai.usage.reasoning_tokens` was the wrong attribute name — the OTel GenAI spec defines it as `gen_ai.usage.reasoning.output_tokens`. Similarly, the `LLM_USAGE_CACHE_READ_INPUT_TOKENS` constant resolved to `gen_ai.usage.cache_read_input_tokens` (underscore) instead of the spec-correct `gen_ai.usage.cache_read.input_tokens` (dot). The openai-agents instrumentor also wasn't recording cache read or reasoning tokens at all. The `opentelemetry-semantic-conventions` floor was too old (0.60b1) to provide the official upstream cache token constants.
Fix:
Bumped `opentelemetry-semantic-conventions` to >=0.62b1 so cache token constants come from the upstream OTel package. Added `GEN_AI_USAGE_REASONING_OUTPUT_TOKENS = "gen_ai.usage.reasoning.output_tokens"` to semconv_ai and removed the old incorrect constant. Updated the openai and openai-agents instrumentors to use the corrected constants. Added two new tests with recorded cassettes for cache read and reasoning token recording in openai-agents.
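The renames described above can be summarized in a small sketch. The mapping restates the description; `migrate_attributes` is a hypothetical helper for illustration, not part of the PR.

```python
# Spec-aligned attribute keys after this change (old -> new).
RENAMES = {
    "gen_ai.usage.reasoning_tokens": "gen_ai.usage.reasoning.output_tokens",
    "gen_ai.usage.cache_read_input_tokens": "gen_ai.usage.cache_read.input_tokens",
}


def migrate_attributes(attrs):
    """Rewrite deprecated keys on a span-attribute dict to the spec names."""
    return {RENAMES.get(k, k): v for k, v in attrs.items()}


old = {"gen_ai.usage.reasoning_tokens": 7, "gen_ai.usage.input_tokens": 100}
print(migrate_attributes(old))
# → {'gen_ai.usage.reasoning.output_tokens': 7, 'gen_ai.usage.input_tokens': 100}
```

Keys not listed in `RENAMES` pass through unchanged, so the helper is safe to run over spans that already use the new names.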