docs: add LLM metrics migration guide for Python SDK v3 #1570

Draft
shrey150 wants to merge 2 commits into main from shrey/check-v3-metrics-docs

shrey150 commented Jan 20, 2026

Summary

Document the changes to LLM metrics access in Stagehand Python SDK v3, including how metrics are now accessed via the usage field on operation responses instead of a separate replay endpoint.

Changes

  • Added section "10. LLM Metrics" with metric name mapping and code examples
  • Updated quick reference table with metrics mapping
  • Added troubleshooting entry for the 404 endpoint issue
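
As a rough illustration of the access pattern described in the summary, the sketch below reads metrics from the `usage` field on an operation response. It does not import the SDK: the response is a stand-in built with `SimpleNamespace`, the helper name `read_usage` is hypothetical, and the numbers are made up. It also assumes that `cached_input_tokens` and `reasoning_tokens` may be absent on some responses.

```python
from types import SimpleNamespace


def read_usage(response):
    """Collect the usage metrics from an operation response into a dict."""
    usage = response.data.result.usage
    return {
        "input_tokens": usage.input_tokens,
        "output_tokens": usage.output_tokens,
        "inference_time_ms": usage.inference_time_ms,
        # Assumption: the newer fields may be missing, so default to None.
        "cached_input_tokens": getattr(usage, "cached_input_tokens", None),
        "reasoning_tokens": getattr(usage, "reasoning_tokens", None),
    }


# Stand-in for the object an act() call might return (illustrative values only).
fake_response = SimpleNamespace(
    data=SimpleNamespace(
        result=SimpleNamespace(
            usage=SimpleNamespace(
                input_tokens=1200,
                output_tokens=85,
                inference_time_ms=930,
            )
        )
    )
)

metrics = read_usage(fake_response)
print(metrics)
```

With a real client, the same helper would presumably be passed the return value of an `act()` call in place of `fake_response`.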

Test plan

  • Verify documentation renders correctly
  • Check that code examples are accurate and runnable
  • Confirm the quick reference table is complete

🤖 Generated with Claude Code


Summary by cubic

Adds a migration guide for LLM metrics in Python SDK v3, moving metrics from the old replay endpoint to the usage field on operation responses. Includes code examples and updates to help teams upgrade quickly.

  • Migration
    • New “LLM Metrics” section with v2→v3 name mapping (input_tokens, output_tokens, inference_time_ms).
    • Example showing metrics access via response.data.result.usage on act().
    • Documents new fields: cached_input_tokens and reasoning_tokens.
    • Updated quick reference and troubleshooting for 404 on the deprecated metrics endpoint.
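
Because metrics now arrive per response rather than from a single endpoint, callers who want session-wide totals may need to sum them themselves. The sketch below shows one way to do that. `UsageTotals` is a hypothetical helper, not part of the SDK, and the usage objects are stand-ins with made-up values. Missing or `None` fields are treated as zero, which is an assumption.

```python
from dataclasses import dataclass, fields
from types import SimpleNamespace


@dataclass
class UsageTotals:
    """Running totals over the usage objects of several operations."""

    input_tokens: int = 0
    output_tokens: int = 0
    inference_time_ms: int = 0
    cached_input_tokens: int = 0
    reasoning_tokens: int = 0

    def add(self, usage) -> None:
        # Assumption: a field that is absent or None contributes zero.
        for field in fields(self):
            value = getattr(usage, field.name, None) or 0
            setattr(self, field.name, getattr(self, field.name) + value)


totals = UsageTotals()
# Stand-ins for response.data.result.usage from two operations.
totals.add(SimpleNamespace(input_tokens=100, output_tokens=20, inference_time_ms=500))
totals.add(
    SimpleNamespace(
        input_tokens=50,
        output_tokens=10,
        inference_time_ms=300,
        cached_input_tokens=40,
        reasoning_tokens=None,
    )
)
print(totals)
```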

Written for commit a8be963. Summary will update on new commits.

Document the changes to LLM metrics access in v3, including:
- New usage field on operation responses (act, observe, extract, execute)
- Metric name mapping (input_tokens, output_tokens, inference_time_ms)
- New fields for cached_input_tokens and reasoning_tokens
- Updated quick reference and troubleshooting section

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

changeset-bot Bot commented Jan 20, 2026

⚠️ No Changeset found

Latest commit: a8be963

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 1 file

greptile-apps Bot commented Jan 20, 2026

Greptile Summary

Documents how LLM metrics access changed in Python SDK v3, where metrics moved from a separate stagehand.metrics property or replay endpoint to being included directly in the usage field of operation responses.

Key Changes

  • Metric Name Mappings: Documented the renaming of metrics (act_prompt_tokens → input_tokens, etc.) and introduced new fields (cached_input_tokens, reasoning_tokens)
  • Access Pattern: Showed how to access metrics via response.data.result.usage instead of stagehand.metrics
  • Code Examples: Provided both basic act() operation and execute() agent examples with proper snake_case naming conventions
  • Quick Reference Update: Added metrics mapping to the method mapping table
  • Troubleshooting: Added entry explaining the 404 error users may encounter when trying to use the old replay endpoint
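
For code that still expects the old metric names (dashboards, log parsers), a small translation layer can ease the transition. The sketch below is only illustrative. The `act_prompt_tokens` to `input_tokens` pairing comes from the summary above; the other two v2 names are assumptions and should be checked against the migration guide. The function name is hypothetical.

```python
# Mapping of v2 act() metric names to v3 usage field names.
V2_ACT_TO_V3 = {
    "act_prompt_tokens": "input_tokens",           # stated in the summary above
    "act_completion_tokens": "output_tokens",      # assumed v2 name
    "act_inference_time_ms": "inference_time_ms",  # assumed v2 name
}


def legacy_act_metrics(usage: dict) -> dict:
    """Re-key a v3 usage dict under the v2 act() metric names."""
    return {v2_name: usage.get(v3_name) for v2_name, v3_name in V2_ACT_TO_V3.items()}


# Illustrative v3 usage values, not real output.
v3_usage = {"input_tokens": 1200, "output_tokens": 85, "inference_time_ms": 930}
legacy = legacy_act_metrics(v3_usage)
print(legacy)
```

The new fields `cached_input_tokens` and `reasoning_tokens` are left out of the mapping on purpose, since the summary describes them as new in v3.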

The documentation is well-structured, uses correct Python syntax with snake_case, and provides clear migration guidance for users transitioning from v2 to v3.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Documentation-only changes that accurately document the LLM metrics migration path from v2 to v3. Code examples follow proper Python conventions (snake_case), align with existing SDK documentation patterns, and provide clear, actionable guidance. No functional code changes.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| packages/docs/v3/migrations/python.mdx | Added comprehensive LLM metrics migration section with metric mappings, code examples, and troubleshooting entry |

Sequence Diagram

sequenceDiagram
    participant User as Developer
    participant Client as Stagehand Client
    participant API as Stagehand API
    participant LLM as LLM Service

    Note over User,LLM: LLM Metrics Migration (v2 → v3)

    rect rgb(240, 240, 240)
        Note right of User: Old SDK (v2) Flow
        User->>Client: stagehand.metrics
        Client-->>User: Metrics from replay endpoint
    end

    rect rgb(220, 255, 220)
        Note right of User: New SDK (v3) Flow
        User->>Client: client.sessions.act(...)
        Client->>API: POST /sessions/{id}/act
        API->>LLM: Process act request
        LLM-->>API: Response + usage metrics
        API-->>Client: Response with usage field
        Client-->>User: response.data.result.usage
        Note over User: Access metrics directly:<br/>- input_tokens<br/>- output_tokens<br/>- inference_time_ms<br/>- cached_input_tokens<br/>- reasoning_tokens
    end

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>