fix: set use_native_token_count default to false#2284

Open
opieter-aws wants to merge 1 commit into strands-agents:main from opieter-aws:opieter-aws/set-default-false

Conversation

@opieter-aws
Contributor

@opieter-aws opieter-aws commented May 12, 2026

Description

Changes the default value of use_native_token_count from True to False across all model providers (Bedrock, Anthropic, Gemini, LlamaCpp, OpenAI Responses).

Since v1.38.0, the event loop calls count_tokens() before every model invocation. When use_native_token_count=True (the previous default), this makes an additional network round-trip that processes the full message payload, including images, before the actual inference call. For multimodal/image workloads, this adds 25-50% per-invocation latency with no indication of the cause.

This change makes native token counting opt-in rather than opt-out, so users must explicitly set use_native_token_count=True to enable the native API calls.
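The opt-in behavior described above can be sketched as follows. This is an illustrative sketch only: the function and helper names are assumptions, not the actual strands-agents implementation.

```python
# Hypothetical sketch of the opt-in check. Names are illustrative;
# they do not mirror the real strands-agents provider code.

def count_tokens(messages, config):
    """Return a token count, using the native API only when opted in."""
    # `is not True` treats both None (key absent) and False as opt-out,
    # so the native network call only runs when explicitly enabled.
    if config.get("use_native_token_count") is not True:
        return estimate_tokens_locally(messages)
    return call_native_count_api(messages)

def estimate_tokens_locally(messages):
    # Crude local heuristic: roughly 4 characters per token of text.
    text = " ".join(str(m) for m in messages)
    return max(1, len(text) // 4)

def call_native_count_api(messages):
    # Placeholder for the provider's network round-trip.
    raise NotImplementedError("network call elided in this sketch")
```

With this shape, an agent that never sets the flag pays no extra round-trip, while setting use_native_token_count=True restores the previous behavior.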

Related Issues

Fixes #2277

Documentation PR

N/A

Type of Change

Bug fix

Testing

How have you tested the change? Verified that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli.

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@opieter-aws opieter-aws deployed to manual-approval May 12, 2026 20:57 — with GitHub Actions Active
@opieter-aws opieter-aws changed the title fix: Set use_native_token_count default to false fix: set use_native_token_count default to false May 12, 2026
@codecov

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@opieter-aws opieter-aws marked this pull request as ready for review May 12, 2026 21:00
@github-actions

Assessment: Approve

Clean, well-scoped fix that correctly changes use_native_token_count from opt-out to opt-in across all 5 model providers with native token counting support. The logic change from is False to is not True properly handles the None (unset) case, and the approach aligns with tenet #4 ("The obvious path is the happy path"): users shouldn't silently pay a 25-50% latency penalty by default.

Review Details
  • Correctness: The is not True check correctly treats both None (key absent) and False as "skip native counting," which is the right behavior for making this opt-in.
  • Consistency: All 5 providers with count_tokens overrides are updated uniformly. The other providers (OpenAI chat, LiteLLM, etc.) don't override count_tokens and are unaffected.
  • Testing: New test_skip_native_api_by_default tests verify the default behavior for each provider, and existing tests are properly updated to pass use_native_token_count=True explicitly.
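The default-behavior tests described above might look something like the following pytest-style sketch. The helper and test bodies here are assumptions for illustration; the PR's actual tests exercise the real provider classes.

```python
# Illustrative sketch of a "skip native API by default" test.
# `count_tokens` and its signature are hypothetical stand-ins for
# the provider methods the PR actually tests.
from unittest.mock import MagicMock

def count_tokens(messages, config, native_api):
    # Opt-in check: None (unset) and False both skip the native call.
    if config.get("use_native_token_count") is not True:
        return max(1, len(" ".join(messages)) // 4)
    return native_api(messages)

def test_skip_native_api_by_default():
    native_api = MagicMock()
    count_tokens(["hello world"], {}, native_api)
    # With the flag unset, the native API must never be invoked.
    native_api.assert_not_called()

def test_native_api_when_opted_in():
    native_api = MagicMock(return_value=42)
    assert count_tokens(["hello world"],
                        {"use_native_token_count": True}, native_api) == 42
    native_api.assert_called_once()
```

Asserting that the mocked native endpoint is never called is what makes the opt-in default verifiable without network access.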

Nice fix addressing a real user-reported latency regression.



Development

Successfully merging this pull request may close these issues.

[BUG] use_native_token_count=True default causes silent latency regression for image/multimodal workloads
