feat(xai): update model YAMLs [bot] #1201
/test-models
Gateway test results
Failures (6)
Code snippet
import boto3
from botocore.config import Config

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_model = "test-v2-xai/grok-3-mini-high"

# Bedrock runtime client pointed at the gateway; the AWS credentials are placeholders.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    endpoint_url=_endpoint,
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
    config=Config(inject_host_prefix=False),
)

def _add_auth_header(request, **kwargs):
    # Attach the gateway API key before the request is signed.
    request.headers["x-tfy-api-key"] = _api_key

client.meta.events.register("before-sign.bedrock-runtime.*", _add_auth_header)

messages = [
    {"role": "user", "content": [{"text": "Hi"}]},
    {"role": "assistant", "content": [{"text": "Hi, how can I help you"}]},
    {"role": "user", "content": [{"text": "How to calculate 3^3^3^3? Think step by step and show all reasoning."}]},
]
system = [{"text": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."}]
response = client.converse_stream(
    modelId=_model,
    system=system,
    messages=messages,
)

# Collect every stream event, echoing reasoning and text deltas as they arrive.
_events = []
for _event in response["stream"]:
    _events.append(_event)
    if "contentBlockDelta" in _event:
        _delta = _event["contentBlockDelta"].get("delta", {})
        if "reasoningContent" in _delta:
            print(_delta["reasoningContent"].get("text", ""), end="", flush=True)
        if "text" in _delta:
            print(_delta["text"], end="", flush=True)

# Validate that the stream carried reasoning content or reasoning token usage.
_reasoning_detected = False
for _event in _events:
    if "contentBlockDelta" in _event:
        _delta = _event["contentBlockDelta"].get("delta", {})
        if "text" in _delta:
            print(_delta["text"], end="", flush=True)
        if "reasoningContent" in _delta:
            _reasoning_detected = True
            _reasoning = _delta["reasoningContent"]
            if "text" in _reasoning:
                print(_reasoning["text"], end="", flush=True)
    if "contentBlockStart" in _event:
        _start = _event["contentBlockStart"].get("start", {})
        if "reasoningContent" in _start:
            _reasoning_detected = True
    if "metadata" in _event:
        _usage = _event["metadata"].get("usage", {})
        if _usage.get("reasoning_tokens") or _usage.get("reasoningTokens"):
            _reasoning_detected = True
if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in Bedrock stream")
print("\nVALIDATION: reasoning stream SUCCESS")
Code snippet
import boto3
from botocore.config import Config

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_model = "test-v2-xai/grok-3-mini-high"

# Bedrock runtime client pointed at the gateway; the AWS credentials are placeholders.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    endpoint_url=_endpoint,
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
    config=Config(inject_host_prefix=False),
)

def _add_auth_header(request, **kwargs):
    # Attach the gateway API key before the request is signed.
    request.headers["x-tfy-api-key"] = _api_key

client.meta.events.register("before-sign.bedrock-runtime.*", _add_auth_header)

messages = [
    {"role": "user", "content": [{"text": "Hi"}]},
    {"role": "assistant", "content": [{"text": "Hi, how can I help you"}]},
    {"role": "user", "content": [{"text": "How to calculate 3^3^3^3? Think step by step and show all reasoning."}]},
]
system = [{"text": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."}]
response = client.converse(
    modelId=_model,
    system=system,
    messages=messages,
)

# Print reasoning and text blocks as returned.
_content = response["output"]["message"]["content"]
for _block in _content:
    if "reasoningContent" in _block:
        print(_block["reasoningContent"]["reasoningText"]["text"])
    if "text" in _block:
        print(_block["text"])

# Validate that the response carries reasoning content or reasoning token usage.
_content = response["output"]["message"]["content"]
_reasoning_detected = False
for _block in _content:
    if "text" in _block:
        print(_block["text"])
    if "reasoningContent" in _block:
        _reasoning_detected = True
        _reasoning = _block["reasoningContent"]
        if "reasoningText" in _reasoning:
            print(f"Reasoning: {_reasoning['reasoningText']['text'][:200]}...")
_usage = response.get("usage", {})
if _usage.get("reasoning_tokens") or _usage.get("reasoningTokens"):
    _reasoning_detected = True
if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in Bedrock response")
print("VALIDATION: reasoning SUCCESS")
Code snippet
import boto3
from botocore.config import Config

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_model = "test-v2-xai/grok-3-mini-fast-high"

# Bedrock runtime client pointed at the gateway; the AWS credentials are placeholders.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    endpoint_url=_endpoint,
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
    config=Config(inject_host_prefix=False),
)

def _add_auth_header(request, **kwargs):
    # Attach the gateway API key before the request is signed.
    request.headers["x-tfy-api-key"] = _api_key

client.meta.events.register("before-sign.bedrock-runtime.*", _add_auth_header)

messages = [
    {"role": "user", "content": [{"text": "Hi"}]},
    {"role": "assistant", "content": [{"text": "Hi, how can I help you"}]},
    {"role": "user", "content": [{"text": "How to calculate 3^3^3^3? Think step by step and show all reasoning."}]},
]
system = [{"text": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."}]
response = client.converse(
    modelId=_model,
    system=system,
    messages=messages,
)

# Print reasoning and text blocks as returned.
_content = response["output"]["message"]["content"]
for _block in _content:
    if "reasoningContent" in _block:
        print(_block["reasoningContent"]["reasoningText"]["text"])
    if "text" in _block:
        print(_block["text"])

# Validate that the response carries reasoning content or reasoning token usage.
_content = response["output"]["message"]["content"]
_reasoning_detected = False
for _block in _content:
    if "text" in _block:
        print(_block["text"])
    if "reasoningContent" in _block:
        _reasoning_detected = True
        _reasoning = _block["reasoningContent"]
        if "reasoningText" in _reasoning:
            print(f"Reasoning: {_reasoning['reasoningText']['text'][:200]}...")
_usage = response.get("usage", {})
if _usage.get("reasoning_tokens") or _usage.get("reasoningTokens"):
    _reasoning_detected = True
if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in Bedrock response")
print("VALIDATION: reasoning SUCCESS")
Code snippet
import boto3
from botocore.config import Config

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_model = "test-v2-xai/grok-3-mini-fast-high"

# Bedrock runtime client pointed at the gateway; the AWS credentials are placeholders.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    endpoint_url=_endpoint,
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
    config=Config(inject_host_prefix=False),
)

def _add_auth_header(request, **kwargs):
    # Attach the gateway API key before the request is signed.
    request.headers["x-tfy-api-key"] = _api_key

client.meta.events.register("before-sign.bedrock-runtime.*", _add_auth_header)

messages = [
    {"role": "user", "content": [{"text": "Hi"}]},
    {"role": "assistant", "content": [{"text": "Hi, how can I help you"}]},
    {"role": "user", "content": [{"text": "How to calculate 3^3^3^3? Think step by step and show all reasoning."}]},
]
system = [{"text": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."}]
response = client.converse_stream(
    modelId=_model,
    system=system,
    messages=messages,
)

# Collect every stream event, echoing reasoning and text deltas as they arrive.
_events = []
for _event in response["stream"]:
    _events.append(_event)
    if "contentBlockDelta" in _event:
        _delta = _event["contentBlockDelta"].get("delta", {})
        if "reasoningContent" in _delta:
            print(_delta["reasoningContent"].get("text", ""), end="", flush=True)
        if "text" in _delta:
            print(_delta["text"], end="", flush=True)

# Validate that the stream carried reasoning content or reasoning token usage.
_reasoning_detected = False
for _event in _events:
    if "contentBlockDelta" in _event:
        _delta = _event["contentBlockDelta"].get("delta", {})
        if "text" in _delta:
            print(_delta["text"], end="", flush=True)
        if "reasoningContent" in _delta:
            _reasoning_detected = True
            _reasoning = _delta["reasoningContent"]
            if "text" in _reasoning:
                print(_reasoning["text"], end="", flush=True)
    if "contentBlockStart" in _event:
        _start = _event["contentBlockStart"].get("start", {})
        if "reasoningContent" in _start:
            _reasoning_detected = True
    if "metadata" in _event:
        _usage = _event["metadata"].get("usage", {})
        if _usage.get("reasoning_tokens") or _usage.get("reasoningTokens"):
            _reasoning_detected = True
if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in Bedrock stream")
print("\nVALIDATION: reasoning stream SUCCESS")
Code snippet
import boto3
from botocore.config import Config

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_model = "test-v2-xai/grok-3-mini-high-beta"

# Bedrock runtime client pointed at the gateway; the AWS credentials are placeholders.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    endpoint_url=_endpoint,
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
    config=Config(inject_host_prefix=False),
)

def _add_auth_header(request, **kwargs):
    # Attach the gateway API key before the request is signed.
    request.headers["x-tfy-api-key"] = _api_key

client.meta.events.register("before-sign.bedrock-runtime.*", _add_auth_header)

messages = [
    {"role": "user", "content": [{"text": "Hi"}]},
    {"role": "assistant", "content": [{"text": "Hi, how can I help you"}]},
    {"role": "user", "content": [{"text": "How to calculate 3^3^3^3? Think step by step and show all reasoning."}]},
]
system = [{"text": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."}]
response = client.converse(
    modelId=_model,
    system=system,
    messages=messages,
)

# Print reasoning and text blocks as returned.
_content = response["output"]["message"]["content"]
for _block in _content:
    if "reasoningContent" in _block:
        print(_block["reasoningContent"]["reasoningText"]["text"])
    if "text" in _block:
        print(_block["text"])

# Validate that the response carries reasoning content or reasoning token usage.
_content = response["output"]["message"]["content"]
_reasoning_detected = False
for _block in _content:
    if "text" in _block:
        print(_block["text"])
    if "reasoningContent" in _block:
        _reasoning_detected = True
        _reasoning = _block["reasoningContent"]
        if "reasoningText" in _reasoning:
            print(f"Reasoning: {_reasoning['reasoningText']['text'][:200]}...")
_usage = response.get("usage", {})
if _usage.get("reasoning_tokens") or _usage.get("reasoningTokens"):
    _reasoning_detected = True
if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in Bedrock response")
print("VALIDATION: reasoning SUCCESS")
Code snippet
import boto3
from botocore.config import Config

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_model = "test-v2-xai/grok-3-mini-high-beta"

# Bedrock runtime client pointed at the gateway; the AWS credentials are placeholders.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    endpoint_url=_endpoint,
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
    config=Config(inject_host_prefix=False),
)

def _add_auth_header(request, **kwargs):
    # Attach the gateway API key before the request is signed.
    request.headers["x-tfy-api-key"] = _api_key

client.meta.events.register("before-sign.bedrock-runtime.*", _add_auth_header)

messages = [
    {"role": "user", "content": [{"text": "Hi"}]},
    {"role": "assistant", "content": [{"text": "Hi, how can I help you"}]},
    {"role": "user", "content": [{"text": "How to calculate 3^3^3^3? Think step by step and show all reasoning."}]},
]
system = [{"text": "You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps."}]
response = client.converse_stream(
    modelId=_model,
    system=system,
    messages=messages,
)

# Collect every stream event, echoing reasoning and text deltas as they arrive.
_events = []
for _event in response["stream"]:
    _events.append(_event)
    if "contentBlockDelta" in _event:
        _delta = _event["contentBlockDelta"].get("delta", {})
        if "reasoningContent" in _delta:
            print(_delta["reasoningContent"].get("text", ""), end="", flush=True)
        if "text" in _delta:
            print(_delta["text"], end="", flush=True)

# Validate that the stream carried reasoning content or reasoning token usage.
_reasoning_detected = False
for _event in _events:
    if "contentBlockDelta" in _event:
        _delta = _event["contentBlockDelta"].get("delta", {})
        if "text" in _delta:
            print(_delta["text"], end="", flush=True)
        if "reasoningContent" in _delta:
            _reasoning_detected = True
            _reasoning = _delta["reasoningContent"]
            if "text" in _reasoning:
                print(_reasoning["text"], end="", flush=True)
    if "contentBlockStart" in _event:
        _start = _event["contentBlockStart"].get("start", {})
        if "reasoningContent" in _start:
            _reasoning_detected = True
    if "metadata" in _event:
        _usage = _event["metadata"].get("usage", {})
        if _usage.get("reasoning_tokens") or _usage.get("reasoningTokens"):
            _reasoning_detected = True
if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in Bedrock stream")
print("\nVALIDATION: reasoning stream SUCCESS")

Successes (46)
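The six failing snippets apply the same three checks for reasoning information: a reasoningContent delta or block, a reasoningContent block start, or a non-zero reasoning token count in usage. A minimal sketch of those checks as two reusable functions follows. The function names and the sample events are hypothetical and hand-written for illustration; they are not part of the test harness or captured gateway output.

```python
from typing import Any, Dict, Iterable


def stream_has_reasoning(events: Iterable[Dict[str, Any]]) -> bool:
    """Return True if any converse_stream event carries reasoning information."""
    for event in events:
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "reasoningContent" in delta:
            return True
        start = event.get("contentBlockStart", {}).get("start", {})
        if "reasoningContent" in start:
            return True
        usage = event.get("metadata", {}).get("usage", {})
        if usage.get("reasoning_tokens") or usage.get("reasoningTokens"):
            return True
    return False


def response_has_reasoning(response: Dict[str, Any]) -> bool:
    """Return True if a converse response carries reasoning information."""
    content = response.get("output", {}).get("message", {}).get("content", [])
    if any("reasoningContent" in block for block in content):
        return True
    usage = response.get("usage", {})
    return bool(usage.get("reasoning_tokens") or usage.get("reasoningTokens"))


# Hand-written sample events (assumed shapes, mirroring the keys the snippets read).
text_only = [{"contentBlockDelta": {"delta": {"text": "27"}}}]
with_reasoning = text_only + [
    {"contentBlockDelta": {"delta": {"reasoningContent": {"text": "3^3 = 27"}}}}
]
usage_only = {"output": {"message": {"content": [{"text": "ok"}]}},
              "usage": {"reasoningTokens": 12}}

print(stream_has_reasoning(text_only))       # False
print(stream_has_reasoning(with_reasoning))  # True
print(response_has_reasoning(usage_only))    # True
```

Under this sketch, a stream containing only text deltas is reported as lacking reasoning, which is the condition that raises "VALIDATION FAILED" in the snippets above.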
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 3ef7309.
status: active
supportedModes:
  - chat
thinking: true
Sources field accidentally removed from model YAML
Medium Severity
The sources field (previously pointing to https://docs.x.ai/developers/models/grok-3-mini-fast) was deleted from grok-3-mini-high.yaml. The two sibling model files updated in this same PR (grok-3-mini-fast-high.yaml and grok-3-mini-high-beta.yaml) both retain their sources field. This looks like an unintentional removal during the update, causing loss of provenance metadata for this model.
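A check of this kind can be automated. The sketch below assumes each model YAML has already been parsed into a dict (for example with a YAML library) and flags files where a required key is missing or empty. The function name, the file contents, and the placeholder URL are illustrative assumptions, not the repository's actual data.

```python
from typing import Any, Dict, List


def files_missing_key(models: Dict[str, Dict[str, Any]], key: str = "sources") -> List[str]:
    """Return the sorted file names whose parsed YAML lacks a non-empty `key`."""
    return sorted(name for name, data in models.items() if not data.get(key))


# Hypothetical parsed contents; only the presence or absence of `sources` matters here.
parsed_models = {
    "grok-3-mini-high.yaml": {"status": "active", "thinking": True},
    "grok-3-mini-fast-high.yaml": {
        "status": "active",
        "thinking": True,
        "sources": ["https://example.com/placeholder"],
    },
    "grok-3-mini-high-beta.yaml": {
        "status": "active",
        "thinking": True,
        "sources": ["https://example.com/placeholder"],
    },
}

print(files_missing_key(parsed_models))  # ['grok-3-mini-high.yaml']
```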
output_cost_per_token: 2.5e-6
region: "*"
features:
  - prompt_caching
Incomplete features list added to grok-4.3 model
Medium Severity
The newly added features section in grok-4.3.yaml only includes prompt_caching, while the corresponding grok-4.3-latest.yaml (same model, same pricing) declares function_calling, tool_choice, structured_output, system_messages, and prompt_caching. Adding an explicit but incomplete features list is potentially worse than having none, because consumers may treat the presence of the key as authoritative and conclude the model lacks capabilities it actually supports.
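To make the gap concrete, the following sketch compares the two feature lists named in this comment and reports what the shorter one lacks. The function name is hypothetical and the dicts stand in for the parsed YAML files; only the feature names come from the comment.

```python
from typing import Any, Dict, List


def missing_features(candidate: Dict[str, Any], reference: Dict[str, Any]) -> List[str]:
    """Return features declared by `reference` but absent from `candidate`, in reference order."""
    have = set(candidate.get("features", []))
    return [feature for feature in reference.get("features", []) if feature not in have]


# Stand-ins for the parsed grok-4.3.yaml and grok-4.3-latest.yaml feature lists.
grok_4_3 = {"features": ["prompt_caching"]}
grok_4_3_latest = {
    "features": [
        "function_calling",
        "tool_choice",
        "structured_output",
        "system_messages",
        "prompt_caching",
    ]
}

print(missing_features(grok_4_3, grok_4_3_latest))
# ['function_calling', 'tool_choice', 'structured_output', 'system_messages']
```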


Auto-generated by poc-agent for provider xai.

Note
Low Risk
Declarative provider catalog changes only; no runtime code paths, though cost estimates may shift where tiered pricing and new flags are applied.
Overview
Auto-generated updates to several xAI Grok model YAML entries so routing, limits, and billing metadata match current provider docs.
Grok 3 Mini variants (grok-3-mini-fast-high, grok-3-mini-high-beta, grok-3-mini-high) now declare a 1M token context_window, provisioning: serverless, supportedModes: [chat], and thinking: true. grok-3-mini-high additionally marks prompt_caching, adds tiered pricing above 200k tokens for cache read / input / output, and drops its sources block.

grok-4.3 gains prompt_caching in features; other fields in the diff are unchanged.
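As an illustration of how tiered pricing above 200k tokens can shift a cost estimate, here is a minimal sketch. It assumes the higher tier applies to the whole request once input tokens exceed the threshold; the PR does not state how the gateway applies the tier, and every rate below is a made-up placeholder, not xAI pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-token prices in USD (hypothetical values)."""
    input: float
    output: float
    cache_read: float


def estimate_cost(input_tokens: int, output_tokens: int, cache_read_tokens: int,
                  base: Rates, tiered: Rates, threshold: int = 200_000) -> float:
    """Price the whole request at `tiered` rates when input exceeds `threshold` (assumed rule)."""
    rates = tiered if input_tokens > threshold else base
    return (input_tokens * rates.input
            + output_tokens * rates.output
            + cache_read_tokens * rates.cache_read)


# Placeholder rates chosen only to show the tier switch.
BASE = Rates(input=1e-6, output=2e-6, cache_read=0.25e-6)
TIERED = Rates(input=2e-6, output=4e-6, cache_read=0.5e-6)

below = estimate_cost(100_000, 1_000, 0, BASE, TIERED)
above = estimate_cost(300_000, 1_000, 0, BASE, TIERED)
print(f"below threshold: ${below:.4f}, above threshold: ${above:.4f}")
```

If the provider instead bills only the tokens beyond the threshold at the higher rate, the function would need a marginal calculation rather than a single rate switch.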