Skip to content

feat(google-vertex): add new models [bot]#977

Open
models-bot[bot] wants to merge 2 commits intomainfrom
bot/add-google-vertex-20260508-000417
Open

feat(google-vertex): add new models [bot]#977
models-bot[bot] wants to merge 2 commits intomainfrom
bot/add-google-vertex-20260508-000417

Conversation

@models-bot
Copy link
Copy Markdown
Contributor

@models-bot models-bot Bot commented May 8, 2026

Auto-generated by model-addition-agent for provider google-vertex.


Note

Low Risk
Low risk: this PR only adds new model metadata/config YAMLs and does not change runtime logic.

Overview
Adds new Google Vertex model definition YAMLs for google/gemini-3.1-flash-lite and moonshotai/kimi-k2-6.

The gemini-3.1-flash-lite config includes pricing by region, supported features/modalities, large context/token limits, and marks the model as preview with thinking enabled; kimi-k2-6 is added as a minimal stub with mode: unknown.

Reviewed by Cursor Bugbot for commit 877b9a7. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

/test-models

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 877b9a7. Configure here.

Comment thread providers/google-vertex/google/gemini-3.1-flash-lite.yaml
@harshiv-26
Copy link
Copy Markdown
Collaborator

Gateway test results

  • Total: 20
  • Passed: 10
  • Failed: 10
  • Validation failed: 0
  • Errored: 0
  • Skipped: 0
  • Success rate: 50.0%
Provider Model Scenarios
google-vertex google/gemini-3.1-flash-lite success: structured-output:stream, json-output, tool-call, params, json-output:stream, params:stream, tool-call:stream, structured-output, reasoning:stream, reasoning

failure: structured-output:google-genai, tool-call:google-genai, reasoning:stream:google-genai, reasoning:google-genai, params:stream:google-genai, structured-output:stream:google-genai, json-output:google-genai, params:google-genai, json-output:stream:google-genai, tool-call:stream:google-genai
Failures (10)

google-vertex/google/gemini-3.1-flash-lite — structured-output:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6qrfmifd/snippet.py", line 46, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

response_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
        "participants": {
            "type": "array",
            "items": {"type": "string"},
        },
    },
    "required": ["name", "date", "participants"],
}

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Alice and Bob are going to a science fair on Friday.")]),
]

config = types.GenerateContentConfig(
    system_instruction="Extract the event information as a structured CalendarEvent JSON object.",
    response_mime_type="application/json",
    response_json_schema=response_schema,
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

print(response.text)

import json as _json

_text = response.text
print(_text)

if not _text:
    raise Exception("VALIDATION FAILED: structured-output - GenAI response text is empty")

_parsed = _json.loads(_text)
print(_json.dumps(_parsed, indent=2))

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output - 'participants' is not a list, schema not enforced")

print("VALIDATION: structured-output SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — tool-call:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp8al4xk1k/snippet.py", line 49, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6316, in generate_content
    return self._generate_content(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

get_weather = types.FunctionDeclaration(
    name="get_weather",
    description="Get the current weather for a location.",
    parameters_json_schema={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city name, e.g. London",
            },
        },
        "required": ["location"],
    },
)

tool = types.Tool(function_declarations=[get_weather])

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant with access to tools. You MUST strictly use the provided tools to answer. Never respond with plain text when a tool is available.",
    tools=[tool],
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

for part in response.candidates[0].content.parts:
    if part.function_call:
        print(f"Tool: {part.function_call.name}")
        print(f"Args: {part.function_call.args}")
    elif part.text:
        print(part.text)

_parts = response.candidates[0].content.parts
_function_calls = [p for p in _parts if p.function_call]

if _function_calls:
    for _fc in _function_calls:
        print(f"Tool: {_fc.function_call.name}")
        print(f"Args: {_fc.function_call.args}")
else:
    _text_parts = [p.text for p in _parts if p.text]
    print("\n".join(_text_parts))

if not _function_calls:
    raise Exception("VALIDATION FAILED: tool-call - no function calls in GenAI response")
print("VALIDATION: tool-call SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — reasoning:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpdu3dq82s/snippet.py", line 36, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="How to calculate 3^3^3^3? Think step by step and show all reasoning.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps.",
    thinking_config=types.ThinkingConfig(
        include_thoughts=True,
        thinking_budget=5000,
    ),
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.candidates and chunk.candidates[0].content and chunk.candidates[0].content.parts:
        for part in chunk.candidates[0].content.parts:
            if not part.text:
                continue
            if part.thought:
                print(f"[Thinking] {part.text}", end="", flush=True)
            else:
                print(part.text, end="", flush=True)

_thought_detected = False
for _chunk in _chunks:
    if not _chunk.candidates or not _chunk.candidates[0].content:
        continue
    for _part in _chunk.candidates[0].content.parts:
        if not _part.text:
            continue
        if _part.thought:
            _thought_detected = True
            print(_part.text, end="", flush=True)
        else:
            print(_part.text, end="", flush=True)

if not _thought_detected:
    _usage = getattr(_chunks[-1], "usage_metadata", None) if _chunks else None
    if _usage and getattr(_usage, "thoughts_token_count", 0):
        _thought_detected = True

if not _thought_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no thinking information in GenAI stream")
print("\nVALIDATION: reasoning stream SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — reasoning:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmptuxvvdr5/snippet.py", line 35, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="How to calculate 3^3^3^3? Think step by step and show all reasoning.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps.",
    thinking_config=types.ThinkingConfig(
        include_thoughts=True,
        thinking_budget=5000,
    ),
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

for part in response.candidates[0].content.parts:
    if not part.text:
        continue
    if part.thought:
        print(f"[Thinking] {part.text}")
    else:
        print(part.text)

_parts = response.candidates[0].content.parts
_thought_detected = False

for _part in _parts:
    if not _part.text:
        continue
    if _part.thought:
        _thought_detected = True
        print(f"Thinking: {_part.text[:200]}...")
    else:
        print(_part.text)

_usage = getattr(response, "usage_metadata", None)
if _usage and getattr(_usage, "thoughts_token_count", 0):
    _thought_detected = True

if not _thought_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no thinking information in GenAI response")
print("VALIDATION: reasoning SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — params:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpz5ecwe_w/snippet.py", line 34, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="What is the capital of France?")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant.",
    max_output_tokens=256,
    temperature=0.7,
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.text:
        print(chunk.text, end="", flush=True)

google-vertex/google/gemini-3.1-flash-lite — structured-output:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmprhk_c90f/snippet.py", line 47, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

response_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
        "participants": {
            "type": "array",
            "items": {"type": "string"},
        },
    },
    "required": ["name", "date", "participants"],
}

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Alice and Bob are going to a science fair on Friday.")]),
]

config = types.GenerateContentConfig(
    system_instruction="Extract the event information as a structured CalendarEvent JSON object.",
    response_mime_type="application/json",
    response_json_schema=response_schema,
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.text:
        print(chunk.text, end="", flush=True)

import json as _json

_accumulated = ""
for _chunk in _chunks:
    if _chunk.text:
        _accumulated += _chunk.text

if not _accumulated:
    raise Exception("VALIDATION FAILED: structured-output stream - no content received from GenAI stream")

_parsed = _json.loads(_accumulated)
print(_json.dumps(_parsed, indent=2))

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output stream - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output stream - 'participants' is not a list, schema not enforced")

print("\nVALIDATION: structured-output stream SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — json-output:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp5cz_24cc/snippet.py", line 32, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="List 3 colors with their hex codes in JSON.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. Respond in JSON format.",
    response_mime_type="application/json",
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

print(response.text)

import json as _json

_text = response.text
print(_text)

if not _text:
    raise Exception("VALIDATION FAILED: json-output - GenAI response text is empty")

_json.loads(_text)
print("VALIDATION: json-output SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — params:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp86idxugi/snippet.py", line 33, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="What is the capital of France?")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant.",
    max_output_tokens=256,
    temperature=0.7,
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)

google-vertex/google/gemini-3.1-flash-lite — json-output:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp91nhgd2e/snippet.py", line 33, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="List 3 colors with their hex codes in JSON.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. Respond in JSON format.",
    response_mime_type="application/json",
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.text:
        print(chunk.text, end="", flush=True)

import json as _json

_accumulated = ""
for _chunk in _chunks:
    if _chunk.text:
        _accumulated += _chunk.text

if not _accumulated:
    raise Exception("VALIDATION FAILED: json-output stream - no content received from GenAI stream")

_json.loads(_accumulated)
print("\nVALIDATION: json-output stream SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — tool-call:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpcn90ep7s/snippet.py", line 50, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6478, in generate_content_stream
    yield from self._generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

get_weather = types.FunctionDeclaration(
    name="get_weather",
    description="Get the current weather for a location.",
    parameters_json_schema={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city name, e.g. London",
            },
        },
        "required": ["location"],
    },
)

tool = types.Tool(function_declarations=[get_weather])

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant with access to tools. You MUST strictly use the provided tools to answer. Never respond with plain text when a tool is available.",
    tools=[tool],
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.candidates and chunk.candidates[0].content and chunk.candidates[0].content.parts:
        for part in chunk.candidates[0].content.parts:
            if part.function_call:
                print(f"Tool: {part.function_call.name}", flush=True)
                print(f"Args: {part.function_call.args}", flush=True)
            elif part.text:
                print(part.text, end="", flush=True)

_tool_use_detected = False
for _chunk in _chunks:
    if not _chunk.candidates or not _chunk.candidates[0].content:
        continue
    for _part in _chunk.candidates[0].content.parts:
        if _part.function_call:
            _tool_use_detected = True
            print(f"Tool: {_part.function_call.name}", flush=True)
            print(f"Args: {_part.function_call.args}", flush=True)
        elif _part.text:
            print(_part.text, end="", flush=True)

if not _tool_use_detected:
    raise Exception("VALIDATION FAILED: tool-call stream - no function calls in GenAI stream")
print("\nVALIDATION: tool-call stream SUCCESS")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant