feat(google-vertex): add new models [bot] by models-bot[bot] · Pull Request #977 · truefoundry/models

models-bot · 2026-05-08T00:04:19Z

Auto-generated by model-addition-agent for provider google-vertex.

Note

Low Risk
Low risk: this PR only adds new model metadata/config YAMLs and does not change runtime logic.

Overview
Adds new Google Vertex model definition YAMLs for google/gemini-3.1-flash-lite and moonshotai/kimi-k2-6.

The gemini-3.1-flash-lite config includes pricing by region, supported features/modalities, large context/token limits, and marks the model as preview with thinking enabled; kimi-k2-6 is added as a minimal stub with mode: unknown.

Reviewed by Cursor Bugbot for commit 877b9a7.

github-actions · 2026-05-08T00:15:12Z

/test-models

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


harshiv-26 · 2026-05-08T00:22:37Z

Gateway test results

Total: 20
Passed: 10
Failed: 10
Validation failed: 0
Errored: 0
Skipped: 0
Success rate: 50.0%

Provider	Model	Result	Scenarios
`google-vertex`	`google/gemini-3.1-flash-lite`	success (10)	structured-output:stream, json-output, tool-call, params, json-output:stream, params:stream, tool-call:stream, structured-output, reasoning:stream, reasoning
`google-vertex`	`google/gemini-3.1-flash-lite`	failure (10)	structured-output:google-genai, tool-call:google-genai, reasoning:stream:google-genai, reasoning:google-genai, params:stream:google-genai, structured-output:stream:google-genai, json-output:google-genai, params:google-genai, json-output:stream:google-genai, tool-call:stream:google-genai
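
The scenario names in the table suggest that the split between passing and failing runs follows the client variant rather than the feature under test. The sketch below checks that reading using the names copied from the table. The helper `uses_genai_client` is a hypothetical name introduced only for this illustration; it is not part of the test harness.

```python
# Scenario names copied from the gateway test results table.
passed = [
    "structured-output:stream", "json-output", "tool-call", "params",
    "json-output:stream", "params:stream", "tool-call:stream",
    "structured-output", "reasoning:stream", "reasoning",
]
failed = [
    "structured-output:google-genai", "tool-call:google-genai",
    "reasoning:stream:google-genai", "reasoning:google-genai",
    "params:stream:google-genai", "structured-output:stream:google-genai",
    "json-output:google-genai", "params:google-genai",
    "json-output:stream:google-genai", "tool-call:stream:google-genai",
]

SUFFIX = ":google-genai"


def uses_genai_client(scenario: str) -> bool:
    """Hypothetical helper: True when the scenario name ends in the google-genai suffix."""
    return scenario.endswith(SUFFIX)


all_failures_are_genai = all(uses_genai_client(s) for s in failed)
no_pass_is_genai = not any(uses_genai_client(s) for s in passed)

# Strip the suffix and compare: the same ten base scenarios appear on both sides.
same_scenarios = sorted(s[: -len(SUFFIX)] for s in failed) == sorted(passed)

print(all_failures_are_genai, no_pass_is_genai, same_scenarios)
```

If all three values are true, each base scenario passed in its unsuffixed form and failed only in its `:google-genai` form, which points at something specific to the google-genai request path rather than at any single feature.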

Failures (10)

google-vertex/google/gemini-3.1-flash-lite — structured-output:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp6qrfmifd/snippet.py", line 46, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

response_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
        "participants": {
            "type": "array",
            "items": {"type": "string"},
        },
    },
    "required": ["name", "date", "participants"],
}

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Alice and Bob are going to a science fair on Friday.")]),
]

config = types.GenerateContentConfig(
    system_instruction="Extract the event information as a structured CalendarEvent JSON object.",
    response_mime_type="application/json",
    response_json_schema=response_schema,
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

print(response.text)

import json as _json

_text = response.text
print(_text)

if not _text:
    raise Exception("VALIDATION FAILED: structured-output - GenAI response text is empty")

_parsed = _json.loads(_text)
print(_json.dumps(_parsed, indent=2))

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output - 'participants' is not a list, schema not enforced")

print("VALIDATION: structured-output SUCCESS")
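
A possible reading of the 403 above: the snippet reduces `_full_model` to its last path segment before calling the gateway, so the `google/` segment is dropped. The sketch below repeats the snippet's own string handling in isolation. The function name `derive_gateway_model` and the way the account and model ID are joined back together are assumptions made for illustration. Whether the dropped segment is the real cause of the authorization error is not confirmed by the report.

```python
def derive_gateway_model(full_model: str) -> tuple[str, str]:
    """Mirror the snippet's parsing of `_full_model` (hypothetical helper name)."""
    parts = full_model.split("/")
    provider_account = parts[0]
    model_id = "/".join(parts[1:])
    # Same rule as the snippet: keep only the final path segment.
    if "/" in model_id:
        model_id = model_id.rsplit("/", 1)[-1]
    return provider_account, model_id


account, model_id = derive_gateway_model("test-v2-vertex/google/gemini-3.1-flash-lite")

# Assumption: the gateway reports the model as "<account>/<model_id>".
resolved = f"{account}/{model_id}"
print(resolved)  # test-v2-vertex/gemini-3.1-flash-lite
```

The resolved name matches the one quoted in the error message (`test-v2-vertex/gemini-3.1-flash-lite`), while the model under test is named `test-v2-vertex/google/gemini-3.1-flash-lite`. If the gateway registers the model under the longer name, the lookup would miss, which would be consistent with the "or model does not exist" half of the message.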

google-vertex/google/gemini-3.1-flash-lite — tool-call:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp8al4xk1k/snippet.py", line 49, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6316, in generate_content
    return self._generate_content(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

get_weather = types.FunctionDeclaration(
    name="get_weather",
    description="Get the current weather for a location.",
    parameters_json_schema={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city name, e.g. London",
            },
        },
        "required": ["location"],
    },
)

tool = types.Tool(function_declarations=[get_weather])

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant with access to tools. You MUST strictly use the provided tools to answer. Never respond with plain text when a tool is available.",
    tools=[tool],
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

for part in response.candidates[0].content.parts:
    if part.function_call:
        print(f"Tool: {part.function_call.name}")
        print(f"Args: {part.function_call.args}")
    elif part.text:
        print(part.text)

_parts = response.candidates[0].content.parts
_function_calls = [p for p in _parts if p.function_call]

if _function_calls:
    for _fc in _function_calls:
        print(f"Tool: {_fc.function_call.name}")
        print(f"Args: {_fc.function_call.args}")
else:
    _text_parts = [p.text for p in _parts if p.text]
    print("\n".join(_text_parts))

if not _function_calls:
    raise Exception("VALIDATION FAILED: tool-call - no function calls in GenAI response")
print("VALIDATION: tool-call SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — reasoning:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpdu3dq82s/snippet.py", line 36, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="How to calculate 3^3^3^3? Think step by step and show all reasoning.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps.",
    thinking_config=types.ThinkingConfig(
        include_thoughts=True,
        thinking_budget=5000,
    ),
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.candidates and chunk.candidates[0].content and chunk.candidates[0].content.parts:
        for part in chunk.candidates[0].content.parts:
            if not part.text:
                continue
            if part.thought:
                print(f"[Thinking] {part.text}", end="", flush=True)
            else:
                print(part.text, end="", flush=True)

_thought_detected = False
for _chunk in _chunks:
    if not _chunk.candidates or not _chunk.candidates[0].content:
        continue
    for _part in _chunk.candidates[0].content.parts:
        if not _part.text:
            continue
        if _part.thought:
            _thought_detected = True
            print(_part.text, end="", flush=True)
        else:
            print(_part.text, end="", flush=True)

if not _thought_detected:
    _usage = getattr(_chunks[-1], "usage_metadata", None) if _chunks else None
    if _usage and getattr(_usage, "thoughts_token_count", 0):
        _thought_detected = True

if not _thought_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no thinking information in GenAI stream")
print("\nVALIDATION: reasoning stream SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — reasoning:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmptuxvvdr5/snippet.py", line 35, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="How to calculate 3^3^3^3? Think step by step and show all reasoning.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. You MUST think step by step and show your reasoning. Never skip reasoning steps.",
    thinking_config=types.ThinkingConfig(
        include_thoughts=True,
        thinking_budget=5000,
    ),
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

for part in response.candidates[0].content.parts:
    if not part.text:
        continue
    if part.thought:
        print(f"[Thinking] {part.text}")
    else:
        print(part.text)

_parts = response.candidates[0].content.parts
_thought_detected = False

for _part in _parts:
    if not _part.text:
        continue
    if _part.thought:
        _thought_detected = True
        print(f"Thinking: {_part.text[:200]}...")
    else:
        print(_part.text)

_usage = getattr(response, "usage_metadata", None)
if _usage and getattr(_usage, "thoughts_token_count", 0):
    _thought_detected = True

if not _thought_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no thinking information in GenAI response")
print("VALIDATION: reasoning SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — params:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpz5ecwe_w/snippet.py", line 34, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="What is the capital of France?")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant.",
    max_output_tokens=256,
    temperature=0.7,
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.text:
        print(chunk.text, end="", flush=True)

google-vertex/google/gemini-3.1-flash-lite — structured-output:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmprhk_c90f/snippet.py", line 47, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

response_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
        "participants": {
            "type": "array",
            "items": {"type": "string"},
        },
    },
    "required": ["name", "date", "participants"],
}

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Alice and Bob are going to a science fair on Friday.")]),
]

config = types.GenerateContentConfig(
    system_instruction="Extract the event information as a structured CalendarEvent JSON object.",
    response_mime_type="application/json",
    response_json_schema=response_schema,
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.text:
        print(chunk.text, end="", flush=True)

import json as _json

_accumulated = ""
for _chunk in _chunks:
    if _chunk.text:
        _accumulated += _chunk.text

if not _accumulated:
    raise Exception("VALIDATION FAILED: structured-output stream - no content received from GenAI stream")

_parsed = _json.loads(_accumulated)
print(_json.dumps(_parsed, indent=2))

if "name" not in _parsed or "date" not in _parsed or "participants" not in _parsed:
    raise Exception("VALIDATION FAILED: structured-output stream - missing expected fields (name, date, participants)")

if not isinstance(_parsed.get("participants"), list):
    raise Exception("VALIDATION FAILED: structured-output stream - 'participants' is not a list, schema not enforced")

print("\nVALIDATION: structured-output stream SUCCESS")
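
The snippet above derives its request target from `_full_model` before calling the gateway. A minimal sketch of that derivation in isolation (the function name `derive_gateway_target` is hypothetical and not part of the test harness) shows that the `google/` segment is dropped, which is consistent with the `test-v2-vertex/gemini-3.1-flash-lite` name reported in the 403 message. The log does not establish whether this is the cause of the failure; the error text allows for either a missing permission or an unregistered model name.

```python
def derive_gateway_target(full_model: str, endpoint: str) -> tuple[str, str]:
    """Reproduce the snippet's parsing; returns (base_url, model_id)."""
    parts = full_model.split("/")
    provider_account = parts[0]          # first segment is the provider account
    model_id = "/".join(parts[1:])       # remainder, e.g. "google/gemini-3.1-flash-lite"
    if "/" in model_id:
        # Keep only the last path segment, discarding the "google/" prefix.
        model_id = model_id.rsplit("/", 1)[-1]
    base_url = f"{endpoint}/gemini/{provider_account}/proxy"
    return base_url, model_id


base_url, model_id = derive_gateway_target(
    "test-v2-vertex/google/gemini-3.1-flash-lite",
    "https://internal.devtest.truefoundry.tech/api/llm",
)
print(base_url)
print(model_id)
```

With the values from the log, the model sent to the proxy is `gemini-3.1-flash-lite` under the `test-v2-vertex` account, not `google/gemini-3.1-flash-lite`.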

google-vertex/google/gemini-3.1-flash-lite — json-output:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp5cz_24cc/snippet.py", line 32, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="List 3 colors with their hex codes in JSON.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. Respond in JSON format.",
    response_mime_type="application/json",
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

print(response.text)

import json as _json

_text = response.text
print(_text)

if not _text:
    raise Exception("VALIDATION FAILED: json-output - GenAI response text is empty")

_json.loads(_text)
print("VALIDATION: json-output SUCCESS")
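
Every `google-genai` failure in this report ends in the same `ClientError` line. As a rough sketch, assuming the text after the status prefix is a Python dict repr (as it appears in the log), the payload can be parsed to confirm that the failure originates at the gateway's authorization layer rather than in the model's output. The helper name `parse_client_error` is hypothetical.

```python
import ast


def parse_client_error(line: str) -> dict:
    """Extract the dict payload that follows the status prefix in the error line."""
    _prefix, brace, rest = line.partition("{")
    if not brace:
        raise ValueError("no payload found in error line")
    # literal_eval only accepts Python literals, so no code in the log is executed.
    return ast.literal_eval(brace + rest)


# Error line copied from the traceback above.
raw = (
    "google.genai.errors.ClientError: 403 failure. "
    "{'status': 'failure', "
    "'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model "
    "test-v2-vertex/gemini-3.1-flash-lite or model does not exist', "
    "'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model "
    "test-v2-vertex/gemini-3.1-flash-lite or model does not exist', "
    "'type': 'AuthorizationError', 'code': '403'}, "
    "'error_origin_level': 'authorization'}"
)

payload = parse_client_error(raw)
print(payload["error"]["type"])
print(payload["error"]["code"])
print(payload["error_origin_level"])
```

The request is rejected before any generation takes place, so the validation code at the end of each snippet is never reached.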

google-vertex/google/gemini-3.1-flash-lite — params:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp86idxugi/snippet.py", line 33, in <module>
    response = client.models.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6331, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4767, in _generate_content
    response = self._api_client.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1605, in request
    response = self._request(http_request, http_options, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="What is the capital of France?")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant.",
    max_output_tokens=256,
    temperature=0.7,
)

response = client.models.generate_content(
    model=_model_id,
    contents=contents,
    config=config,
)

for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)

google-vertex/google/gemini-3.1-flash-lite — json-output:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp91nhgd2e/snippet.py", line 33, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6509, in generate_content_stream
    for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="List 3 colors with their hex codes in JSON.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant. Respond in JSON format.",
    response_mime_type="application/json",
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.text:
        print(chunk.text, end="", flush=True)

import json as _json

_accumulated = ""
for _chunk in _chunks:
    if _chunk.text:
        _accumulated += _chunk.text

if not _accumulated:
    raise Exception("VALIDATION FAILED: json-output stream - no content received from GenAI stream")

_json.loads(_accumulated)
print("\nVALIDATION: json-output stream SUCCESS")

google-vertex/google/gemini-3.1-flash-lite — tool-call:stream:google-genai (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpcn90ep7s/snippet.py", line 50, in <module>
    for chunk in client.models.generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 6478, in generate_content_stream
    yield from self._generate_content_stream(
  File "/usr/local/lib/python3.11/site-packages/google/genai/models.py", line 4875, in _generate_content_stream
    for response in self._api_client.request_streamed(
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1622, in request_streamed
    session_response = self._request(http_request, http_options, stream=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1398, in _request
    return self._retry(self._request_once, http_request, stream)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/genai/_api_client.py", line 1375, in _request_once
    errors.APIError.raise_for_response(response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 155, in raise_for_response
    cls.raise_error(response.status_code, response_json, response)
  File "/usr/local/lib/python3.11/site-packages/google/genai/errors.py", line 184, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 403 failure. {'status': 'failure', 'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'error': {'message': 'User gateway-tester-v2-d4ca6165-3 is not authorized to access model test-v2-vertex/gemini-3.1-flash-lite or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from google import genai
from google.genai import types

_endpoint = "https://internal.devtest.truefoundry.tech/api/llm"
_api_key = "***"
_full_model = "test-v2-vertex/google/gemini-3.1-flash-lite"

_parts = _full_model.split("/")
_provider_account = _parts[0]
_model_id = "/".join(_parts[1:])
if "/" in _model_id:
    _model_id = _model_id.rsplit("/", 1)[-1]

_base_url = f"{_endpoint}/gemini/{_provider_account}/proxy"

client = genai.Client(
    api_key=_api_key,
    http_options=types.HttpOptions(base_url=_base_url),
)

get_weather = types.FunctionDeclaration(
    name="get_weather",
    description="Get the current weather for a location.",
    parameters_json_schema={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city name, e.g. London",
            },
        },
        "required": ["location"],
    },
)

tool = types.Tool(function_declarations=[get_weather])

contents = [
    types.Content(role="user", parts=[types.Part.from_text(text="Hi")]),
    types.Content(role="model", parts=[types.Part.from_text(text="Hi, how can I help you")]),
    types.Content(role="user", parts=[types.Part.from_text(text="Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text.")]),
]

config = types.GenerateContentConfig(
    system_instruction="You are a helpful assistant with access to tools. You MUST strictly use the provided tools to answer. Never respond with plain text when a tool is available.",
    tools=[tool],
)

_chunks = []
for chunk in client.models.generate_content_stream(
    model=_model_id,
    contents=contents,
    config=config,
):
    _chunks.append(chunk)
    if chunk.candidates and chunk.candidates[0].content and chunk.candidates[0].content.parts:
        for part in chunk.candidates[0].content.parts:
            if part.function_call:
                print(f"Tool: {part.function_call.name}", flush=True)
                print(f"Args: {part.function_call.args}", flush=True)
            elif part.text:
                print(part.text, end="", flush=True)

_tool_use_detected = False
for _chunk in _chunks:
    if not _chunk.candidates or not _chunk.candidates[0].content:
        continue
    for _part in _chunk.candidates[0].content.parts or []:
        if _part.function_call:
            _tool_use_detected = True
            print(f"Tool: {_part.function_call.name}", flush=True)
            print(f"Args: {_part.function_call.args}", flush=True)
        elif _part.text:
            print(_part.text, end="", flush=True)

if not _tool_use_detected:
    raise Exception("VALIDATION FAILED: tool-call stream - no function calls in GenAI stream")
print("\nVALIDATION: tool-call stream SUCCESS")

Truefoundry Models Bot added 2 commits May 8, 2026 00:04

feat(google-vertex): add new models [bot]

542efd0

feat(google-vertex): update model YAMLs [bot]

877b9a7

cursor Bot reviewed May 8, 2026

Comment thread providers/google-vertex/google/gemini-3.1-flash-lite.yaml

models-bot[bot] wants to merge 2 commits into main from bot/add-google-vertex-20260508-000417
