
feat(together-ai): add new models [bot]#980

Merged
harshiv-26 merged 6 commits into main from bot/add-together-ai-20260508-000534
May 8, 2026
Conversation

@models-bot (Bot, Contributor) commented May 8, 2026

Auto-generated by model-addition-agent for provider together-ai.


Note: Low risk. Adds two new Together AI model YAML entries and marks one as deprecated; no runtime code or security-sensitive logic changes.

Overview
Adds two new Together AI model definition YAMLs for provisioned chat models: google/gemma-3-270m-it-lora (32k context, system_messages, zeroed token costs) and mistralai/Mixtral-8x7B-Instruct-v0.1-FP8-Lora (32k context, token pricing).

The Mixtral FP8 LoRA entry is flagged as deprecated with a deprecationDate of 2026-04-16.
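
Based on that description, the two entries are plausibly shaped like the sketch below. Only the model ids, the 32k context, the zeroed vs. token pricing, and `deprecationDate: 2026-04-16` come from the PR text; the field names (`mode`, `contextLength`, `cost`, etc.) are assumptions, not this repo's actual schema:

```yaml
# providers/together-ai/google/gemma-3-270m-it-lora.yaml (sketch; field names assumed)
model: google/gemma-3-270m-it-lora
mode: chat
contextLength: 32768          # 32k context, per the PR description
capabilities:
  - system_messages
cost:
  inputTokens: 0.0            # zeroed token costs
  outputTokens: 0.0
---
# providers/together-ai/mistralai/Mixtral-8x7B-Instruct-v0.1-FP8-Lora.yaml (sketch)
model: mistralai/Mixtral-8x7B-Instruct-v0.1-FP8-Lora
mode: chat
contextLength: 32768
cost:
  inputTokens: ...            # token pricing; values not given in the PR text
  outputTokens: ...
deprecationDate: 2026-04-16   # flagged deprecated, per the PR description
```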

Reviewed by Cursor Bugbot for commit c4d6f60.

Comment thread on providers/together-ai/google/gemma-3-270m-it-lora.yaml (Outdated)
github-actions Bot commented May 8, 2026

/test-models

@harshiv-26 (Collaborator) commented:

Gateway test results

  • Total: 3
  • Passed: 0
  • Failed: 2
  • Validation failed: 0
  • Errored: 0
  • Skipped: 1
  • Success rate: 0.0%
| Provider | Model | Scenarios |
| --- | --- | --- |
| together-ai | google/gemma-3-270m-it-lora | failure: params, params:stream |
| together-ai | mistralai/Mixtral-8x7B-Instruct-v0.1-FP8-Lora | skipped: skip-check |
Failures (2)

together-ai/google/gemma-3-270m-it-lora — params (failure)

Error:

```
Traceback (most recent call last):
  File "/tmp/tmp5mq8rin4/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model google/gemma-3-270m-it-lora. Please visit https://api.together.ai/models/google/gemma-3-270m-it-lora to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model google/gemma-3-270m-it-lora. Please visit https://api.together.ai/models/google/gemma-3-270m-it-lora to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
```
Code snippet:

```python
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/google-gemma-3-270m-it-lora",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=False,
)

print(response.choices[0].message.content)
```

together-ai/google/gemma-3-270m-it-lora — params:stream (failure)

Error:

```
Traceback (most recent call last):
  File "/tmp/tmpmfk58w14/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model google/gemma-3-270m-it-lora. Please visit https://api.together.ai/models/google/gemma-3-270m-it-lora to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model google/gemma-3-270m-it-lora. Please visit https://api.together.ai/models/google/gemma-3-270m-it-lora to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
```
Code snippet:

```python
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/google-gemma-3-270m-it-lora",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
```
Skipped (1)

together-ai/mistralai/Mixtral-8x7B-Instruct-v0.1-FP8-Lora — skip-check (skipped)

Skip reason:

unsupported mode 'unknown'
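
The two non-passing outcomes above follow a clear pattern: the provisioned Gemma model fails with Together AI's 400 "non-serverless model" error (a dedicated endpoint must be created and started before the gateway can reach it), while the Mixtral entry is skipped before any request because its declared mode is unrecognized. A hypothetical triage helper (not part of this repo; the set of supported modes is an assumption) that mirrors this classification:

```python
def classify_result(mode, error_message=None):
    """Classify a gateway test outcome, mirroring the cases in this run.

    `mode` is the model's declared mode; `error_message` is the provider
    error body, if any. The supported-mode list here is an assumption.
    """
    if mode not in ("chat", "completion", "embedding"):
        # The skip-check fires before any request is made.
        return "skipped: unsupported mode %r" % mode
    if error_message and "non-serverless model" in error_message:
        # Together AI returns HTTP 400 with this phrase when a provisioned
        # model has no running dedicated endpoint.
        return "failure: dedicated endpoint required"
    return "failure" if error_message else "passed"
```

For the run above, `classify_result("unknown")` reproduces the recorded skip reason, and the Gemma failures map to the dedicated-endpoint case.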

@harshiv-26 enabled auto-merge (squash) May 8, 2026 05:45
github-actions Bot commented May 8, 2026

/test-models

1 similar comment

@cursor (Bot) left a comment:
Cursor Bugbot has reviewed your changes and found 1 potential issue.

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 44214a8.

github-actions Bot commented May 8, 2026

/test-models

1 similar comment

@harshiv-26 merged commit 7c7b8d3 into main May 8, 2026
8 checks passed
@harshiv-26 deleted the bot/add-together-ai-20260508-000534 branch May 8, 2026 07:16
