chore(pricing): Update vertex-ai pricing #628

Open
siddharthsambharia-portkey wants to merge 2 commits into main from
pricing-update/vertex-ai-24219607864

@siddharthsambharia-portkey
Collaborator

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
|---|---|
| ➕ Models added | 44 |
| 🔄 Models updated (merged) | 14 |

➕ New Models

  • gemini-2.5-pro-tts
  • gemini-2.5-flash-tts
  • gemini-2.5-pro-computer-use-preview
  • gemini-2.0-flash-image-generation
  • veo-3.1-lite-generate-001
  • gpt-oss-20b-maas
  • llama-4-scout-17b-16e-instruct-maas
  • llama-3.1-70b-instruct-maas
  • llama-3.1-8b-instruct-maas
  • llama-3-8b-instruct-maas
  • llama-3-70b-instruct-maas
  • llama-guard-3-8b-it-maas
  • llama-guard-3-1b-it-maas
  • mistral-large-2411
  • mistral-nemo-2407
  • mistral-large-2407
  • deepseek-v3-1-maas
  • deepseek-v3-2-maas
  • deepseek-r1-maas
  • deepseek-r1-zero-maas
  • ... and 24 more

🔄 Updated Models

  • gemini-2.5-pro
  • gemini-2.5-flash-lite-preview-06-17
  • gemini-2.0-flash-lite
  • gemini-3-pro-image-preview
  • gemini-3-flash-preview
  • gemini-3.1-pro-preview
  • gemini-3.1-flash-image-preview
  • gemini-3.1-flash-lite-preview
  • veo-3.1-generate-001
  • veo-3.0-generate-001
  • veo-3.0-fast-generate-001
  • gemini-embedding-2-preview
  • textembedding-gecko-multilingual@001
  • multimodalembedding@001

Model-to-Pricing-Page Mapping

| Model ID | Publisher / Section | Source | Notes |
|---|---|---|---|
| gemini-2.5-pro | Google – Gemini 2.5 | API | Standard ≤200K price; long-context >200K: $2.50 in/$15 out |
| gemini-2.5-flash | Google – Gemini 2.5 | API | Audio input $1.00/1M |
| gemini-2.5-flash-preview-09-2025 | Google – Gemini 2.5 | API | Preview alias for gemini-2.5-flash |
| gemini-2.5-flash-lite | Google – Gemini 2.5 | API | Audio input $0.30/1M |
| gemini-2.5-flash-lite-preview-06-17 | Google – Gemini 2.5 | API | Preview alias for gemini-2.5-flash-lite |
| gemini-2.5-flash-image | Google – Gemini 2.5 | API | Image output $30/1M tokens (batch $15/1M) |
| gemini-2.5-pro-tts | Google – Gemini 2.5 | API | TTS model; priced same as gemini-2.5-pro |
| gemini-2.5-flash-tts | Google – Gemini 2.5 | API | TTS model; priced same as gemini-2.5-flash |
| gemini-2.5-pro-computer-use-preview | Google – Gemini 2.5 | API | Computer Use preview; priced as gemini-2.5-pro |
| gemini-2.0-flash-001 | Google – Gemini 2.0 | API | Canonical 2.0 ID with -001; audio input $1.00/1M, video $3.00/1M |
| gemini-2.0-flash | Google – Gemini 2.0 | API | Alias for gemini-2.0-flash-001 |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 | API | Canonical 2.0 lite ID with -001 |
| gemini-2.0-flash-lite | Google – Gemini 2.0 | API | Alias for gemini-2.0-flash-lite-001 |
| gemini-2.0-flash-image-generation | Google – Gemini 2.0 | API | Image output $30/1M tokens; audio $1.00/1M, video $3.00/1M |
| gemini-3-pro-preview | Google – Gemini 3 | API | Long-context >200K: $4 in/$18 out |
| gemini-3-pro-image-preview | Google – Gemini 3 | API | Image output $120/1M tokens (batch $60/1M) |
| gemini-3-flash-preview | Google – Gemini 3 | API | Cache read $0.05/1M; web/maps search $14/1000 |
| gemini-3.1-pro-preview | Google – Gemini 3.1 | API | Long-context >200K: $4 in/$18 out |
| gemini-3.1-flash-image-preview | Google – Gemini 3.1 | API | Image output $60/1M tokens (batch $30/1M) |
| gemini-3.1-flash-lite-preview | Google – Gemini 3.1 | API | Cache read $0.03/1M |
| imagen-4.0-generate-001 | Google – Imagen 4 | API | $0.04/image |
| imagen-4.0-fast-generate-001 | Google – Imagen 4 Fast | API | $0.02/image |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4 Ultra | API | $0.06/image |
| imagen-3.0-generate-002 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-generate-001 | Google – Imagen 3 | API | $0.04/image (older version) |
| imagen-3.0-capability-001 | Google – Imagen 3 Capability | API | Maps to imagen-3.0-generate pricing; $0.04/image |
| imagen-3.0-capability-002 | Google – Imagen 3 Capability | API | Maps to imagen-3.0-generate pricing; $0.04/image |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | $0.40/sec (720p/1080p w/audio); video_seconds=40 cents |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | $0.15/sec (720p/1080p w/audio); video_seconds=15 cents |
| veo-3.1-lite-generate-001 | Google – Veo 3.1 Lite | API | $0.08/sec (1080p w/audio); video_seconds=8 cents |
| veo-3.0-generate-001 | Google – Veo 3 | API | $0.40/sec w/audio; video_seconds=40 cents |
| veo-3.0-fast-generate-001 | Google – Veo 3 Fast | API | $0.15/sec w/audio; video_seconds=15 cents |
| veo-2.0-generate-001 | Google – Veo 2 | API | $0.50/sec; video_seconds=50 cents |
| gemini-embedding-001 | Google – Gemini Embedding | API | $0.00015/1K tokens (online); batch $0.00012/1K |
| gemini-embedding-2-preview | Google – Gemini Embedding | API | Preview; priced same as gemini-embedding-001 |
| text-embedding-005 | Google – Text Embedding | API | $0.000025/1K chars; batch $0.00002/1K |
| text-embedding-large-exp-03-07 | Google – Text Embedding | API | Experimental; priced same as text-embedding-005 |
| text-multilingual-embedding-002 | Google – Text Embedding Multilingual | API | $0.000025/1K chars; batch $0.00002/1K |
| textembedding-gecko@003 | Google – Text Embedding (legacy) | API – price not found | Legacy; using text-embedding-005 rate |
| textembedding-gecko-multilingual@001 | Google – Text Embedding (legacy) | API – price not found | Legacy multilingual; using text-embedding rate |
| multimodalembedding@001 | Google – Multimodal Embedding | API | Per-image $0.002, video plus $0.002, standard $0.001, essential $0.0002 (cents) |
| claude-opus-4-6 | Anthropic – Claude | API | @default stripped; $5 in/$25 out; 5m cache write $6.25 |
| claude-sonnet-4-6 | Anthropic – Claude | API | @default stripped; $3 in/$15 out; 5m cache write $3.75 |
| claude-sonnet-4-5@20250929 | Anthropic – Claude | API | Pinned date kept; standard ≤200K pricing |
| claude-sonnet-4@20250514 | Anthropic – Claude | API | Pinned date kept; $3 in/$15 out |
| claude-opus-4-5@20251101 | Anthropic – Claude | API | Pinned date kept; $5 in/$25 out |
| claude-opus-4-1@20250805 | Anthropic – Claude | API | Pinned date kept; $15 in/$75 out |
| claude-opus-4@20250514 | Anthropic – Claude | API | Pinned date kept; $15 in/$75 out |
| claude-haiku-4-5@20251001 | Anthropic – Claude | API | Pinned date kept; $1 in/$5 out |
| gpt-oss-120b-maas | OpenAI | API | $0.09 in/$0.36 out; batch $0.045/$0.18 |
| gpt-oss-20b-maas | OpenAI | API – price not found | No pricing row found; added with price 0 |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama 4 | API | Llama 4 Maverick; $0.35 in/$1.15 out |
| llama-4-scout-17b-16e-instruct-maas | Meta – Llama 4 | API | Llama 4 Scout; $0.25 in/$0.70 out |
| llama-3.3-70b-instruct-maas | Meta – Llama 3.3 | API | $0.72 in/$0.72 out |
| llama-3.1-405b-instruct-maas | Meta – Llama 3.1 | API | $5.00 in/$16.00 out |
| llama-3.1-70b-instruct-maas | Meta – Llama 3.1 | API – price not found | No pricing row; added with price 0 |
| llama-3.1-8b-instruct-maas | Meta – Llama 3.1 | API – price not found | No pricing row; added with price 0 |
| llama-3-8b-instruct-maas | Meta – Llama 3 | API – price not found | No pricing row; added with price 0 |
| llama-3-70b-instruct-maas | Meta – Llama 3 | API – price not found | No pricing row; added with price 0 |
| llama-guard-3-8b-it-maas | Meta – Llama Guard | API – price not found | Guard model; included per policy; price 0 |
| llama-guard-3-1b-it-maas | Meta – Llama Guard | API – price not found | Guard model; included per policy; price 0 |
| mistral-medium-3 | Mistral | API | $0.40 in/$2.00 out |
| mistral-small-2503 | Mistral | API | Mistral Small 3.1; $0.10 in/$0.30 out |
| codestral-2 | Mistral | API | Codestral 2; $0.30 in/$0.90 out |
| mistral-large-2411 | Mistral | API – price not found | No pricing row; added with price 0 |
| mistral-nemo-2407 | Mistral | API – price not found | No pricing row; added with price 0 |
| mistral-large-2407 | Mistral | API – price not found | No pricing row; added with price 0 |
| deepseek-v3-1-maas | DeepSeek | API | DeepSeek-V3.1; $0.60 in/$1.70 out; cache hit $0.06 |
| deepseek-v3-2-maas | DeepSeek | API | DeepSeek-V3.2; $0.56 in/$1.68 out; cache hit $0.056 |
| deepseek-r1-0528-maas | DeepSeek | API | DeepSeek-R1-0528; $1.35 in/$5.40 out |
| deepseek-r1-maas | DeepSeek | API – price not found | No pricing row; added with price 0 |
| deepseek-r1-zero-maas | DeepSeek | API – price not found | No pricing row; added with price 0 |
| deepseek-v2-5-maas | DeepSeek | API – price not found | No pricing row; added with price 0 |
| deepseek-v3-maas | DeepSeek | API – price not found | No pricing row; added with price 0 |
| deepseek-v2-lite-chat-maas | DeepSeek | API – price not found | No pricing row; added with price 0 |
| deepseek-v2-chat-maas | DeepSeek | API – price not found | No pricing row; added with price 0 |
| deepseek-coder-v2-instruct-maas | DeepSeek | API – price not found | No pricing row; added with price 0 |
| qwen3-next-80b-a3b-thinking-maas | Qwen | API | Qwen3-Next-80B Thinking; $0.15 in/$1.20 out |
| qwen3-next-80b-a3b-instruct-maas | Qwen | API | Qwen3-Next-80B Instruct; $0.15 in/$1.20 out |
| qwen3-coder-480b-a35b-instruct-maas | Qwen | API | Qwen3-Coder-480B; $0.22 in/$1.80 out; cache hit $0.022 |
| qwen3-235b-a22b-instruct-2507-maas | Qwen | API | Qwen3-235B-A22B-2507; $0.22 in/$0.88 out |
| qwen2-5-72b-instruct-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qwen2-5-coder-32b-instruct-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qwen2-5-vl-72b-instruct-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qwen2-72b-instruct-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qwen2-5-14b-instruct-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qwq-32b-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qwen2-5-coder-7b-instruct-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qvq-72b-preview-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| qwen2-5-3b-instruct-maas | Qwen | API – price not found | No pricing row; added with price 0 |
| minimax-m2-maas | MiniMax | API | MiniMax-M2; $0.30 in/$1.20 out; cache hit $0.03 |
| minimax-text-01-maas | MiniMax | API – price not found | No pricing row; added with price 0 |
| kimi-k2-thinking-maas | Moonshot/Kimi | API | Kimi-K2-Thinking; $0.60 in/$2.50 out; cache hit $0.06 |
| moonshot-v1-8k-maas | Moonshot/Kimi | API – price not found | No pricing row; added with price 0 |
| moonshot-v1-32k-maas | Moonshot/Kimi | API – price not found | No pricing row; added with price 0 |
| glm-4-7-maas | ZAI.org/GLM | API | GLM-4.7; $0.60 in/$2.20 out |
| glm-5-maas | ZAI.org/GLM | API | GLM-5; $1.00 in/$3.20 out; cache hit $0.10 |
| glm-4-plus-maas | ZAI.org/GLM | API – price not found | No pricing row; added with price 0 |
| glm-4-0520-maas | ZAI.org/GLM | API – price not found | No pricing row; added with price 0 |
| glm-4-long-maas | ZAI.org/GLM | API – price not found | No pricing row; added with price 0 |
| glm-4-airx-maas | ZAI.org/GLM | API – price not found | No pricing row; added with price 0 |
| jamba-1-5-mini-maas | AI21 | API | Jamba 1.5 Mini; $0.20 in/$0.40 out |
| jamba-1-5-large-maas | AI21 | API | Jamba 1.5 Large; $2.00 in/$8.00 out |
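
For reviewers sanity-checking the numbers, here is a minimal sketch of how the rates in the Notes column translate into a per-request cost. It assumes token rates are quoted in USD per 1M tokens and that `video_seconds` is stored in cents per second, as the notes suggest. The helper names `token_cost_usd` and `video_cost_cents` are hypothetical and not part of this repository; the rates are copied from the gpt-oss-120b-maas and veo-3.1-generate-001 rows.

```python
from decimal import Decimal

# Rates copied from the gpt-oss-120b-maas row (assumed to be USD per 1M tokens).
GPT_OSS_120B = {"input": Decimal("0.09"), "output": Decimal("0.36")}

# Rate copied from the veo-3.1-generate-001 row (video_seconds=40 cents).
VEO_3_1_CENTS_PER_SECOND = 40


def token_cost_usd(rates, input_tokens, output_tokens):
    """Hypothetical helper: USD cost of one request given per-1M-token rates."""
    per_million = Decimal(1_000_000)
    total = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total / per_million


def video_cost_cents(cents_per_second, seconds):
    """Hypothetical helper: cost in cents of a generated clip billed per second."""
    return Decimal(cents_per_second) * seconds


if __name__ == "__main__":
    print(token_cost_usd(GPT_OSS_120B, 10_000, 2_000))
    print(video_cost_cents(VEO_3_1_CENTS_PER_SECOND, 8))
```

Under these assumptions, 10,000 input tokens and 2,000 output tokens on gpt-oss-120b-maas come to $0.00162, and an 8-second veo-3.1-generate-001 clip comes to 320 cents ($3.20). `Decimal` is used instead of floats so that small per-token rates do not pick up rounding noise.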

Excluded Models (with reasons)

| Model | Publisher | Reason |
|---|---|---|
| gemini-*-live-* | Google | Gemini Live streaming; separate product |
| lyria-* | Google | Music generation; no inference endpoint |
| model-optimizer-* | Google | Dynamic routing meta-endpoint |
| imagegeneration | Google | Legacy, superseded by imagen-3.0+ |
| virtual-try-on-* | Google | Product-specific retail model |
| shieldgemma2 | Google | Safety/guard model |
| gemma-*, codey-*, palm-* | Google | Non-generative or traditional ML |
| chirp-* | Google | Audio transcription; not generative inference |
| t5gemma-* | Google | Non-generative NLP |
| whisper-* | OpenAI | Audio transcription; not generative inference |
| clip-* | OpenAI | Non-generative vision classification |
| deepseek-ocr-maas | DeepSeek | OCR model; excluded per policy |
| glm-ocr-maas | ZAI.org | OCR model; excluded per policy |
| glm-image | ZAI.org | Image generation; explicit policy exclusion |
| qwen-image | Qwen | Image generation; explicit policy exclusion |
| sam3 | Meta | Non-generative segmentation model |
| mistral-ocr-* | Mistral | OCR model; excluded per policy |
| codestral-*-self-deploy | Mistral | Self-deploy (has_deploy: true, no -maas) |
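
The exclusion patterns above read as shell-style globs. As a rough sketch of how such a list could be applied (this is not necessarily how the Pricing Agent does it), the snippet below uses Python's standard `fnmatch` module. The names `EXCLUDE_PATTERNS` and `is_excluded` are hypothetical; the patterns themselves are copied from the table.

```python
from fnmatch import fnmatchcase

# Patterns copied from the "Excluded Models" table; treating them as
# shell-style globs is an assumption about how the agent interprets them.
EXCLUDE_PATTERNS = [
    "gemini-*-live-*", "lyria-*", "model-optimizer-*", "imagegeneration",
    "virtual-try-on-*", "shieldgemma2", "gemma-*", "codey-*", "palm-*",
    "chirp-*", "t5gemma-*", "whisper-*", "clip-*", "deepseek-ocr-maas",
    "glm-ocr-maas", "glm-image", "qwen-image", "sam3", "mistral-ocr-*",
    "codestral-*-self-deploy",
]


def is_excluded(model_id):
    """Hypothetical helper: True if model_id matches any exclusion pattern."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return any(fnmatchcase(model_id, pattern) for pattern in EXCLUDE_PATTERNS)


if __name__ == "__main__":
    # IDs taken from this PR description.
    for model_id in ["sam3", "glm-image", "codestral-2", "gemini-2.5-pro"]:
        print(model_id, is_excluded(model_id))
```

With these patterns, `sam3` and `glm-image` are filtered out while `codestral-2` and `gemini-2.5-pro` pass through, which matches the mapping table. Note that `codestral-*-self-deploy` does not catch `codestral-2`, because the glob requires the `-self-deploy` suffix.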

Publisher API Return Summary

| Publisher | API Call | Models Returned | Notes |
|---|---|---|---|
| google | get_vertex_models(publisher: "google") | 126 | Many excluded (live, lyria, non-generative, etc.) |
| anthropic | get_vertex_models(publisher: "anthropic") | 8 | All included |
| openai | get_vertex_models(publisher: "openai") | 5 | whisper/clip excluded |
| meta | get_vertex_models(publisher: "meta") | 21 | sam3, guard models excluded |
| ai21 | get_vertex_models(publisher: "ai21") | 1 → actually 2 pricing rows | jamba-1-5-mini + jamba-1-5-large |
| qwen | get_vertex_models(publisher: "qwen") | 14 | qwen-image excluded |
| mistral-ai | get_vertex_models(publisher: "mistral-ai") | 2 | Both self-deploy, excluded |
| mistralai | get_vertex_models(publisher: "mistralai") | 7 | Used instead of mistral-ai; ocr/self-deploy excluded |
| deepseek-ai | get_vertex_models(publisher: "deepseek-ai") | 10 | ocr excluded |
| deepseek | get_vertex_models(publisher: "deepseek") | 0 | |
| moonshotai | get_vertex_models(publisher: "moonshotai") | 3 | All included |
| minimaxai | get_vertex_models(publisher: "minimaxai") | 2 | All included |
| zai-org | get_vertex_models(publisher: "zai-org") | 7 | glm-image, glm-ocr excluded |

Generated by Pricing Agent on 2026-04-10
