
chore(pricing): Update vertex-ai pricing #641

Open
siddharthsambharia-portkey wants to merge 2 commits into main from pricing-update/vertex-ai-24242241947


🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 11 |
| 🔄 Models updated (merged) | 27 |

➕ New Models

  • gemini-2.0-flash-image-generation
  • veo-3.1-lite-generate-001
  • deepseek-ocr-maas
  • gpt-oss-20b
  • llama-4-scout-17b-16e-instruct-maas
  • gemini-2.5-pro-preview-06-05
  • gemini-2.5-pro-exp-03-25
  • gemini-2.0-flash-thinking-exp-01-21
  • gemini-2.0-pro-exp-02-05
  • imagen-4.0-ultra-generate-preview-06-02
  • textembedding-gecko-multilingual@002

🔄 Updated Models

  • gemini-3.1-pro-preview
  • gemini-3.1-flash-image-preview
  • gemini-3.1-flash-lite-preview
  • gemini-3-pro-image-preview
  • gemini-3-flash-preview
  • gemini-2.5-pro
  • veo-3.1-generate-001
  • veo-3.1-fast-generate-001
  • veo-3.0-generate-001
  • text-embedding-005
  • text-multilingual-embedding-002
  • textembedding-gecko@003
  • textembedding-gecko-multilingual@001
  • text-embedding-large-exp-03-07
  • gemini-2.5-pro-preview-05-06
  • gemini-2.0-flash-lite
  • gemini-2.5-flash-preview-04-17
  • gemini-2.5-flash-preview-05-20
  • gemini-2.5-flash-lite-preview-06-17
  • gemini-2.0-flash-exp
  • gemini-1.5-pro-001
  • gemini-1.5-pro-002
  • gemini-1.5-flash-001
  • gemini-1.5-flash-002
  • gemini-1.0-pro-001
  • gemini-1.0-pro-002
  • textembedding-gecko@001

Model-to-Pricing-Page Mapping

Google – Gemini (text/multimodal)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-3.1-pro-preview | Google – Gemini 3.1 Pro | API | Standard pricing ≤200K tokens |
| gemini-3.1-flash-image-preview | Google – Gemini 3.1 Flash Image | API | image_token $60/1M |
| gemini-3.1-flash-lite-preview | Google – Gemini 3.1 Flash Lite | API | |
| gemini-3-pro-preview | Google – Gemini 3 Pro | API | Standard pricing ≤200K tokens |
| gemini-3-pro-image-preview | Google – Gemini 3 Pro Image | API | image_token $120/1M |
| gemini-3-flash-preview | Google – Gemini 3 Flash | API | |
| gemini-2.5-pro | Google – Gemini 2.5 Pro | API | Standard pricing ≤200K tokens |
| gemini-2.5-computer-use-preview-10-2025 | Google – Gemini 2.5 Computer Use | API | Same pricing as 2.5 Pro; no cache/batch listed |
| gemini-2.5-flash | Google – Gemini 2.5 Flash | API | |
| gemini-2.5-flash-image | Google – Gemini 2.5 Flash Image | API | image_token $30/1M |
| gemini-2.5-flash-lite | Google – Gemini 2.5 Flash Lite | API | |
| gemini-2.5-flash-preview-09-2025 | Google – Gemini 2.5 Flash | API | Preview alias; same pricing as gemini-2.5-flash |
| gemini-2.5-flash-lite-preview-09-2025 | Google – Gemini 2.5 Flash Lite | API | Preview alias; same pricing as gemini-2.5-flash-lite |
| gemini-2.5-pro-preview-05-06 | Google – Gemini 2.5 Pro | API | Preview alias; same pricing as gemini-2.5-pro |
| gemini-2.5-pro-preview-06-05 | Google – Gemini 2.5 Pro | API | Preview alias; same pricing as gemini-2.5-pro |
| gemini-2.5-flash-preview-04-17 | Google – Gemini 2.5 Flash | API | Preview alias; same pricing as gemini-2.5-flash |
| gemini-2.5-flash-preview-05-20 | Google – Gemini 2.5 Flash | API | Preview alias; same pricing as gemini-2.5-flash |
| gemini-2.5-flash-lite-preview-06-17 | Google – Gemini 2.5 Flash Lite | API | Preview alias; same pricing as gemini-2.5-flash-lite |
| gemini-2.5-pro-exp-03-25 | Google – Gemini 2.5 Pro | API | Exp alias; same pricing as gemini-2.5-pro |
| gemini-2.0-flash-001 | Google – Gemini 2.0 Flash | API | |
| gemini-2.0-flash | Google – Gemini 2.0 Flash | API | Alias for gemini-2.0-flash-001 |
| gemini-2.0-flash-exp | Google – Gemini 2.0 Flash | API | Exp alias; same pricing as gemini-2.0-flash-001 |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 Flash Lite | API | |
| gemini-2.0-flash-lite | Google – Gemini 2.0 Flash Lite | API | Alias for gemini-2.0-flash-lite-001 |
| gemini-2.0-flash-image-generation | Google – Gemini 2.0 Flash Image | API | image_token $30/1M |
| gemini-2.0-flash-thinking-exp-01-21 | Google – Gemini 2.0 Flash Thinking | API – price not found | Experimental thinking; no pricing row found |
| gemini-2.0-pro-exp-02-05 | Google – Gemini 2.0 Pro Exp | API – price not found | Experimental; no pricing row found |
| gemini-1.5-pro-001 | Google – Gemini 1.5 Pro | API – price not found | Legacy; not on current pricing page |
| gemini-1.5-pro-002 | Google – Gemini 1.5 Pro | API – price not found | Legacy; not on current pricing page |
| gemini-1.5-flash-001 | Google – Gemini 1.5 Flash | API – price not found | Legacy; not on current pricing page |
| gemini-1.5-flash-002 | Google – Gemini 1.5 Flash | API – price not found | Legacy; not on current pricing page |
| gemini-1.0-pro-001 | Google – Gemini 1.0 Pro | API – price not found | Legacy; not on current pricing page |
| gemini-1.0-pro-002 | Google – Gemini 1.0 Pro | API – price not found | Legacy; not on current pricing page |
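
The alias notes in the table above amount to a lookup from an alias to the model whose pricing row it reuses. A minimal sketch of that lookup follows; the names `PRICING_SOURCE` and `pricing_source` are hypothetical and are not taken from the pricing repository. `gemini-2.5-computer-use-preview-10-2025` is left out because its note says cache/batch pricing is not listed, so it is not a plain alias.

```python
# Alias -> model whose pricing row is reused, transcribed from the Notes column.
# PRICING_SOURCE and pricing_source are illustrative names, not repository code.
PRICING_SOURCE = {
    "gemini-2.5-flash-preview-09-2025": "gemini-2.5-flash",
    "gemini-2.5-flash-lite-preview-09-2025": "gemini-2.5-flash-lite",
    "gemini-2.5-pro-preview-05-06": "gemini-2.5-pro",
    "gemini-2.5-pro-preview-06-05": "gemini-2.5-pro",
    "gemini-2.5-flash-preview-04-17": "gemini-2.5-flash",
    "gemini-2.5-flash-preview-05-20": "gemini-2.5-flash",
    "gemini-2.5-flash-lite-preview-06-17": "gemini-2.5-flash-lite",
    "gemini-2.5-pro-exp-03-25": "gemini-2.5-pro",
    "gemini-2.0-flash": "gemini-2.0-flash-001",
    "gemini-2.0-flash-exp": "gemini-2.0-flash-001",
    "gemini-2.0-flash-lite": "gemini-2.0-flash-lite-001",
}


def pricing_source(model_id: str) -> str:
    """Return the model ID whose pricing applies; non-aliases map to themselves."""
    return PRICING_SOURCE.get(model_id, model_id)


print(pricing_source("gemini-2.0-flash"))
print(pricing_source("gemini-2.5-pro"))
```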

Google – Imagen (per-image)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4 Ultra | API | $0.06/image |
| imagen-4.0-ultra-generate-preview-06-02 | Google – Imagen 4 Ultra | API | Preview alias; $0.06/image |
| imagen-4.0-generate-001 | Google – Imagen 4 | API | $0.04/image |
| imagen-4.0-fast-generate-001 | Google – Imagen 4 Fast | API | $0.02/image |
| imagen-3.0-generate-001 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-generate-002 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-fast-generate-001 | Google – Imagen 3 Fast | API | $0.02/image |
| imagen-3.0-capability-001 | Google – Imagen 3 (capability) | API | Shares pricing with imagen-3.0-generate; $0.04/image |
| imagen-3.0-capability-002 | Google – Imagen 3 (capability) | API | Shares pricing with imagen-3.0-generate; $0.04/image |

Google – Veo (per-second video)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | $0.40/sec (720p/1080p video+audio) |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | $0.10/sec (720p video+audio) |
| veo-3.1-lite-generate-001 | Google – Veo 3.1 Lite | API | $0.05/sec (720p video+audio) |
| veo-3.0-generate-001 | Google – Veo 3 | API | $0.40/sec (video+audio) |
| veo-3.0-fast-generate-001 | Google – Veo 3 Fast | API | $0.10/sec (720p video+audio) |
| veo-2.0-generate-001 | Google – Veo 2 | API | $0.50/sec |
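
Because Veo is billed per second of generated video, a clip's cost is the listed rate times its duration. The sketch below uses the rates from the table above; the 8-second duration and the names `VEO_USD_PER_SECOND` and `clip_cost_usd` are assumptions made for illustration only.

```python
from decimal import Decimal

# Per-second rates in USD, transcribed from the Veo table in this PR.
VEO_USD_PER_SECOND = {
    "veo-3.1-generate-001": Decimal("0.40"),
    "veo-3.1-fast-generate-001": Decimal("0.10"),
    "veo-3.1-lite-generate-001": Decimal("0.05"),
    "veo-3.0-generate-001": Decimal("0.40"),
    "veo-3.0-fast-generate-001": Decimal("0.10"),
    "veo-2.0-generate-001": Decimal("0.50"),
}


def clip_cost_usd(model_id: str, seconds: int) -> Decimal:
    """Cost of one generated clip: per-second rate times duration in seconds."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return VEO_USD_PER_SECOND[model_id] * seconds


# Hypothetical 8-second clip: 0.40 * 8 = 3.20 USD
print(clip_cost_usd("veo-3.1-generate-001", 8))
```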

Google – Embeddings

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-embedding-001 | Google – Gemini Embedding | API | $0.00015/1K tokens |
| gemini-embedding-2-preview | Google – Gemini Embedding 2 | API | $0.20/1M tokens text input |
| text-embedding-005 | Google – Text Embedding | API | $0.000025/1K chars |
| text-multilingual-embedding-002 | Google – Text Multilingual Embedding | API | $0.000025/1K chars |
| textembedding-gecko@003 | Google – Textembedding Gecko | API | $0.000025/1K chars (legacy) |
| textembedding-gecko@001 | Google – Textembedding Gecko | API | $0.000025/1K chars (legacy) |
| textembedding-gecko-multilingual@001 | Google – Textembedding Gecko Multilingual | API | $0.000025/1K chars |
| textembedding-gecko-multilingual@002 | Google – Textembedding Gecko Multilingual | API | $0.000025/1K chars |
| text-embedding-large-exp-03-07 | Google – Text Embedding Large (exp) | API | No dedicated row; uses text-embedding-005 rate |
| multimodalembedding@001 | Google – Multimodal Embedding | API | Per-image/video additional pricing |
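
The embedding rows mix units: per 1K tokens, per 1M tokens, and per 1K characters. Token and character rates are not interchangeable, but rates within the same unit can be put on a common per-1M scale for comparison. A small sketch of that arithmetic follows; the helper names and the 2,000,000-character workload are hypothetical.

```python
from decimal import Decimal


def per_1k_to_per_1m(usd_per_1k: Decimal) -> Decimal:
    """Rescale a per-1K rate to a per-1M rate (same unit, tokens or chars)."""
    return usd_per_1k * 1000


def cost_usd(units: int, usd_per_1k: Decimal) -> Decimal:
    """Cost of `units` tokens or characters at a per-1K rate."""
    return usd_per_1k * units / 1000


# gemini-embedding-001: $0.00015 per 1K tokens is $0.15 per 1M tokens
print(per_1k_to_per_1m(Decimal("0.00015")))

# text-embedding-005: $0.000025 per 1K chars is $0.025 per 1M chars
print(per_1k_to_per_1m(Decimal("0.000025")))

# Hypothetical workload of 2,000,000 characters at the text-embedding-005 rate
print(cost_usd(2_000_000, Decimal("0.000025")))
```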

Anthropic – Claude

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| claude-opus-4-6 | Anthropic – Claude | API | @default suffix stripped; $5/$25 |
| claude-sonnet-4-6 | Anthropic – Claude | API | @default suffix stripped; $3/$15 |
| claude-opus-4-1@20250805 | Anthropic – Claude | API | Pinned version; $15/$75 |
| claude-sonnet-4-5@20250929 | Anthropic – Claude | API | Pinned version; $3/$15 |
| claude-haiku-4-5@20251001 | Anthropic – Claude | API | Pinned version; $1/$5 |
| claude-opus-4-5@20251101 | Anthropic – Claude | API | Pinned version; $5/$25 |
| claude-opus-4@20250514 | Anthropic – Claude | API | Pinned version; $15/$75 |
| claude-sonnet-4@20250514 | Anthropic – Claude | API | Pinned version; $3/$15 |
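
Two conventions in the Notes column above can be sketched in code: dropping a trailing `@default` while keeping pinned `@YYYYMMDD` versions, and reading a price pair such as `$5/$25`. The PR does not state the unit of these pairs; the sketch assumes they mean USD per 1M input tokens / USD per 1M output tokens. All function names and the token counts are hypothetical.

```python
from decimal import Decimal


def strip_default_suffix(model_id: str) -> str:
    """Drop a trailing '@default'; pinned versions such as '@20250805' are kept."""
    return model_id.removesuffix("@default")  # requires Python 3.9+


def parse_price_pair(pair: str) -> tuple[Decimal, Decimal]:
    """Parse a note like '$5/$25' into (input_rate, output_rate)."""
    input_rate, output_rate = pair.split("/")
    return Decimal(input_rate.lstrip("$")), Decimal(output_rate.lstrip("$"))


def request_cost_usd(pair: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Request cost, ASSUMING both rates are USD per 1M tokens."""
    input_rate, output_rate = parse_price_pair(pair)
    return (input_rate * input_tokens + output_rate * output_tokens) / 1_000_000


print(strip_default_suffix("claude-opus-4-6@default"))
print(strip_default_suffix("claude-opus-4-1@20250805"))

# Hypothetical request: 10,000 input tokens and 2,000 output tokens at $5/$25
print(request_cost_usd("$5/$25", 10_000, 2_000))
```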

OpenAI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gpt-oss-120b-maas | OpenAI | API | $0.09/$0.36; batch included |
| gpt-oss-20b | OpenAI | API | $0.07/$0.25; cache hit + batch included |

Meta – Llama

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| llama-3.1-405b-instruct-maas | Meta – Llama | API | $5.00/$16.00 |
| llama-3.3-70b-instruct-maas | Meta – Llama | API | $0.72/$0.72; batch included |
| llama-4-scout-17b-16e-instruct-maas | Meta – Llama | API | $0.25/$0.70; batch included |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama | API | $0.35/$1.15; batch included |

Mistral AI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| mistral-small-2503 | Mistral AI | API | $0.10/$0.30 |
| mistral-medium-3 | Mistral AI | API | $0.40/$2.00 |
| codestral-2 | Mistral AI | API | $0.30/$0.90 |

DeepSeek

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| deepseek-r1-0528-maas | DeepSeek | API | $1.35/$5.40; batch included |
| deepseek-v3.1-maas | DeepSeek | API | $0.60/$1.70; cache hit + batch included |
| deepseek-v3.2-maas | DeepSeek | API | $0.56/$1.68; cache hit + batch included |
| deepseek-ocr-maas | DeepSeek | API | OCR model but is MaaS; $0.30/$1.20 |

Qwen

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| qwen3-235b-a22b-instruct-2507-maas | Qwen | API | $0.22/$0.88; batch included |
| qwen3-coder-480b-a35b-instruct-maas | Qwen | API | $0.22/$1.80; cache hit + batch included |
| qwen3-next-80b-a3b-instruct-maas | Qwen | API | $0.15/$1.20 |
| qwen3-next-80b-a3b-thinking-maas | Qwen | API | $0.15/$1.20 |

MiniMax

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| minimax-m2-maas | MiniMax | API | $0.30/$1.20; cache hit included |

Moonshot / Kimi

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| kimi-k2-thinking-maas | Moonshot / Kimi | API | $0.60/$2.50; cache hit included |

ZAI.org / GLM

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| glm-4.7-maas | ZAI.org / GLM | API | $0.60/$2.20 |
| glm-5-maas | ZAI.org / GLM | API | $1.00/$3.20; cache hit included |

Excluded Models (not added to pricing)

| Model | Publisher | Reason |
| --- | --- | --- |
| `*-live-*` models | Google | Gemini Live streaming; separate product |
| `lyria-*` | Google | Music generation; no inference endpoint |
| `model-optimizer-*` | Google | Dynamic routing meta-endpoint |
| `imagegeneration` | Google | Legacy; superseded by Imagen 3+ |
| `virtual-try-on-*` | Google | Retail-specific model |
| Gemma models | Google | Non-generative / self-deploy only |
| `chirp-2`, `chirp-3` | Google | Audio transcription (non-generative) |
| `owlvit-*`, `bart-*`, `bert-*`, etc. | Google | Non-generative ML / CV |
| `jamba-large-1.6` | AI21 | Self-deploy only (has_deploy: true, no -maas) |
| `llama-guard-*`, `prompt-guard-*` | Meta | Safety/guard models |
| `llama-*` (non-maas) | Meta | Self-deploy only |
| `mistral-ocr-2505` | Mistral | OCR model |
| `codestral-2501-self-deploy` | Mistral | Self-deploy only |
| `deepseek-*` (non-maas) | DeepSeek | Self-deploy only |
| `qwen-image` | Qwen | Explicit exception; excluded from Vertex AI pricing |
| `qwen-*` (non-maas) | Qwen | Self-deploy only |
| `glm-image` | ZAI.org | Explicit exception; excluded from Vertex AI pricing |
| `glm-ocr-*` | ZAI.org | OCR models |
| `glm-*` (non-maas) | ZAI.org | Self-deploy only |
| `minimax-*` (non-maas) | MiniMax | Self-deploy only |
| `kimi-*` (non-maas) | Moonshot | Self-deploy only |
| `whisper-*` | OpenAI | Audio transcription (non-generative) |
| `gpt-oss-*` (non-maas, non-listed) | OpenAI | Self-deploy only |
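
Most of the exclusion rules above are glob patterns, exact IDs, or a "non-maas" check, so they can be approximated with the standard-library `fnmatch` module. This is only a sketch under stated assumptions, not the Pricing Agent's actual logic: the constant and function names are hypothetical, the Gemma and `owlvit-*`/`bart-*`/`bert-*` rows are omitted because the table does not give complete patterns for them, and the `gpt-oss-*` rule is left out because it depends on an allow-list of listed models.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from the Excluded Models table.
EXCLUDED_PATTERNS = (
    "*-live-*", "lyria-*", "model-optimizer-*", "virtual-try-on-*",
    "llama-guard-*", "prompt-guard-*", "glm-ocr-*", "whisper-*",
)

# Exact model IDs copied from the Excluded Models table.
EXCLUDED_IDS = {
    "imagegeneration", "chirp-2", "chirp-3", "jamba-large-1.6",
    "mistral-ocr-2505", "codestral-2501-self-deploy", "qwen-image", "glm-image",
}

# Families where only IDs ending in "-maas" are priced ("non-maas" rows).
# "qwen" has no trailing hyphen so that IDs such as "qwen3-..." are covered.
MAAS_ONLY_PREFIXES = ("llama-", "deepseek-", "qwen", "glm-", "minimax-", "kimi-")


def is_excluded(model_id: str) -> bool:
    """Approximate the exclusion table: exact IDs, glob patterns, then non-maas."""
    if model_id in EXCLUDED_IDS:
        return True
    if any(fnmatchcase(model_id, pattern) for pattern in EXCLUDED_PATTERNS):
        return True
    return model_id.startswith(MAAS_ONLY_PREFIXES) and not model_id.endswith("-maas")


for model_id in ("deepseek-ocr-maas", "qwen-image", "glm-5-maas", "mistral-ocr-2505"):
    print(model_id, is_excluded(model_id))
```

Checking exact IDs and patterns before the suffix rule matters for guard models: an ID matching `llama-guard-*` stays excluded regardless of its suffix.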

Generated by Pricing Agent on 2026-04-10
