chore(pricing): Update vertex-ai pricing #654

Open
siddharthsambharia-portkey wants to merge 2 commits into main from pricing-update/vertex-ai-24276476903
@siddharthsambharia-portkey
Collaborator

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 70 |
| 🔄 Models updated (merged) | 20 |

➕ New Models

  • gemini-2.5-pro-preview-06-05
  • gemini-2.0-flash-thinking-exp-01-21
  • gemini-2.0-flash-image-generation
  • gemini-1.5-flash-8b-001
  • gemini-3.1-pro
  • gemini-3.1-flash-image
  • gemini-3.1-flash-lite
  • gemini-3-pro
  • gemini-3-pro-image
  • gemini-3-flash
  • gemma-4-26b-a4b-it-maas
  • imagen-4.0-capability-001
  • imagen-2.0-generate-001
  • imagetext@001
  • veo-3.1-lite-generate-preview
  • veo-3-generate-001
  • veo-3-fast-generate-preview
  • veo-2-generate-001
  • gemini-embedding-exp-03-07
  • claude-opus-4-1
  • ... and 50 more

🔄 Updated Models

  • gemini-2.5-pro
  • gemini-2.5-flash-preview-05-20
  • gemini-2.5-pro-preview-05-06
  • gemini-2.5-flash-preview-04-17
  • gemini-2.0-flash-exp
  • gemini-1.5-pro-001
  • gemini-1.5-pro-002
  • gemini-1.5-flash-001
  • gemini-1.5-flash-002
  • gemini-1.0-pro-001
  • gemini-1.0-pro-002
  • gemini-1.0-pro-vision-001
  • veo-3.1-fast-generate-001
  • text-embedding-004
  • textembedding-gecko-multilingual@001
  • multimodalembedding@001
  • gemini-2.5-pro-preview-03-25
  • gemini-2.5-flash-lite-preview-06-17
  • gemini-2.0-flash-preview-image-generation
  • textembedding-gecko@001

Model-to-Pricing-Page Mapping

Google – Gemini (token pricing, $/1M tokens, listed as input/output)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.5-pro | Google – Gemini 2.5 | API | Standard table: $1.25/$10.00, cache $0.13; Flex: $0.625/$5.00; Search: $35/1K=3.5¢, Enterprise: $45/1K=4.5¢ |
| gemini-2.5-flash | Google – Gemini 2.5 | API | Standard: $0.30/$2.50, cache $0.03; Flex: $0.15/$1.25; Search 3.5¢/4.5¢ |
| gemini-2.5-flash-lite | Google – Gemini 2.5 | API | Standard: $0.10/$0.40, cache $0.01; Flex: $0.05/$0.20; Search 3.5¢/4.5¢ |
| gemini-2.5-flash-preview-05-20 | Google – Gemini 2.5 | API | Preview alias → Gemini 2.5 Flash pricing |
| gemini-2.5-pro-preview-05-06 | Google – Gemini 2.5 | API | Preview alias → Gemini 2.5 Pro pricing |
| gemini-2.5-pro-preview-06-05 | Google – Gemini 2.5 | API | Preview alias → Gemini 2.5 Pro pricing |
| gemini-2.5-flash-preview-04-17 | Google – Gemini 2.5 | API | Preview alias → Gemini 2.5 Flash pricing |
| gemini-2.5-flash-image | Google – Gemini 2.5 | API | Flash Image variant; image_token $30/1M |
| gemini-2.5-pro-exp-03-25 | Google – Gemini 2.5 | API | Exp alias → Gemini 2.5 Pro pricing |
| gemini-2.5-flash-exp-04-17 | Google – Gemini 2.5 | API | Exp alias → Gemini 2.5 Flash pricing |
| gemini-2.5-pro-preview-03-25 | Google – Gemini 2.5 | API | Preview alias → Gemini 2.5 Pro pricing |
| gemini-2.5-flash-lite-preview-06-17 | Google – Gemini 2.5 | API | Preview alias → Gemini 2.5 Flash Lite pricing |
| gemini-2.0-flash-001 | Google – Gemini 2.0 | API | Standard: $0.15/$0.60; Flex: $0.075/$0.30; Search 3.5¢/4.5¢ |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 | API | Standard: $0.075/$0.30; Flex: $0.0375/$0.15; Search 3.5¢/4.5¢ |
| gemini-2.0-flash-exp | Google – Gemini 2.0 | API | Exp alias → Gemini 2.0 Flash pricing |
| gemini-2.0-flash-thinking-exp-01-21 | Google – Gemini 2.0 | API | Thinking exp alias → Gemini 2.0 Flash pricing (no batch) |
| gemini-2.0-flash-image-generation | Google – Gemini 2.0 | API | Image generation variant; image_token $30/1M |
| gemini-2.0-flash-preview-image-generation | Google – Gemini 2.0 | API | Preview alias → Flash Image Generation pricing |
| gemini-2.0-pro-exp-02-05 | Google – Gemini 2.0 | API – price not found | No pricing row for 2.0 Pro; added with price 0 |
| gemini-1.5-pro-001 | Google – Gemini 1.5 | API – price not found | Retired/no current pricing row; price 0 |
| gemini-1.5-pro-002 | Google – Gemini 1.5 | API – price not found | Retired/no current pricing row; price 0 |
| gemini-1.5-flash-001 | Google – Gemini 1.5 | API – price not found | Retired/no current pricing row; price 0 |
| gemini-1.5-flash-002 | Google – Gemini 1.5 | API – price not found | Retired/no current pricing row; price 0 |
| gemini-1.5-flash-8b-001 | Google – Gemini 1.5 | API – price not found | Retired/no current pricing row; price 0 |
| gemini-1.0-pro-001 | Google – Gemini 1.0 | API – price not found | Legacy; price 0 |
| gemini-1.0-pro-002 | Google – Gemini 1.0 | API – price not found | Legacy; price 0 |
| gemini-1.0-pro-vision-001 | Google – Gemini 1.0 | API – price not found | Legacy; price 0 |
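
The alias rows above reuse a base model's rates. A minimal sketch of how a consumer might resolve an alias and price a request, assuming the paired figures are input/output USD per 1M tokens; the `BASE_RATES` and `ALIASES` dictionaries and the `request_cost_usd` helper are hypothetical names for illustration, not part of this PR's data files:

```python
from decimal import Decimal

# (input, output) in USD per 1M tokens, copied from the Standard figures above.
BASE_RATES = {
    "gemini-2.5-pro": (Decimal("1.25"), Decimal("10.00")),
    "gemini-2.5-flash": (Decimal("0.30"), Decimal("2.50")),
    "gemini-2.5-flash-lite": (Decimal("0.10"), Decimal("0.40")),
}

# A few of the preview aliases and the base model whose pricing they reuse.
ALIASES = {
    "gemini-2.5-pro-preview-06-05": "gemini-2.5-pro",
    "gemini-2.5-flash-preview-05-20": "gemini-2.5-flash",
    "gemini-2.5-flash-lite-preview-06-17": "gemini-2.5-flash-lite",
}


def request_cost_usd(model, input_tokens, output_tokens):
    """Resolve an alias to its base model, then apply the per-1M-token rates."""
    base = ALIASES.get(model, model)
    in_rate, out_rate = BASE_RATES[base]
    return (in_rate * input_tokens + out_rate * output_tokens) / Decimal(1_000_000)


if __name__ == "__main__":
    print(request_cost_usd("gemini-2.5-flash-preview-05-20", 2000, 500))
```

`Decimal` is used here only to avoid binary floating-point rounding in currency arithmetic; cache, Flex, and Search charges are left out of this sketch.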

Google – Gemini 3.x (token pricing, $/1M tokens, listed as input/output)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-3.1-pro | Google – Gemini 3.1 | API | Standard: $2.00/$12.00, cache $0.20; Flex: $1.00/$6.00; Search $14/1K=1.4¢ |
| gemini-3.1-flash-image | Google – Gemini 3.1 | API | Flash Image: $0.50/$3.00; image_token $60/1M; Search 1.4¢ |
| gemini-3.1-flash-lite | Google – Gemini 3.1 | API | Flash Lite: $0.25/$1.50, cache $0.03; Flex: $0.125/$0.75; Search 1.4¢ |
| gemini-3-pro | Google – Gemini 3 | API | Standard: $2.00/$12.00, cache $0.20; Flex: $1.00/$6.00; Search 1.4¢ |
| gemini-3-pro-image | Google – Gemini 3 | API | Pro Image: $2.00/$12.00; image_token $120/1M; Search 1.4¢ |
| gemini-3-flash | Google – Gemini 3 | API | Standard: $0.50/$3.00, cache $0.05; Flex: $0.25/$1.50; Search 1.4¢ |
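
In every row above that lists a Flex rate, the Flex figures are exactly half the Standard ones, and the Search figure is a per-1K price restated in cents per unit. The sketch below checks both relationships against the table; the dictionary layout and function names are hypothetical, and the Search billing unit (query versus request) is not stated in this PR:

```python
from decimal import Decimal

# (input, output) in USD per 1M tokens, copied from the table above.
STANDARD = {
    "gemini-3.1-pro": ("2.00", "12.00"),
    "gemini-3.1-flash-lite": ("0.25", "1.50"),
    "gemini-3-pro": ("2.00", "12.00"),
    "gemini-3-flash": ("0.50", "3.00"),
}
FLEX = {
    "gemini-3.1-pro": ("1.00", "6.00"),
    "gemini-3.1-flash-lite": ("0.125", "0.75"),
    "gemini-3-pro": ("1.00", "6.00"),
    "gemini-3-flash": ("0.25", "1.50"),
}


def flex_is_half_of_standard(model):
    """True when both Flex rates are exactly half the Standard rates."""
    std = [Decimal(x) for x in STANDARD[model]]
    flex = [Decimal(x) for x in FLEX[model]]
    return all(f * 2 == s for f, s in zip(flex, std))


def cents_per_unit(usd_per_1k_units):
    """Convert a USD-per-1K price into cents per single unit."""
    return Decimal(usd_per_1k_units) * 100 / 1000


if __name__ == "__main__":
    print(all(flex_is_half_of_standard(m) for m in STANDARD))
    print(cents_per_unit("14"))
```

A check like this is a cheap guard against transcription slips when rates are copied by hand from a pricing page.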

Google – Gemma

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemma-4-26b-a4b-it-maas | Google – Gemma 4 | API | $0.15/$0.60 (free until Apr 16, 2026 per page note) |

Google – Imagen (per-image)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| imagen-4.0-generate-001 | Google – Imagen 4 | API | $0.04/image |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4 Ultra | API | $0.06/image |
| imagen-4.0-fast-generate-001 | Google – Imagen 4 Fast | API | $0.02/image |
| imagen-4.0-capability-001 | Google – Imagen 4 | API | Capability model → uses Imagen 4.0 Generate pricing ($0.04/image) |
| imagen-4.0-generate-preview-05-20 | Google – Imagen 4 | API | Preview alias → $0.04/image |
| imagen-4.0-ultra-generate-exp-05-20 | Google – Imagen 4 Ultra | API | Exp alias → $0.06/image |
| imagen-3.0-generate-001 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-generate-002 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-fast-generate-001 | Google – Imagen 3 Fast | API | $0.02/image |
| imagen-3.0-capability-001 | Google – Imagen 3 | API | Capability model → uses Imagen 3.0 Generate pricing ($0.04/image) |
| imagen-2.0-generate-001 | Google – Imagen 2 | API | $0.02/image |
| imagetext@001 | Google – Imagen | API – price not found | Legacy visual Q&A model; no pricing row; price 0 |

Google – Veo (per-second video)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | Video only 720p/1080p: $0.20/sec → 20¢/sec; default 8s/1 sample |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | Video only 720p: $0.08/sec → 8¢/sec; default 8s/1 sample |
| veo-3.1-lite-generate-preview | Google – Veo 3.1 Lite | API | Video only 720p: $0.03/sec → 3¢/sec; default 8s/1 sample |
| veo-3-generate-001 | Google – Veo 3 | API | Video only: $0.20/sec → 20¢/sec; default 8s/1 sample |
| veo-3-fast-generate-preview | Google – Veo 3 Fast | API | Video only 720p: $0.08/sec → 8¢/sec; default 8s/1 sample |
| veo-2-generate-001 | Google – Veo 2 | API | $0.50/sec → 50¢/sec; default 8s/1 sample |
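
With per-second rates and the stated default of 8 seconds and 1 sample, the cost of a single generation is a straightforward product. A minimal sketch, assuming cost scales linearly with duration and sample count; `video_cost_usd` is a hypothetical helper, and resolutions or audio options not listed above are out of scope:

```python
from decimal import Decimal

# USD per second of generated video, copied from the table above.
VEO_USD_PER_SECOND = {
    "veo-3.1-generate-001": Decimal("0.20"),
    "veo-3.1-fast-generate-001": Decimal("0.08"),
    "veo-3.1-lite-generate-preview": Decimal("0.03"),
    "veo-3-generate-001": Decimal("0.20"),
    "veo-3-fast-generate-preview": Decimal("0.08"),
    "veo-2-generate-001": Decimal("0.50"),
}


def video_cost_usd(model, seconds=8, samples=1):
    """Cost of one request: per-second rate x duration x number of samples."""
    return VEO_USD_PER_SECOND[model] * seconds * samples


if __name__ == "__main__":
    for model in VEO_USD_PER_SECOND:
        print(model, video_cost_usd(model))
```

Under the defaults, `veo-3.1-generate-001` comes to $1.60 per request and `veo-2-generate-001` to $4.00.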

Google – Embeddings

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-embedding-001 | Google – Gemini Embedding | API | $0.00015/1K tokens |
| gemini-embedding-exp-03-07 | Google – Gemini Embedding | API | Exp alias → $0.00015/1K tokens (also listed as Gemini Embedding 2 Multimodal Preview at $0.20/1M text) |
| text-embedding-005 | Google – Text Embedding | API | $0.000025/1K chars |
| text-embedding-004 | Google – Text Embedding | API | $0.000025/1K chars |
| text-embedding-003 | Google – Text Embedding | API | $0.000025/1K chars |
| text-embedding-large-exp-03-07 | Google – Text Embedding | API | Experimental; used text-embedding-005 rate ($0.000025/1K) |
| textembedding-gecko@003 | Google – Text Embedding | API | Legacy gecko; $0.000025/1K chars |
| textembedding-gecko@002 | Google – Text Embedding | API | Legacy gecko; $0.000025/1K chars |
| textembedding-gecko@001 | Google – Text Embedding | API | Legacy gecko; $0.000025/1K chars |
| textembedding-gecko-multilingual@001 | Google – Text Embedding | API | Multilingual variant; $0.000025/1K chars |
| text-multilingual-embedding-002 | Google – Text Embedding | API | Multilingual; $0.000025/1K chars |
| multimodalembedding@001 | Google – Multimodal Embedding | API | Text $0.0002/1K chars; image $0.0001; video Plus $0.0020/sec, Standard $0.0010/sec, Essential $0.0005/sec |
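
The embedding rows quote prices per 1K tokens or per 1K characters, while the Gemini tables use $/1M. The sketch below shows the unit conversion and a simple cost calculation; both helper names are hypothetical, and the billing unit (tokens or characters) is whichever the row states:

```python
from decimal import Decimal


def per_1k_to_per_1m(usd_per_1k):
    """Restate a USD-per-1K-units price as USD per 1M units."""
    return Decimal(usd_per_1k) * 1000


def embedding_cost_usd(usd_per_1k, units):
    """Cost of embedding `units` tokens or characters at a per-1K rate."""
    return Decimal(usd_per_1k) * units / 1000


if __name__ == "__main__":
    # gemini-embedding-001: $0.00015 per 1K tokens
    print(per_1k_to_per_1m("0.00015"))
    # text-embedding-005: $0.000025 per 1K characters, 40,000 characters
    print(embedding_cost_usd("0.000025", 40_000))
```

Restated per 1M, the token rate for `gemini-embedding-001` is $0.15 and the character rate for the text-embedding models is $0.025.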

Anthropic – Claude

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| claude-opus-4-6 | Anthropic – Claude | API | $5/$25; cache_write $6.25 (5m), cache_read $0.50 |
| claude-opus-4-5 | Anthropic – Claude | API | $5/$25; cache_write $6.25, cache_read $0.50 |
| claude-sonnet-4-6 | Anthropic – Claude | API | $3/$15; cache_write $3.75, cache_read $0.30 |
| claude-sonnet-4-5 | Anthropic – Claude | API | $3/$15; cache_write $3.75, cache_read $0.30; long-context tier noted |
| claude-haiku-4-5 | Anthropic – Claude | API | $1/$5; cache_write $1.25, cache_read $0.10 |
| claude-opus-4-1 | Anthropic – Claude | API | $15/$75; cache_write $18.75, cache_read $1.50 |
| claude-opus-4 | Anthropic – Claude | API | $15/$75; cache_write $18.75, cache_read $1.50 |
| claude-sonnet-4 | Anthropic – Claude | API | $3/$15; cache_write $3.75, cache_read $0.30 |
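
Across the rows above, cache_write is 1.25x the input price and cache_read is 0.1x the input price. This is an observation drawn from the listed figures, not a rule stated in the PR. A small sketch that verifies it, using a hypothetical `CLAUDE` dictionary and `cache_multipliers` helper:

```python
from decimal import Decimal

# model: (input, output, cache_write, cache_read), copied from the table above.
CLAUDE = {
    "claude-opus-4-6": ("5", "25", "6.25", "0.50"),
    "claude-opus-4-5": ("5", "25", "6.25", "0.50"),
    "claude-sonnet-4-6": ("3", "15", "3.75", "0.30"),
    "claude-sonnet-4-5": ("3", "15", "3.75", "0.30"),
    "claude-haiku-4-5": ("1", "5", "1.25", "0.10"),
    "claude-opus-4-1": ("15", "75", "18.75", "1.50"),
    "claude-opus-4": ("15", "75", "18.75", "1.50"),
    "claude-sonnet-4": ("3", "15", "3.75", "0.30"),
}


def cache_multipliers(model):
    """Return (cache_write / input, cache_read / input) for one model."""
    inp, _out, write, read = (Decimal(x) for x in CLAUDE[model])
    return write / inp, read / inp


if __name__ == "__main__":
    for model in CLAUDE:
        print(model, cache_multipliers(model))
```

A row whose multipliers deviate from the others would be a candidate for a second look against the pricing page.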

Meta – Llama

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama 4 | API | $0.35/$1.15 |
| llama-4-scout-17b-16e-instruct-maas | Meta – Llama 4 | API | $0.25/$0.70 |
| llama-3.3-70b-instruct-maas | Meta – Llama 3.3 | API | $0.72/$0.72 |
| llama-3.1-405b-instruct-maas | Meta – Llama 3.1 | API | $5.00/$16.00 |
| llama-3.1-70b-instruct-maas | Meta – Llama 3.1 | API – price not found | No pricing row; price 0 |
| llama-3.1-8b-instruct-maas | Meta – Llama 3.1 | API – price not found | No pricing row; price 0 |
| llama-3-405b-instruct-maas | Meta – Llama 3 | API – price not found | Legacy; price 0 |
| llama-3-70b-instruct-maas | Meta – Llama 3 | API – price not found | Legacy; price 0 |
| llama-3-8b-instruct-maas | Meta – Llama 3 | API – price not found | Legacy; price 0 |
| llama-2-70b-hf | Meta – Llama 2 | API – price not found | Legacy; price 0 |
| llama-2-13b-hf | Meta – Llama 2 | API – price not found | Legacy; price 0 |
| llama-2-7b-hf | Meta – Llama 2 | API – price not found | Legacy; price 0 |
| codellama-34b-hf | Meta – Code Llama | API – price not found | Legacy; price 0 |
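
In rows marked "price not found", a price of 0 is a placeholder for a missing pricing row rather than a statement that the model is free. A reviewer or downstream consumer may want to list those entries explicitly. The sketch below does so over a hypothetical row layout (model ID, input price, output price, source); it does not reflect the actual file schema in this repository:

```python
# Hypothetical row layout: (model_id, input_price, output_price, source).
ROWS = [
    ("llama-4-maverick-17b-128e-instruct-maas", "0.35", "1.15", "API"),
    ("llama-3.3-70b-instruct-maas", "0.72", "0.72", "API"),
    ("llama-3.1-70b-instruct-maas", "0", "0", "API – price not found"),
    ("llama-2-7b-hf", "0", "0", "API – price not found"),
]


def unpriced_models(rows):
    """Return model IDs whose source marks the price as not found."""
    return [model for model, _inp, _out, source in rows
            if "price not found" in source]


if __name__ == "__main__":
    print(unpriced_models(ROWS))
```

Keying on the source marker rather than on the zero value keeps genuinely free entries, if any exist, from being flagged.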

Mistral AI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| mistral-small-2503 | Mistral AI | API | $0.10/$0.30 |
| mistral-medium-3 | Mistral AI | API | $0.40/$2.00 |
| codestral-2501-maas | Mistral AI | API | $0.30/$0.90 |
| mistral-large-2411-maas | Mistral AI | API – price not found | No current pricing row; price 0 |
| pixtral-12b-2409 | Mistral AI | API – price not found | No pricing row; price 0 |
| pixtral-large-2411-maas | Mistral AI | API – price not found | No pricing row; price 0 |
| mistral-nemo-2407 | Mistral AI | API – price not found | No pricing row; price 0 |
| mistral-7b-instruct-maas | Mistral AI | API – price not found | Legacy; price 0 |
| mixtral-8x7b-instruct-v0.1-maas | Mistral AI | API – price not found | Legacy; price 0 |
| mixtral-8x22b-instruct-v0.1-maas | Mistral AI | API – price not found | No pricing row; price 0 |

DeepSeek

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| deepseek-v3.2-maas | DeepSeek | API | $0.56/$1.68; cache_read $0.056 |
| deepseek-v3.1-maas | DeepSeek | API | $0.60/$1.70; cache_read $0.06 |
| deepseek-r1-0528-maas | DeepSeek | API | $1.35/$5.40 |
| deepseek-r1-maas | DeepSeek | API | $1.35/$5.40 |
| deepseek-v2.5-maas | DeepSeek | API – price not found | Legacy; price 0 |
| deepseek-v2-chat-maas | DeepSeek | API – price not found | Legacy; price 0 |
| deepseek-coder-v2-instruct-maas | DeepSeek | API – price not found | Legacy; price 0 |
| deepseek-r1-distill-llama-70b-maas | DeepSeek | API – price not found | Distill variant; no row; price 0 |
| deepseek-r1-distill-qwen-32b-maas | DeepSeek | API – price not found | Distill variant; no row; price 0 |
| deepseek-r1-distill-qwen-14b-maas | DeepSeek | API – price not found | Distill variant; no row; price 0 |

Qwen

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| qwen3-235b-a22b-instruct-2507-maas | Qwen | API | $0.22/$0.88 |
| qwen3-coder-480b-a35b-instruct-maas | Qwen | API | $0.22/$1.80; cache_read $0.022 |
| qwen3-next-80b-thinking-maas | Qwen | API | $0.15/$1.20 |
| qwen3-next-80b-instruct-maas | Qwen | API | $0.15/$1.20 |
| qwen3-30b-a3b-instruct-maas | Qwen | API – price not found | No pricing row; price 0 |
| qwen3-32b-instruct-maas | Qwen | API – price not found | No pricing row; price 0 |
| qwen2-5-72b-instruct-maas | Qwen | API – price not found | Legacy; price 0 |
| qwen2-5-32b-instruct-maas | Qwen | API – price not found | Legacy; price 0 |
| qwen2-5-7b-instruct-maas | Qwen | API – price not found | Legacy; price 0 |
| qwen2-5-coder-32b-instruct-maas | Qwen | API – price not found | Legacy; price 0 |

ZAI.org – GLM

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| glm-4.7-maas | ZAI.org – GLM | API | $0.60/$2.20 |
| glm-5-maas | ZAI.org – GLM | API | $1.00/$3.20; cache_read $0.10 |
| glm-4v-9b | ZAI.org – GLM | API – price not found | No pricing row; price 0 |
| glm-4-9b | ZAI.org – GLM | API – price not found | No pricing row; price 0 |
| glm-4-air | ZAI.org – GLM | API – price not found | No pricing row; price 0 |
| glm-4-airx | ZAI.org – GLM | API – price not found | No pricing row; price 0 |

MiniMax

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| minimax-m2-maas | MiniMax | API | $0.30/$1.20; cache_read $0.03 |
| minimax-text-01-maas | MiniMax | API – price not found | No pricing row; price 0 |

Moonshot / Kimi

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| kimi-k2-thinking-maas | Moonshot – Kimi | API | $0.60/$2.50; cache_read $0.06 |
| kimi-k2-maas | Moonshot – Kimi | API – price not found | No dedicated row; price 0 |
| moonshot-v1-maas | Moonshot – Kimi | API – price not found | Legacy; price 0 |

OpenAI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gpt-oss-120b-maas | OpenAI – GPT OSS | API | $0.09/$0.36 |
| gpt-oss-20b-maas | OpenAI – GPT OSS | API | $0.07/$0.25; cache_read $0.007 |

AI21

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| jamba-large-1.6 | AI21 – Jamba | API – price not found | No pricing row found on Vertex page; price 0 |
| jamba-1.5-large | AI21 – Jamba | API – price not found | Legacy; price 0 |
| jamba-1.5-mini | AI21 – Jamba | API – price not found | Legacy; price 0 |

Excluded Models (not added)

| Pattern | Publisher | Reason |
| --- | --- | --- |
| *-live-* | Google | Gemini Live streaming; separate product |
| lyria-* | Google | Music generation; no generative AI pricing endpoint |
| model-optimizer-* | Google | Dynamic routing meta-endpoint |
| imagegeneration (legacy) | Google | Superseded by Imagen 3.0+ |
| virtual-try-on-* | Google | Retail product-specific model |
| glm-image | ZAI.org | Explicitly excluded per policy |
| qwen-image | Qwen | Explicitly excluded per policy |
| *-self-deploy (no -maas) | All | Self-deploy: customer-managed infra, not MaaS |
| mistral-ocr-* | Mistral | OCR model excluded per policy |
| whisper-* | OpenAI | Audio transcription, not generative LLM |
| Gemma, Codey, PaLM, CV/NLP models | Google | Non-generative or fine-tuning-only |
| llama-guard-* | Meta | Safety/guard model |
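
The wildcard patterns above read like shell-style globs. A minimal sketch of how such an exclusion filter could be applied, assuming glob semantics and using Python's standard `fnmatch` module; the sample model IDs in the usage lines are hypothetical, and the prose-only rows (legacy `imagegeneration`, self-deploy, the Gemma/Codey/PaLM family) are not modeled:

```python
from fnmatch import fnmatchcase

# Glob-style and exact-name exclusions taken from the table above.
EXCLUDED_PATTERNS = [
    "*-live-*",
    "lyria-*",
    "model-optimizer-*",
    "virtual-try-on-*",
    "glm-image",
    "qwen-image",
    "mistral-ocr-*",
    "whisper-*",
    "llama-guard-*",
]


def is_excluded(model_id):
    """True when the model ID matches any exclusion pattern (case-sensitive)."""
    return any(fnmatchcase(model_id, pattern) for pattern in EXCLUDED_PATTERNS)


if __name__ == "__main__":
    # Hypothetical IDs, used only to exercise the patterns.
    for model_id in ["lyria-002", "gemini-2.0-flash-live-preview", "gemini-2.5-pro"]:
        print(model_id, is_excluded(model_id))
```

`fnmatchcase` is used instead of `fnmatch` so matching stays case-sensitive on every platform.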

Generated by Pricing Agent on 2026-04-11
