chore(pricing): Update vertex-ai pricing by siddharthsambharia-portkey · Pull Request #625 · Portkey-AI/models

siddharthsambharia-portkey · 2026-04-09T18:27:04Z

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

Change Type	Count
➕ Models added	4
🔄 Models updated (merged)	10

➕ New Models

gemini-2.5-pro-tts
gemini-2.5-flash-tts
veo-3.1-lite-generate-001
translate-llm

🔄 Updated Models

gemini-3.1-pro-preview
gemini-3.1-flash-image-preview
gemini-3.1-flash-lite-preview
gemini-3-pro-image-preview
gemini-3-flash-preview
gemini-2.5-pro
textembedding-gecko@001
textembedding-gecko-multilingual@001
multimodalembedding@001
veo-3.1-fast-generate-001
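
The added/updated split in the summary can be pictured as a dictionary diff. The sketch below is illustrative only: `diff_pricing`, the flat `{"input": ...}` entries, and the merge semantics are assumptions, not the Pricing Agent's actual `complete_diff` implementation or the repository's JSON schema.

```python
from typing import Dict, List, Tuple

# Hypothetical shape: model ID -> flat dict of price fields.
Pricing = Dict[str, dict]


def diff_pricing(existing: Pricing, incoming: Pricing) -> Tuple[List[str], List[str]]:
    """Return (added, updated) model IDs.

    A model is "added" if it is absent from the existing file, and "updated"
    if merging the incoming fields over the existing entry changes that entry.
    """
    added = sorted(m for m in incoming if m not in existing)
    updated = sorted(
        m for m in incoming
        if m in existing and {**existing[m], **incoming[m]} != existing[m]
    )
    return added, updated


# Toy values for illustration; these are not the prices in this PR.
existing = {"model-a": {"input": 1.0}, "model-b": {"input": 0.5}}
incoming = {"model-a": {"input": 1.25}, "model-b": {"input": 0.5}, "model-c": {"input": 0.0}}
added, updated = diff_pricing(existing, incoming)
```

Under these assumptions, an unchanged entry such as `model-b` is counted in neither list.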

Model → Pricing Page Mapping

Model ID	Publisher / Section	Source	Notes (USD; input/output per 1M tokens unless noted)
`gemini-3.1-pro-preview`	Google – Gemini 3.1 Pro	API	Standard $2/$12/1M, batch $1/$6, cache_read $0.20, web_search $14/1000
`gemini-3.1-flash-image-preview`	Google – Gemini 3.1 Flash Image	API	Standard $0.50/$3/1M, batch $0.25/$1.50, image_token $60/1M
`gemini-3.1-flash-lite-preview`	Google – Gemini 3.1 Flash Lite	API	Standard $0.25/$1.50/1M, batch $0.13/$0.75, cache_read $0.03
`gemini-3-pro-preview`	Google – Gemini 3 Pro	API	Standard $2/$12/1M, batch $1/$6, cache_read $0.20
`gemini-3-pro-image-preview`	Google – Gemini 3 Pro Image	API	Standard $2/$12/1M, batch $1/$6, image_token $120/1M
`gemini-3-flash-preview`	Google – Gemini 3 Flash	API	Standard $0.50/$3/1M, batch $0.25/$1.50, cache_read $0.05
`gemini-2.5-pro`	Google – Gemini 2.5 Pro	API	Standard $1.25/$10/1M (≤200K), batch $0.625/$5, cache_read $0.13
`gemini-2.5-flash`	Google – Gemini 2.5 Flash	API	Standard $0.30/$2.50/1M, batch $0.15/$1.25, cache_read $0.03
`gemini-2.5-flash-image`	Google – Gemini 2.5 Flash Image	API	Standard $0.30/$2.50/1M, batch $0.15/$1.25, image_token $30/1M
`gemini-2.5-flash-lite`	Google – Gemini 2.5 Flash Lite	API	Standard $0.10/$0.40/1M, batch $0.05/$0.20, cache_read $0.01
`gemini-2.5-computer-use-preview-10-2025`	Google – Gemini 2.5 Pro (Computer Use)	API	Same pricing as Gemini 2.5 Pro, no batch
`gemini-2.5-pro-tts`	Google – Gemini (TTS)	API – price not found	No dedicated TTS row on pricing page; added with price 0
`gemini-2.5-flash-tts`	Google – Gemini (TTS)	API – price not found	No dedicated TTS row on pricing page; added with price 0
`gemini-2.5-flash-preview-09-2025`	Google – Gemini 2.5 Flash	API	Preview alias for gemini-2.5-flash; same pricing
`gemini-2.5-flash-lite-preview-09-2025`	Google – Gemini 2.5 Flash Lite	API	Preview alias for gemini-2.5-flash-lite; same pricing
`gemini-2.0-flash-001`	Google – Gemini 2.0 Flash	API	Standard $0.15/$0.60/1M, batch $0.075/$0.30
`gemini-2.0-flash-lite-001`	Google – Gemini 2.0 Flash Lite	API	Standard $0.075/$0.30/1M, batch $0.0375/$0.15
`gemini-embedding-001`	Google – Embedding	API	$0.00015/1K tokens (online)
`text-embedding-005`	Google – Embedding	API	$0.000025/1K chars (online)
`text-multilingual-embedding-002`	Google – Embedding	API	$0.000025/1K chars; same as text-embedding-005
`textembedding-gecko@001`	Google – Embedding (legacy)	API	Legacy; same pricing as text-embedding-005
`textembedding-gecko@003`	Google – Embedding (legacy)	API	Legacy; same pricing as text-embedding-005
`textembedding-gecko-multilingual@001`	Google – Embedding (legacy)	API	Legacy multilingual; same pricing as text-embedding-005
`text-embedding-large-exp-03-07`	Google – Embedding (experimental)	API	Experimental; no dedicated row; uses text-embedding-005 pricing
`multimodalembedding@001`	Google – Multimodal Embedding	API	Text $0.0002/1K chars; image $0.0001/img; video plus $0.0020/s, standard $0.0010/s, essential $0.0005/s
`gemini-embedding-2-preview`	Google – Gemini Embedding 2 (multimodal, preview)	API	Text $0.20/1M tokens; image $0.00012/img; video $0.00079/img; audio $0.00016/s
`imagen-3.0-generate-002`	Google – Imagen 3 Generate	API	$0.04/image
`imagen-4.0-generate-001`	Google – Imagen 4.0 Generate	API	$0.04/image
`imagen-4.0-fast-generate-001`	Google – Imagen 4.0 Fast Generate	API	$0.02/image
`imagen-4.0-ultra-generate-001`	Google – Imagen 4.0 Ultra Generate	API	$0.06/image
`imagen-3.0-capability-001`	Google – Imagen 3 Capability	API	Maps to imagen-3.0-generate-002 pricing; $0.04/image
`imagen-3.0-capability-002`	Google – Imagen 3 Capability	API	Maps to imagen-3.0-generate-002 pricing; $0.04/image
`veo-2.0-generate-001`	Google – Veo 2.0	API	$0.50/s, default 8s, 1 sample
`veo-3.0-generate-001`	Google – Veo 3.0	API	$0.20/s, default 8s, 1 sample
`veo-3.0-fast-generate-001`	Google – Veo 3.0 Fast	API	$0.10/s, default 8s, 1 sample
`veo-3.1-generate-001`	Google – Veo 3.1	API	$0.20/s, default 8s, 1 sample
`veo-3.1-fast-generate-001`	Google – Veo 3.1 Fast	API	$0.10/s, default 8s, 1 sample
`veo-3.1-lite-generate-001`	Google – Veo 3.1 Lite	API	$0.03/s, default 8s, 1 sample
`translate-llm`	Google – Translation	API – price not found	Translation model; no row on generative AI pricing page; added with price 0
`claude-opus-4-6`	Anthropic – Claude Opus 4.6	API	$5/$25/1M; cache_write_5m $6.25, cache_read $0.50; @default stripped
`claude-sonnet-4-6`	Anthropic – Claude Sonnet 4.6	API	$3/$15/1M; cache_write_5m $3.75, cache_read $0.30; @default stripped
`claude-opus-4-5@20251101`	Anthropic – Claude Opus 4.5	API	$5/$25/1M; cache_write_5m $6.25, cache_read $0.50
`claude-sonnet-4-5@20250929`	Anthropic – Claude Sonnet 4.5	API	$3/$15/1M (≤200K); cache_write_5m $3.75, cache_read $0.30
`claude-haiku-4-5@20251001`	Anthropic – Claude Haiku 4.5	API	$1/$5/1M; cache_write_5m $1.25, cache_read $0.10
`claude-opus-4-1@20250805`	Anthropic – Claude Opus 4.1	API	$15/$75/1M; cache_write_5m $18.75, cache_read $1.50
`claude-opus-4@20250514`	Anthropic – Claude Opus 4	API	$15/$75/1M; cache_write_5m $18.75, cache_read $1.50
`claude-sonnet-4@20250514`	Anthropic – Claude Sonnet 4	API	$3/$15/1M; cache_write_5m $3.75, cache_read $0.30
`llama-3.3-70b-instruct-maas`	Meta – Llama 3.3 70B	API	$0.72/$0.72/1M
`llama-4-maverick-17b-128e-instruct-maas`	Meta – Llama 4 Maverick	API	$0.35/$1.15/1M
`mistral-small-2503`	Mistral – Mistral Small 3.1	API	$0.10/$0.30/1M
`mistral-medium-3`	Mistral – Mistral Medium 3	API	$0.40/$2.00/1M
`codestral-2`	Mistral – Codestral 2	API	$0.30/$0.90/1M
`deepseek-r1-0528-maas`	DeepSeek – DeepSeek-R1 0528	API	$1.35/$5.40/1M
`deepseek-v3.1-maas`	DeepSeek – DeepSeek-V3.1	API	$0.60/$1.70/1M; cache_read $0.06
`deepseek-v3.2-maas`	DeepSeek – DeepSeek-V3.2	API	$0.56/$1.68/1M; cache_read $0.056
`qwen3-235b-a22b-instruct-2507-maas`	Qwen – Qwen3-235B-A22B-Instruct-2507	API	$0.22/$0.88/1M
`qwen3-coder-480b-a35b-instruct-maas`	Qwen – Qwen3-Coder-480B	API	$0.22/$1.80/1M; cache_read $0.022
`qwen3-next-80b-a3b-instruct-maas`	Qwen – Qwen3-Next-80B-Instruct	API	$0.15/$1.20/1M
`qwen3-next-80b-a3b-thinking-maas`	Qwen – Qwen3-Next-80B-Thinking	API	$0.15/$1.20/1M
`kimi-k2-thinking-maas`	Kimi/Moonshot – Kimi-K2-Thinking	API	$0.60/$2.50/1M; cache_read $0.06
`minimax-m2-maas`	MiniMax – MiniMax-M2	API	$0.30/$1.20/1M; cache_read $0.03
`glm-4.7-maas`	ZAI.org/GLM – GLM-4.7	API	$0.60/$2.20/1M
`glm-5-maas`	ZAI.org/GLM – GLM-5	API	$1.00/$3.20/1M; cache_read $0.10; free until Feb 19, 2026 per page note
`gpt-oss-120b-maas`	OpenAI – gpt-oss-120b	API	$0.09/$0.36/1M
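
For reviewers checking the numbers, here is a minimal sketch of how the rates in the Notes column translate into a request cost. The `TokenPrice` class and the helper names are hypothetical and do not reflect the pricing JSON schema. It also assumes that cached tokens are billed at the `cache_read` rate instead of the input rate, and that the Veo notes mean price per second times duration times sample count.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class TokenPrice:
    """Per-1M-token prices in USD (hypothetical structure for illustration)."""
    input_per_m: float
    output_per_m: float
    cache_read_per_m: float = 0.0


def token_cost(price: TokenPrice, input_tokens: int, output_tokens: int,
               cached_tokens: int = 0) -> float:
    """Cost of one request; cached_tokens is the cached part of input_tokens."""
    uncached = input_tokens - cached_tokens
    total = (uncached * price.input_per_m
             + cached_tokens * price.cache_read_per_m
             + output_tokens * price.output_per_m)
    return total / TOKENS_PER_MILLION


def video_cost(price_per_second: float, seconds: int = 8, samples: int = 1) -> float:
    """Per-second video pricing with the table's defaults (8 s, 1 sample)."""
    return price_per_second * seconds * samples


def per_1k_to_per_1m(price_per_1k: float) -> float:
    """Convert a per-1K rate (embedding rows) to a per-1M rate."""
    return price_per_1k * 1000


# Rates copied from the gemini-3.1-pro-preview row above.
gemini_31_pro = TokenPrice(input_per_m=2.0, output_per_m=12.0, cache_read_per_m=0.20)
request_cost = token_cost(gemini_31_pro, input_tokens=10_000, output_tokens=2_000)
veo_31_cost = video_cost(0.20)  # veo-3.1-generate-001 with default settings
```

The embedding rows are quoted per 1K tokens or characters, so `per_1k_to_per_1m` is included to compare them with the per-1M rows on the same scale.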

Excluded Models (not added to pricing JSON)

Model	Publisher	Reason
`gemini-live-2.5-flash-native-audio`	Google	`-live-` pattern — Gemini Live streaming, separate product
`lyria-002`, `lyria-3-pro-preview`, `lyria-3-clip-preview`	Google	`lyria-*` — music generation, no inference endpoint
`virtual-try-on-001`	Google	Product-specific retail model
`imagegeneration`	Google	Legacy, superseded by imagen-3.0+
`shieldgemma2`	Google	Guard model (`guard` pattern)
`chirp-*` models	Google	Audio transcription
`weathernext`, `weather-next-v2`	Google	Weather forecasting, not generative AI
`imageclassification-*`, `occupancy-analytics`, `bert-base`, `vehicle-detector`	Google	Non-generative CV/NLP
`pretrained-ocr`, `text-detector`, `pretrained-form-parser`	Google	OCR models
`faster-r-cnn`, `retinanet`, `mask-r-cnn`	Google	Computer vision, non-generative
`t5gemma`, `earth-ai-*`	Google	Self-deploy (`has_deploy: true`, no `-maas`)
`jamba-large-1.6`	AI21	Self-deploy (`has_deploy: true`, no `-maas`)
`codestral-2501-self-deploy`, `ministral-3`, `mistral-large-3`	Mistral	Self-deploy (no `-maas`)
`mistral-ocr-2505`	Mistral	OCR model
`deepseek-ocr-maas`, `deepseek-ocr`, `deepseek-ocr-2`	DeepSeek	OCR models
`llama-guard`, `prompt-guard`	Meta	Guard/safety models
`codellama-7b-hf`, `llama2`, `llama-2-quantized`, `llama3`, `llama3_1`, `llama3-2`, `llama3-3`, `llama4`	Meta	Self-deploy (no `-maas`)
`sam3`	Meta	Image segmentation, non-generative
`nllb`, `imagebind`	Meta	Self-deploy
`kimi-k2-5`, `kimi-k2`	Kimi	Self-deploy (no `-maas`)
`minimax-m2`	MiniMax	Self-deploy (no `-maas`)
`glm-image`	ZAI.org	Explicitly excluded per global rules
`glm-ocr`, `glm-4.7`, `glm-5`, `glm-4.5`	ZAI.org	OCR / self-deploy (no `-maas`)
`qwen-image`	Qwen	Explicitly excluded per global rules
`qwq`, `qwen3`, `qwen3-embedding`, `qwen3-5`, `qwen2`, `qwen3-coder-next`, `qwen3-coder`, `qwen3-next`, `qwen3-vl`	Qwen	Self-deploy (no `-maas`)
`gpt-oss`	OpenAI	Self-deploy (no `-maas`)
`clip-vit-base-patch32`, `openclip`	OpenAI	Non-generative embedding/vision
`whisper-large`	OpenAI	Audio transcription, not generative inference
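
Several of the reasons above are name-based or flag-based rules. The following sketch shows how such a filter might look. `exclusion_reason` is a hypothetical helper covering only four of the listed rules (`-live-`, `lyria-*`, `guard` in the ID, and self-deploy without `-maas`), and the `has_deploy` argument mirrors the `has_deploy: true` flag mentioned in the table. It is not the agent's actual rule set.

```python
from typing import Optional


def exclusion_reason(model_id: str, has_deploy: bool = False) -> Optional[str]:
    """Return a short reason if the model would be excluded, else None.

    Only a subset of the rules in the table is modelled here; category-based
    exclusions (OCR, computer vision, weather, ...) are not covered.
    """
    if "-live-" in model_id:
        return "live"          # Gemini Live streaming, separate product
    if model_id.startswith("lyria-"):
        return "lyria"         # music generation
    if "guard" in model_id:
        return "guard"         # guard/safety models matched by name
    if has_deploy and not model_id.endswith("-maas"):
        return "self-deploy"   # deployable model without a managed -maas endpoint
    return None


candidates = [
    ("gemini-live-2.5-flash-native-audio", False),
    ("lyria-002", False),
    ("llama-guard", True),
    ("minimax-m2", True),
    ("minimax-m2-maas", False),
]
kept = [mid for mid, deploy in candidates if exclusion_reason(mid, deploy) is None]
```

Note that a name check alone would not catch `shieldgemma2`, so the real `guard` pattern presumably relies on more than the model ID.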

Data Sources

Vertex AI Generative AI Pricing (Global tab): https://cloud.google.com/vertex-ai/generative-ai/pricing#global
Claude on Vertex AI model IDs: https://platform.claude.com/docs/en/build-with-claude/claude-on-vertex-ai
get_vertex_models API: all publishers (google, anthropic, openai, meta, ai21, qwen, mistral-ai, mistralai, deepseek-ai, moonshotai, minimaxai, zai-org)

Generated by Pricing Agent on 2026-04-09

siddharthsambharia-portkey added 2 commits April 9, 2026 23:57

chore(pricing): Update vertex-ai pricing

1f8851b

chore(general): Add 2 new vertex-ai model configs

db60302

siddharthsambharia-portkey wants to merge 2 commits into main from pricing-update/vertex-ai-24206118106
