Supported Models

ModelMax routes requests to the best available provider. All models are accessed through a single API.

Chat models

Conversation models, used with POST /v1/chat/completions. Most are text-in, text-out; some Gemini models also accept image, audio, and video input, and one can return images.
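A minimal sketch of building a chat request body. The endpoint path comes from this page; the OpenAI-style payload shape (`model` plus a `messages` array) and the example base URL are assumptions to verify against your account settings.

```python
import json

BASE_URL = "https://api.modelmax.example/v1"  # hypothetical base URL

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style payload for POST /v1/chat/completions."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": user_message},
        ],
    }

payload = build_chat_request("gemini-3-flash-preview", "Hello!")
# Send `body` as the POST body with Content-Type: application/json.
body = json.dumps(payload)
```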

Google Gemini

| Model ID | Input | Output | Notes |
|---|---|---|---|
| gemini-3.1-pro-preview | text, image, audio, video | text | Latest Gemini Pro |
| gemini-3-pro-preview | text, image, audio, video | text | Gemini 3.0 Pro |
| gemini-3-flash-preview | text, image, audio, video | text | Fast, cost-effective |
| gemini-3.1-flash-image-preview | text, image | text, image | Image generation capable |
| gemini-3.1-flash-lite-preview | text, image, audio, video | text | Lightweight |

AWS Bedrock — DeepSeek

| Model ID | Input | Output | Notes |
|---|---|---|---|
| deepseek-r1 | text | text | Reasoning model with chain-of-thought |
| deepseek-v3.1 | text | text | General purpose |
| deepseek-v3.2 | text | text | Latest general purpose |

AWS Bedrock — Qwen

| Model ID | Input | Output | Notes |
|---|---|---|---|
| qwen3-coder-30b-a3b | text | text | Code-focused, lightweight |
| qwen3-32b | text | text | General purpose |
| qwen3-235b-a22b-2507 | text | text | Large, capable |
| qwen3-coder-480b-a35b | text | text | Code-focused, high capacity |
| qwen3-next-80b-a3b | text | text | Efficient MoE architecture |
| qwen3-vl-235b-a22b | text, image | text | Vision-language model |
| qwen3-coder-next | text | text | Latest coder model |

AWS Bedrock — MiniMax

| Model ID | Input | Output | Notes |
|---|---|---|---|
| minimax-m2 | text | text | MiniMax M2 |
| minimax-m2.1 | text | text | MiniMax M2.1 |

AWS Bedrock — Kimi (Moonshot)

| Model ID | Input | Output | Notes |
|---|---|---|---|
| kimi-k2-thinking | text | text | With reasoning |
| kimi-k2.5 | text | text | Latest Kimi |

Video models

Async video generation. Used with the Queue API (POST /v1/queue/{model}).
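A sketch of submitting an async video job. The `POST /v1/queue/{model}` path is documented above; the request field (`prompt`) and the example base URL are assumptions, so check the Queue API reference for the exact schema.

```python
# Build the queue URL for an async video generation job.
# Base URL is hypothetical; only the /v1/queue/{model} path is documented.
def build_queue_url(base_url: str, model: str) -> str:
    return f"{base_url}/v1/queue/{model}"

url = build_queue_url("https://api.modelmax.example", "veo-3")
payload = {"prompt": "A hummingbird hovering over a flower, slow motion"}
# POST `payload` to `url`, then poll the returned job until it completes.
```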

Google Gemini — Veo

| Model ID | Speed | Quality | Audio |
|---|---|---|---|
| veo-3.1 | Standard | Highest | Yes |
| veo-3.1-fast | Fast | High | Yes |
| veo-3 | Standard | High | Yes |
| veo-3-fast | Fast | Good | Yes |
| veo-2 | Standard | Good | Yes |

All Veo models support:

  • Text-to-video and image-to-video
  • Up to 8 seconds duration
  • 720p, 1080p, and 4K resolution
  • Audio generation
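The limits above can be encoded as a small client-side validator. Only the limits themselves (8-second cap, the three listed resolutions) come from this page; the parameter names `duration_seconds` and `resolution` are assumptions for illustration.

```python
# Veo generation limits from this page; parameter names are hypothetical.
SUPPORTED_RESOLUTIONS = {"720p", "1080p", "4K"}
MAX_DURATION_SECONDS = 8

def build_veo_options(duration_seconds: int, resolution: str) -> dict:
    """Validate and package Veo generation options."""
    if duration_seconds > MAX_DURATION_SECONDS:
        raise ValueError("Veo clips are limited to 8 seconds")
    if resolution not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"Unsupported resolution: {resolution}")
    return {"duration_seconds": duration_seconds, "resolution": resolution}
```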

Capability matrix

| Capability | Gemini Chat | Bedrock Chat | Qwen VL | Veo Video |
|---|---|---|---|---|
| Text input | Yes | Yes | Yes | Yes |
| Image input | Yes | No | Yes | Yes |
| Audio input | Yes | No | No | No |
| Video input | Yes | No | No | No |
| Text output | Yes | Yes | Yes | No |
| Image output | Partial | No | No | No |
| Video output | No | No | No | Yes |
| Streaming | Yes | Yes | Yes | No |
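For the chat models that support streaming, a request sketch looks like this. The `stream` flag follows the common OpenAI-style convention; whether ModelMax uses the same flag and server-sent-event framing is an assumption.

```python
# Sketch: requesting a streamed chat completion.
# The "stream" flag is an assumption based on OpenAI-style APIs.
def build_streaming_request(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # ask the server for incremental chunks
    }

req = build_streaming_request("deepseek-v3.2", "Summarize this file.")
```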

Choosing a model

For general chat: Start with gemini-3-flash-preview (fast, multimodal) or deepseek-v3.2 (strong text reasoning).

For code: Try qwen3-coder-480b-a35b or qwen3-coder-next.

For reasoning: Use deepseek-r1 (returns reasoning_content in the response).
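Since deepseek-r1 returns `reasoning_content` alongside the answer, a client can split the two. The `reasoning_content` field is from this page; the surrounding `choices[0].message` response shape mirrors OpenAI-style APIs and is an assumption.

```python
# Separate chain-of-thought from the final answer in a deepseek-r1 reply.
# Response shape (choices[0].message) is an assumed OpenAI-style layout.
def split_reasoning(response: dict) -> tuple:
    message = response["choices"][0]["message"]
    return message.get("reasoning_content", ""), message.get("content", "")

example = {
    "choices": [
        {"message": {"reasoning_content": "First, note that...", "content": "42"}}
    ]
}
reasoning, answer = split_reasoning(example)
```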

For vision: Use gemini-3-flash-preview (images, audio, video input) or qwen3-vl-235b-a22b (images only).
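A sketch of attaching an image to a chat message for a vision model. The content-part shape (`type`/`image_url` with a base64 data URL) follows the common OpenAI-style convention and is an assumption for this API.

```python
import base64

# Build a multimodal user message; the content-part schema is assumed.
def build_image_message(prompt: str, image_bytes: bytes,
                        mime: str = "image/png") -> dict:
    data_url = f"data:{mime};base64," + base64.b64encode(image_bytes).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }

msg = build_image_message("What is in this image?", b"\x89PNG fake bytes")
```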

For video generation: Use veo-3 for quality, veo-3-fast for speed.