# ModelMax
One API, every model. ModelMax is a unified LLM gateway that gives you access to models from multiple providers through a single, OpenAI-compatible API.
## Why ModelMax?
- Single integration — Write once, switch models freely. No provider-specific SDKs.
- OpenAI-compatible — Drop-in replacement for /v1/chat/completions. Works with any OpenAI SDK.
- Native pass-through (gateway) — Native Anthropic (/v1/messages) and Gemini (/v1beta/models/...) requests are forwarded with zero modification and no token sniffing.
- Multi-provider — Access models from AWS Bedrock, Google Gemini, and Anthropic.
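Because the gateway is OpenAI-compatible, calling it is just an ordinary chat-completions request with the base URL swapped out. A minimal stdlib-only sketch of building such a request — the base URL, model name, and API-key variable below are placeholders, not values from this document:

```python
import json

# Hypothetical values -- substitute your gateway URL, key, and a model
# from the supported-models catalog.
BASE_URL = "https://modelmax.example.com/v1"
MODEL = "example-model"

def build_chat_request(prompt: str) -> dict:
    """Assemble an OpenAI-style chat completion request for the gateway."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": "Bearer $MODELMAX_API_KEY",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_chat_request("Hello!")
print(req["url"])
```

With the official `openai` Python SDK, the equivalent is to construct the client with `base_url` pointing at the gateway and leave the rest of your code unchanged — that is what "drop-in replacement" means here.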
## Supported capabilities
| Capability | Endpoint | Status |
|---|---|---|
| OpenAI Chat | POST /v1/chat/completions | Available |
| Anthropic Chat | POST /v1/messages | Available |
| Gemini Chat | POST /v1beta/models/* | Available |
| Streaming (SSE) | POST /v1/chat/completions | Available |
| Image generation | POST /v1/images/generations | Available |
| Video generation | POST /v1/queue/{model} | Available |
| Model listing | GET /v1/models | Available |
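For the streaming row above, an OpenAI-compatible endpoint emits server-sent events: `data: {...}` chunks carrying content deltas, terminated by `data: [DONE]`. A sketch of consuming that format, assuming ModelMax follows the standard OpenAI SSE shape (the sample stream below is illustrative, not captured from the gateway):

```python
import json

def parse_sse_chunks(raw: str) -> str:
    """Concatenate content deltas from an OpenAI-style SSE stream."""
    parts = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank lines and non-data fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # stream terminator
        event = json.loads(payload)
        delta = event["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Sample stream in the standard OpenAI SSE format
sample = (
    'data: {"choices":[{"delta":{"role":"assistant"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n'
    "data: [DONE]\n\n"
)
print(parse_sse_chunks(sample))  # Hello
```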
## Quick links
- Quickstart — Get running in 5 minutes
- Chat completions guide — Text and streaming
- Video generation guide — Async queue workflow
- Supported models — Full model catalog
