# ModelMax
One API, every model. ModelMax is a unified LLM gateway that gives you access to models from multiple providers through a single, OpenAI-compatible API.
## Why ModelMax?
- Single integration — Write once, switch models freely. No provider-specific SDKs.
- OpenAI-compatible — Drop-in replacement for /v1/chat/completions. Works with any OpenAI SDK.
- Native pass-through (gateway) — Native Anthropic (/v1/messages) and Gemini (/v1beta/models/...) requests are forwarded with zero modification and no token sniffing.
- Multi-provider — Access models from AWS Bedrock, Google Gemini, and Anthropic.
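Because the gateway is OpenAI-compatible, calling it is just an ordinary chat-completions request with the base URL swapped out. A minimal stdlib-only sketch of building such a request — the base URL, model name, and API-key variable below are placeholders, not values from this document:

```python
import json

# Hypothetical values -- substitute your gateway URL, key, and a model
# from the supported-models catalog.
BASE_URL = "https://modelmax.example.com/v1"
MODEL = "example-model"

def build_chat_request(prompt: str) -> dict:
    """Assemble an OpenAI-style chat completion request for the gateway."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": "Bearer $MODELMAX_API_KEY",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_chat_request("Hello!")
print(req["url"])
```

With the official `openai` Python SDK, the equivalent is to construct the client with `base_url` pointing at the gateway and leave the rest of your code unchanged — that is what "drop-in replacement" means here.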
## Supported capabilities
| Capability | Endpoint | Status |
|---|---|---|
| OpenAI Chat | POST /v1/chat/completions | Available |
| Anthropic Chat | POST /v1/messages | Available |
| Gemini Chat | POST /v1beta/models/* | Available |
| Streaming (SSE) | POST /v1/chat/completions | Available |
| Image generation | POST /v1/images/generations | Available |
| Video generation | POST /v1/queue/{model} | Available |
| Model listing | GET /v1/models | Available |
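For the streaming row above, an OpenAI-compatible endpoint emits server-sent events: `data: {...}` chunks carrying content deltas, terminated by `data: [DONE]`. A sketch of consuming that format, assuming ModelMax follows the standard OpenAI SSE shape (the sample stream below is illustrative, not captured from the gateway):

```python
import json

def parse_sse_chunks(raw: str) -> str:
    """Concatenate content deltas from an OpenAI-style SSE stream."""
    parts = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank lines and non-data fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # stream terminator
        event = json.loads(payload)
        delta = event["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Sample stream in the standard OpenAI SSE format
sample = (
    'data: {"choices":[{"delta":{"role":"assistant"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n'
    "data: [DONE]\n\n"
)
print(parse_sse_chunks(sample))  # Hello
```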
## Quick links
- Quickstart — Get running in 5 minutes
- Chat completions guide — Text and streaming
- Video generation guide — Async queue workflow
- Supported models — Full model catalog
