
POST /v1/chat/completions

Creates a chat completion. Supports text and multimodal inputs, and can stream responses as Server-Sent Events.

Request

Headers

| Header | Value |
| --- | --- |
| Authorization | Bearer {api_key} |
| Content-Type | application/json |

Body

{
  "model": "gemini-3-flash-preview",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello, world!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID. See supported models. |
| messages | array | Yes | Array of message objects. |
| stream | boolean | No | false (default) for a single JSON response; true for SSE streaming. |
| temperature | number | No | Sampling temperature, 0 to 2. |
| top_p | number | No | Nucleus sampling, 0 to 1. |
| max_tokens | integer | No | Maximum tokens to generate. |
| stop | string \| string[] | No | Up to 4 stop sequences. |
| n | integer | No | Number of completions to generate. |
| presence_penalty | number | No | -2.0 to 2.0. |
| frequency_penalty | number | No | -2.0 to 2.0. |
| user | string | No | End-user identifier for abuse detection. |
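The parameters above can be assembled into a request body before sending. The sketch below is illustrative, not part of the API: the helper name and the client-side range checks are assumptions, but the field names, ranges, and the 4-sequence limit on stop come from the table.

```python
# Sketch: assemble a /v1/chat/completions body, checking documented ranges.
# build_chat_request is an illustrative helper, not part of any SDK.

def build_chat_request(model, messages, *, temperature=None, top_p=None,
                       max_tokens=None, stop=None, n=None):
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if stop is not None:
        stop_list = [stop] if isinstance(stop, str) else list(stop)
        if len(stop_list) > 4:
            raise ValueError("at most 4 stop sequences are allowed")
    body = {"model": model, "messages": messages}
    # Include only the optional fields that were actually set.
    for key, value in [("temperature", temperature), ("top_p", top_p),
                       ("max_tokens", max_tokens), ("stop", stop), ("n", n)]:
        if value is not None:
            body[key] = value
    return body
```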

Message object

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string | Yes | "system", "user", or "assistant". |
| content | string \| array | Yes | Text string, or array of content parts for multimodal input. |
| name | string | No | Participant name. |

Content parts (multimodal)

When content is an array, each element is a content part:

Text: { "type": "text", "text": "Describe this image." }

Image: { "type": "image_url", "image_url": { "url": "https://..." } }

Audio: { "type": "input_audio", "input_audio": { "data": "<base64>", "format": "webm" } }
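The three part shapes above can be composed into a single user message. A minimal sketch, assuming only the shapes shown on this page; the helper function names are illustrative:

```python
import base64

# Sketch: build a multimodal user message from the content-part shapes above.
# text_part / image_part / audio_part are illustrative helpers.

def text_part(text):
    return {"type": "text", "text": text}

def image_part(url):
    return {"type": "image_url", "image_url": {"url": url}}

def audio_part(raw_bytes, fmt="webm"):
    # Audio is sent inline as base64, together with its container format.
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": fmt}}

message = {
    "role": "user",
    "content": [
        text_part("Describe this image."),
        image_part("https://example.com/cat.jpg"),
    ],
}
```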


Response (non-streaming)

{
  "id": "chatcmpl-abc123",
  "object": "text_completion",
  "created": 1709123456,
  "model": "gemini-3-flash-preview",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}

Response fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique completion ID. |
| object | string | Always "text_completion". |
| created | integer | Unix timestamp. |
| model | string | Model used. |
| choices[].message | object | { role, content }. |
| choices[].finish_reason | string | "stop", "length", or "content_filter". |
| usage.prompt_tokens | integer | Input tokens. |
| usage.completion_tokens | integer | Output tokens. |
| usage.total_tokens | integer | Sum of input and output tokens. |

Response (streaming)

When stream: true, the response is a stream of Server-Sent Events.

Content-Type: text/event-stream
data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14}}

data: [DONE]

The final chunk contains usage with accumulated token counts.
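The chunk format above can be parsed with a few lines of client code. A minimal sketch of the accumulation logic only; a real client would read these lines from the HTTP response body (or use an SDK that does this for you):

```python
import json

# Sketch: accumulate assistant text and usage from SSE "data:" lines.

def accumulate_stream(lines):
    """Return (full_text, usage) from an iterable of event-stream lines."""
    parts, usage = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel: the stream is finished
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
        # Only the final chunk carries accumulated usage totals.
        usage = chunk.get("usage", usage)
    return "".join(parts), usage
```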


Status codes

| Status | Description |
| --- | --- |
| 200 | Success |
| 400 | Invalid request (missing model/messages, wrong model type) |
| 401 | Unauthorized |
| 402 | Insufficient balance |
| 502 | Upstream provider error |
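Client code can map these statuses to actionable errors. In this sketch only the codes and their meanings come from the table above; the exception class and the retry suggestion for 502 are assumptions:

```python
# Sketch: map documented status codes to errors. ChatCompletionError is an
# illustrative exception class, not part of any SDK.

STATUS_MESSAGES = {
    400: "Invalid request (missing model/messages, wrong model type)",
    401: "Unauthorized: check the Authorization header",
    402: "Insufficient balance: top up the account",
    502: "Upstream provider error: consider retrying with backoff",
}

class ChatCompletionError(Exception):
    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status

def raise_for_status(status):
    """Raise on any non-200 status returned by the endpoint."""
    if status == 200:
        return
    raise ChatCompletionError(status, STATUS_MESSAGES.get(status, "Unexpected status"))
```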

Examples

Basic chat

cURL

curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "user", "content": "What is 2+2?" }
    ]
  }'
Python

from openai import OpenAI

client = OpenAI(api_key="your-key", base_url="https://api.modelmax.io/v1")
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(response.choices[0].message.content)
JavaScript

import OpenAI from "openai";

const client = new OpenAI({ apiKey: "your-key", baseURL: "https://api.modelmax.io/v1" });
const response = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "What is 2+2?" }],
});
console.log(response.choices[0].message.content);

With vision

cURL

curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/cat.jpg" } }
        ]
      }
    ]
  }'
Python

response = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            ],
        }
    ],
)
JavaScript

const response = await client.chat.completions.create({
  model: "gemini-3-flash-preview",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/cat.jpg" } },
      ],
    },
  ],
});