
POST /v1/chat/completions

Creates a chat completion. Supports text and multimodal inputs, and can stream responses as Server-Sent Events.

Request

Headers

| Header | Value |
| --- | --- |
| Authorization | Bearer {api_key} |
| Content-Type | application/json |

Body

{
  "model": "gemini-3-flash-preview",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello, world!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID. See supported models. |
| messages | array | Yes | Array of message objects. |
| stream | boolean | No | false (default) for a single JSON response; true for SSE streaming. |
| temperature | number | No | Sampling temperature, 0 to 2. |
| top_p | number | No | Nucleus sampling, 0 to 1. |
| max_tokens | integer | No | Maximum tokens to generate. |
| stop | string \| string[] | No | Up to 4 stop sequences. |
| n | integer | No | Number of completions to generate. |
| presence_penalty | number | No | -2.0 to 2.0. |
| frequency_penalty | number | No | -2.0 to 2.0. |
| user | string | No | End-user identifier for abuse detection. |
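The parameters above can be assembled into a request body before sending. The sketch below is illustrative, not part of the API: the helper name and the client-side range checks are assumptions, but the field names, ranges, and the 4-sequence limit on stop come from the table.

```python
# Sketch: assemble a /v1/chat/completions body, checking documented ranges.
# build_chat_request is an illustrative helper, not part of any SDK.

def build_chat_request(model, messages, *, temperature=None, top_p=None,
                       max_tokens=None, stop=None, n=None):
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if stop is not None:
        stop_list = [stop] if isinstance(stop, str) else list(stop)
        if len(stop_list) > 4:
            raise ValueError("at most 4 stop sequences are allowed")
    body = {"model": model, "messages": messages}
    # Include only the optional fields that were actually set.
    for key, value in [("temperature", temperature), ("top_p", top_p),
                       ("max_tokens", max_tokens), ("stop", stop), ("n", n)]:
        if value is not None:
            body[key] = value
    return body
```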

Message object

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string | Yes | "system", "user", or "assistant". |
| content | string \| array | Yes | Text string, or array of content parts for multimodal input. |
| name | string | No | Participant name. |

Content parts (multimodal)

When content is an array, each element is a content part:

Text: { "type": "text", "text": "Describe this image." }

Image: { "type": "image_url", "image_url": { "url": "https://..." } }

Audio: { "type": "input_audio", "input_audio": { "data": "<base64>", "format": "webm" } }
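The three part shapes above can be composed into a single user message. A minimal sketch, assuming only the shapes shown on this page; the helper function names are illustrative:

```python
import base64

# Sketch: build a multimodal user message from the content-part shapes above.
# text_part / image_part / audio_part are illustrative helpers.

def text_part(text):
    return {"type": "text", "text": text}

def image_part(url):
    return {"type": "image_url", "image_url": {"url": url}}

def audio_part(raw_bytes, fmt="webm"):
    # Audio is sent inline as base64, together with its container format.
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": fmt}}

message = {
    "role": "user",
    "content": [
        text_part("Describe this image."),
        image_part("https://example.com/cat.jpg"),
    ],
}
```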


Response (non-streaming)

{
  "id": "chatcmpl-abc123",
  "object": "text_completion",
  "created": 1709123456,
  "model": "gemini-3-flash-preview",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}

Response fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique completion ID. |
| object | string | Always "text_completion". |
| created | integer | Unix timestamp. |
| model | string | Model used. |
| choices[].message | object | { role, content }. |
| choices[].finish_reason | string | "stop", "length", or "content_filter". |
| usage.prompt_tokens | integer | Input tokens. |
| usage.completion_tokens | integer | Output tokens. |
| usage.total_tokens | integer | Sum of input and output tokens. |

Response (streaming)

When stream: true, the response is a stream of Server-Sent Events.

Content-Type: text/event-stream
data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14}}

data: [DONE]

The final chunk contains usage with accumulated token counts.
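The chunk format above can be parsed with a few lines of client code. A minimal sketch of the accumulation logic only; a real client would read these lines from the HTTP response body (or use an SDK that does this for you):

```python
import json

# Sketch: accumulate assistant text and usage from SSE "data:" lines.

def accumulate_stream(lines):
    """Return (full_text, usage) from an iterable of event-stream lines."""
    parts, usage = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel: the stream is finished
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
        # Only the final chunk carries accumulated usage totals.
        usage = chunk.get("usage", usage)
    return "".join(parts), usage
```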


Status codes

| Status | Description |
| --- | --- |
| 200 | Success |
| 400 | Invalid request (missing model/messages, wrong model type) |
| 401 | Unauthorized |
| 402 | Insufficient balance |
| 502 | Upstream provider error |
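Client code can map these statuses to actionable errors. In this sketch only the codes and their meanings come from the table above; the exception class and the retry suggestion for 502 are assumptions:

```python
# Sketch: map documented status codes to errors. ChatCompletionError is an
# illustrative exception class, not part of any SDK.

STATUS_MESSAGES = {
    400: "Invalid request (missing model/messages, wrong model type)",
    401: "Unauthorized: check the Authorization header",
    402: "Insufficient balance: top up the account",
    502: "Upstream provider error: consider retrying with backoff",
}

class ChatCompletionError(Exception):
    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status

def raise_for_status(status):
    """Raise on any non-200 status returned by the endpoint."""
    if status == 200:
        return
    raise ChatCompletionError(status, STATUS_MESSAGES.get(status, "Unexpected status"))
```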

Examples

Basic chat

cURL

curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "user", "content": "What is 2+2?" }
    ]
  }'
Python

from openai import OpenAI

client = OpenAI(api_key="your-key", base_url="https://api.modelmax.io/v1")
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(response.choices[0].message.content)
JavaScript

import OpenAI from "openai";

const client = new OpenAI({ apiKey: "your-key", baseURL: "https://api.modelmax.io/v1" });
const response = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "What is 2+2?" }],
});
console.log(response.choices[0].message.content);

With vision

cURL

curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/cat.jpg" } }
        ]
      }
    ]
  }'
Python

response = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            ],
        }
    ],
)
JavaScript

const response = await client.chat.completions.create({
  model: "gemini-3-flash-preview",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/cat.jpg" } },
      ],
    },
  ],
});