POST /v1/chat/completions
Creates a chat completion. Supports text, multimodal inputs, and streaming.
Request
Headers
| Header | Value |
|---|---|
| Authorization | Bearer {api_key} |
| Content-Type | application/json |
Body
```json
{
  "model": "gemini-3-flash-preview",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello, world!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}
```
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID. See supported models. |
| messages | array | Yes | Array of message objects. |
| stream | boolean | No | false (default) for a single JSON response; true for SSE streaming. |
| temperature | number | No | Sampling temperature, 0 to 2. |
| top_p | number | No | Nucleus sampling, 0 to 1. |
| max_tokens | integer | No | Maximum tokens to generate. |
| stop | string \| string[] | No | Up to 4 stop sequences. |
| n | integer | No | Number of completions to generate. |
| presence_penalty | number | No | -2.0 to 2.0. |
| frequency_penalty | number | No | -2.0 to 2.0. |
| user | string | No | End-user identifier for abuse detection. |
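As a quick illustration of how these parameters fit together, the sketch below assembles a request body in Python. The helper name and the validation it performs are ours, not part of the API; the server simply accepts JSON.

```python
# Hypothetical helper: builds a chat-completions request body from the
# parameters documented above. Illustrative only.
def build_request_body(model, messages, *, stream=False, temperature=None,
                       top_p=None, max_tokens=None, stop=None, n=None,
                       presence_penalty=None, frequency_penalty=None,
                       user=None):
    if not model or not messages:
        raise ValueError("model and messages are required")
    if stop is not None:
        # stop may be a single string or a list of up to 4 sequences.
        stops = [stop] if isinstance(stop, str) else list(stop)
        if len(stops) > 4:
            raise ValueError("at most 4 stop sequences are allowed")
        stop = stops
    body = {"model": model, "messages": messages, "stream": stream}
    optional = {"temperature": temperature, "top_p": top_p,
                "max_tokens": max_tokens, "stop": stop, "n": n,
                "presence_penalty": presence_penalty,
                "frequency_penalty": frequency_penalty, "user": user}
    # Omit parameters the caller did not set, so server defaults apply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body

body = build_request_body("gemini-3-flash-preview",
                          [{"role": "user", "content": "Hello"}],
                          temperature=0.7, stop="END")
```

Omitting unset optional parameters (rather than sending nulls) lets the server apply its own defaults.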
Message object
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | "system", "user", or "assistant". |
| content | string \| array | Yes | Text string, or array of content parts for multimodal input. |
| name | string | No | Participant name. |
Content parts (multimodal)
When content is an array, each element is a content part:

- Text: { "type": "text", "text": "Describe this image." }
- Image: { "type": "image_url", "image_url": { "url": "https://..." } }
- Audio: { "type": "input_audio", "input_audio": { "data": "<base64>", "format": "webm" } }
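A minimal sketch of assembling a multimodal message in Python. The part shapes follow the examples above; the helper function names are ours:

```python
# Illustrative builders for the content-part shapes documented above.
def text_part(text):
    return {"type": "text", "text": text}

def image_part(url):
    return {"type": "image_url", "image_url": {"url": url}}

# A user message whose content is an array of parts (text + image).
message = {
    "role": "user",
    "content": [
        text_part("What is in this image?"),
        image_part("https://example.com/cat.jpg"),
    ],
}
```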
Response (non-streaming)
```json
{
  "id": "chatcmpl-abc123",
  "object": "text_completion",
  "created": 1709123456,
  "model": "gemini-3-flash-preview",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```
Response fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique completion ID. |
| object | string | Always "text_completion". |
| created | integer | Unix timestamp. |
| model | string | Model used. |
| choices[].message | object | { role, content }. |
| choices[].finish_reason | string | "stop", "length", or "content_filter". |
| usage.prompt_tokens | integer | Input tokens. |
| usage.completion_tokens | integer | Output tokens. |
| usage.total_tokens | integer | Sum of input + output. |
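To make the field layout concrete, here is how a client might read the sample non-streaming response with plain dict access, no SDK required:

```python
# The sample response from above, as a Python dict.
response = {
    "id": "chatcmpl-abc123",
    "object": "text_completion",
    "created": 1709123456,
    "model": "gemini-3-flash-preview",
    "choices": [{"index": 0,
                 "message": {"role": "assistant",
                             "content": "Hello! How can I help you today?"},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 9,
              "total_tokens": 21},
}

# The assistant's reply lives under choices[0].message.content.
reply = response["choices"][0]["message"]["content"]
# finish_reason == "length" means the reply hit max_tokens and was cut off.
truncated = response["choices"][0]["finish_reason"] == "length"
usage = response["usage"]
```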
Response (streaming)
When stream: true, the response is a stream of Server-Sent Events.
Content-Type: text/event-stream
```text
data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14}}

data: [DONE]
```
The final chunk contains usage with accumulated token counts.
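An SDK normally handles the stream for you, but the chunk format can be consumed with a few lines of parsing. This sketch works on raw data: lines, accumulating the delta.content fragments and stopping at the [DONE] sentinel:

```python
import json

def accumulate_sse(lines):
    """Accumulate assistant text from 'data:' lines of an SSE stream,
    following the chunk shape documented above."""
    parts, usage = [], None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
        # Only the final content chunk carries a usage object.
        usage = chunk.get("usage", usage)
    return "".join(parts), usage

# The sample stream from above, minus the shared id/object fields.
stream = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14}}',
    "data: [DONE]",
]
text, usage = accumulate_sse(stream)
```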
Status codes
| Status | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request (missing model/messages, wrong model type) |
| 401 | Unauthorized |
| 402 | Insufficient balance |
| 502 | Upstream provider error |
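One way a client might act on these codes, sketched in Python. Which codes are safe to retry is our assumption, not an API guarantee:

```python
# Illustrative mapping from the status codes above to client behavior.
RETRYABLE = {502}        # upstream provider error: a retry may succeed
FATAL = {400, 401, 402}  # fix the request, credentials, or balance first

def classify(status):
    if status == 200:
        return "ok"
    if status in RETRYABLE:
        return "retry"
    if status in FATAL:
        return "fail"
    return "unknown"
```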
Examples
Basic chat
```shell
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "user", "content": "What is 2+2?" }
    ]
  }'
```
```python
from openai import OpenAI

client = OpenAI(api_key="your-key", base_url="https://api.modelmax.io/v1")

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "your-key", baseURL: "https://api.modelmax.io/v1" });

const response = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "What is 2+2?" }],
});
console.log(response.choices[0].message.content);
```
With vision
```shell
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/cat.jpg" } }
        ]
      }
    ]
  }'
```
```python
response = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            ],
        }
    ],
)
```
```javascript
const response = await client.chat.completions.create({
  model: "gemini-3-flash-preview",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/cat.jpg" } },
      ],
    },
  ],
});
```
