# POST /v1/chat/completions

Creates a chat completion. Supports text, multimodal input, and streaming.
## Request

### Headers

| Header | Value |
|---|---|
| Authorization | Bearer {api_key} |
| Content-Type | application/json |
### Body

```json
{
  "model": "gemini-3-flash-preview",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello, world!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}
```
### Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID. See supported models. |
| messages | array | Yes | Array of message objects. |
| stream | boolean | No | false (default) returns JSON; true enables SSE streaming. |
| temperature | number | No | Sampling temperature, 0 to 2. |
| top_p | number | No | Nucleus sampling, 0 to 1. |
| max_tokens | integer | No | Maximum number of tokens to generate. |
| stop | string \| string[] | No | Up to 4 stop sequences. |
| n | integer | No | Number of completions to generate. |
| presence_penalty | number | No | -2.0 to 2.0. |
| frequency_penalty | number | No | -2.0 to 2.0. |
| user | string | No | End-user identifier, used for abuse detection. |
### Message object

| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | "system", "user", or "assistant". |
| content | string \| array | Yes | A text string, or an array of multimodal content parts. |
| name | string | No | Name of the participant. |
### Content parts (multimodal)

When content is an array, each element is a content part:

- Text: `{ "type": "text", "text": "Describe this image." }`
- Image: `{ "type": "image_url", "image_url": { "url": "https://..." } }`
- Audio: `{ "type": "input_audio", "input_audio": { "data": "<base64>", "format": "webm" } }`
## Response (non-streaming)

```json
{
  "id": "chatcmpl-abc123",
  "object": "text_completion",
  "created": 1709123456,
  "model": "gemini-3-flash-preview",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```
### Response fields

| Field | Type | Description |
|---|---|---|
| id | string | Unique completion ID. |
| object | string | Always "text_completion". |
| created | integer | Unix timestamp. |
| model | string | Model used. |
| choices[].message | object | { role, content }. |
| choices[].finish_reason | string | "stop", "length", or "content_filter". |
| usage.prompt_tokens | integer | Input token count. |
| usage.completion_tokens | integer | Output token count. |
| usage.total_tokens | integer | Sum of input + output. |
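As a sanity check on accounting, the usage fields are related by total = prompt + completion. A minimal helper illustrating the invariant (the function name is ours, not part of the API):

```python
def check_usage(usage: dict) -> bool:
    """Verify that total_tokens equals prompt_tokens + completion_tokens."""
    return usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# Using the usage object from the example response above: 12 + 9 == 21
example = {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21}
```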
## Response (streaming)

When `stream: true`, the response is a Server-Sent Events stream with `Content-Type: text/event-stream`:

```
data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14}}

data: [DONE]
```

The final data chunk includes a usage object with cumulative token counts.
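The wire format above can be decoded by hand. A minimal sketch of parsing the `data:` lines and reassembling the text (the helper names are ours; in practice, SDKs such as the OpenAI client handle this for you):

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line into a chunk dict; return None for [DONE] or non-data lines."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

def collect_content(lines):
    """Concatenate the delta.content fragments from a sequence of SSE lines."""
    parts = []
    for line in lines:
        chunk = parse_sse_line(line)
        if chunk is None:
            continue
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)
```

Applied to the example stream above, this yields the string "Hello!".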
## Status codes

| Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request (missing model/messages, wrong model type) |
| 401 | Unauthorized |
| 402 | Insufficient balance |
| 502 | Upstream provider error |
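One way a client might branch on these codes. The retry recommendation shown is an assumption on our part (502 is transient upstream failure; the 4xx codes require fixing the request or account), not part of the API contract:

```python
def classify_status(status: int):
    """Map a status code to (description, retryable).
    The retryable flag is a client-side heuristic, not an API guarantee."""
    table = {
        200: ("success", False),
        400: ("invalid request", False),
        401: ("unauthorized", False),
        402: ("insufficient balance", False),
        502: ("upstream provider error", True),
    }
    return table.get(status, ("unknown", False))
```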
## Examples

### Basic chat

```shell
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "user", "content": "What is 2+2?" }
    ]
  }'
```
```python
from openai import OpenAI

client = OpenAI(api_key="your-key", base_url="https://api.modelmax.io/v1")

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "your-key", baseURL: "https://api.modelmax.io/v1" });

const response = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "What is 2+2?" }],
});
console.log(response.choices[0].message.content);
```
### With vision

```shell
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/cat.jpg" } }
        ]
      }
    ]
  }'
```
```python
response = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            ],
        }
    ],
)
```
```javascript
const response = await client.chat.completions.create({
  model: "gemini-3-flash-preview",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/cat.jpg" } },
      ],
    },
  ],
});
```
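The `input_audio` content part described earlier carries base64-encoded bytes. A sketch of building such a message from a local file (the file path, prompt text, and helper name are placeholders of ours):

```python
import base64

def build_audio_message(path: str, fmt: str = "webm") -> dict:
    """Build a multimodal user message carrying base64-encoded audio,
    matching the input_audio content-part shape documented above."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe this audio."},
            {"type": "input_audio", "input_audio": {"data": data, "format": fmt}},
        ],
    }
```

The returned dict can be passed directly in the `messages` array of a request.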
