# POST /v1/chat/completions

Creates a chat completion. Supports text, multimodal input, and streaming.
## Request

### Headers

| Header | Value |
|---|---|
| Authorization | Bearer {api_key} |
| Content-Type | application/json |
### Body

```json
{
  "model": "gemini-3-flash-preview",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello, world!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}
```
### Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID. See supported models. |
| messages | array | Yes | Array of message objects. |
| stream | boolean | No | false (default) returns JSON; true enables SSE streaming. |
| temperature | number | No | Sampling temperature, 0 to 2. |
| top_p | number | No | Nucleus sampling, 0 to 1. |
| max_tokens | integer | No | Maximum number of tokens to generate. |
| stop | string \| string[] | No | Up to 4 stop sequences. |
| n | integer | No | Number of completions to generate. |
| presence_penalty | number | No | -2.0 to 2.0. |
| frequency_penalty | number | No | -2.0 to 2.0. |
| user | string | No | End-user identifier, used for abuse detection. |
### Message object

| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | "system", "user", or "assistant". |
| content | string \| array | Yes | A text string, or an array of multimodal content parts. |
| name | string | No | Name of the participant. |
### Content parts (multimodal)

When content is an array, each element is a content part:

- Text: `{ "type": "text", "text": "Describe this image." }`
- Image: `{ "type": "image_url", "image_url": { "url": "https://..." } }`
- Audio: `{ "type": "input_audio", "input_audio": { "data": "<base64>", "format": "webm" } }`
## Response (non-streaming)

```json
{
  "id": "chatcmpl-abc123",
  "object": "text_completion",
  "created": 1709123456,
  "model": "gemini-3-flash-preview",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```
### Response fields

| Field | Type | Description |
|---|---|---|
| id | string | Unique completion ID. |
| object | string | Always "text_completion". |
| created | integer | Unix timestamp. |
| model | string | Model used. |
| choices[].message | object | { role, content }. |
| choices[].finish_reason | string | "stop", "length", or "content_filter". |
| usage.prompt_tokens | integer | Input token count. |
| usage.completion_tokens | integer | Output token count. |
| usage.total_tokens | integer | Sum of input + output. |
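As a sanity check on accounting, the usage fields are related by total = prompt + completion. A minimal helper illustrating the invariant (the function name is ours, not part of the API):

```python
def check_usage(usage: dict) -> bool:
    """Verify that total_tokens equals prompt_tokens + completion_tokens."""
    return usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# Using the usage object from the example response above: 12 + 9 == 21
example = {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21}
```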
## Response (streaming)

When `stream: true`, the response is a Server-Sent Events stream with `Content-Type: text/event-stream`:

```
data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"text_completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14}}

data: [DONE]
```

The final data chunk includes a usage object with cumulative token counts.
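The wire format above can be decoded by hand. A minimal sketch of parsing the `data:` lines and reassembling the text (the helper names are ours; in practice, SDKs such as the OpenAI client handle this for you):

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line into a chunk dict; return None for [DONE] or non-data lines."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

def collect_content(lines):
    """Concatenate the delta.content fragments from a sequence of SSE lines."""
    parts = []
    for line in lines:
        chunk = parse_sse_line(line)
        if chunk is None:
            continue
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)
```

Applied to the example stream above, this yields the string "Hello!".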
## Status codes

| Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request (missing model/messages, wrong model type) |
| 401 | Unauthorized |
| 402 | Insufficient balance |
| 502 | Upstream provider error |
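One way a client might branch on these codes. The retry recommendation shown is an assumption on our part (502 is transient upstream failure; the 4xx codes require fixing the request or account), not part of the API contract:

```python
def classify_status(status: int):
    """Map a status code to (description, retryable).
    The retryable flag is a client-side heuristic, not an API guarantee."""
    table = {
        200: ("success", False),
        400: ("invalid request", False),
        401: ("unauthorized", False),
        402: ("insufficient balance", False),
        502: ("upstream provider error", True),
    }
    return table.get(status, ("unknown", False))
```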
## Examples

### Basic chat

```shell
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "user", "content": "What is 2+2?" }
    ]
  }'
```
```python
from openai import OpenAI

client = OpenAI(api_key="your-key", base_url="https://api.modelmax.io/v1")

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "your-key", baseURL: "https://api.modelmax.io/v1" });

const response = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "What is 2+2?" }],
});
console.log(response.choices[0].message.content);
```
### With vision

```shell
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/cat.jpg" } }
        ]
      }
    ]
  }'
```
```python
response = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            ],
        }
    ],
)
```
```javascript
const response = await client.chat.completions.create({
  model: "gemini-3-flash-preview",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/cat.jpg" } },
      ],
    },
  ],
});
```
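The `input_audio` content part described earlier carries base64-encoded bytes. A sketch of building such a message from a local file (the file path, prompt text, and helper name are placeholders of ours):

```python
import base64

def build_audio_message(path: str, fmt: str = "webm") -> dict:
    """Build a multimodal user message carrying base64-encoded audio,
    matching the input_audio content-part shape documented above."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe this audio."},
            {"type": "input_audio", "input_audio": {"data": data, "format": fmt}},
        ],
    }
```

The returned dict can be passed directly in the `messages` array of a request.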
