Chat Completions
The chat completions endpoint is OpenAI-compatible. If you have used the OpenAI API, you already know how to use ModelMax.
Basic request
```bash
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.modelmax.io/v1",
)

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
print(response.choices[0].message.content)
```
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://api.modelmax.io/v1",
});

const response = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
});
console.log(response.choices[0].message.content);
```
Streaming
Set stream: true to receive tokens incrementally via Server-Sent Events. This is useful for chat interfaces, where text can be displayed in real time as it is generated.
```bash
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "user", "content": "Tell me a short story." }
    ],
    "stream": true
  }'
```
```python
stream = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()
```
```typescript
const stream = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "Tell me a short story." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
```
Each line of the SSE response has the following format:
```
data: {"id":"...","choices":[{"delta":{"content":"Once"},"index":0}]}
data: {"id":"...","choices":[{"delta":{"content":" upon"},"index":0}]}
...
data: [DONE]
```
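If you are not using an SDK, the stream can also be consumed directly. The sketch below uses the `requests` library to parse each `data:` line until the `[DONE]` sentinel; it reuses the endpoint and payload from the curl example, and the parsing assumes exactly the line format shown above.

```python
import json
import os

import requests

# Minimal sketch of consuming the SSE stream without an SDK.
# Assumes the same endpoint and payload as the curl example above.
resp = requests.post(
    "https://api.modelmax.io/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MODELMAX_API_KEY']}"},
    json={
        "model": "deepseek-v3.2",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True,
    },
    stream=True,  # keep the HTTP connection open and read incrementally
)
for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue  # skip keep-alive blank lines
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":  # end-of-stream sentinel
        break
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
print()
```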
Multi-turn conversations
Include previous messages to preserve context across turns:
```python
response = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {"role": "system", "content": "You are a math tutor."},
        {"role": "user", "content": "What is 2+2?"},
        {"role": "assistant", "content": "2 + 2 = 4."},
        {"role": "user", "content": "And if you multiply that by 3?"},
    ],
)
```
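The API itself is stateless, so the client keeps the history. As a minimal illustration (the `history` list and the hard-coded user turns are ours, not part of the API), each assistant reply is appended before the next user message:

```python
# Illustrative chat loop: the history list lives client-side;
# the API only ever sees the messages you send it.
history = [{"role": "system", "content": "You are a math tutor."}]

for user_input in ["What is 2+2?", "And if you multiply that by 3?"]:
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="gemini-3-flash-preview",
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    print(reply)
```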
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | — | **Required.** Model ID (e.g. `deepseek-v3.2`) |
| `messages` | array | — | **Required.** Array of conversation messages |
| `stream` | boolean | `false` | Enable SSE streaming |
| `temperature` | number | model default | Sampling temperature (0–2) |
| `top_p` | number | model default | Nucleus sampling threshold |
| `max_tokens` | integer | model default | Maximum number of tokens to generate |
| `stop` | string \| array | `null` | Stop sequences |
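The optional sampling parameters map directly onto keyword arguments in the SDK. For example, with illustrative values:

```python
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    temperature=0.7,   # higher values produce more varied output
    top_p=0.9,         # sample only from the top 90% probability mass
    max_tokens=100,    # cap the length of the completion
    stop=["\n\n"],     # stop generating at the first blank line
)
```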
Switching models
Change the model parameter to switch between providers. The API shape stays the same:
```python
# AWS Bedrock model
client.chat.completions.create(model="deepseek-v3.2", messages=[...])

# Google Gemini model
client.chat.completions.create(model="gemini-3-flash-preview", messages=[...])

# Same API, different providers; no code changes needed.
```
See Supported Models for the full list.
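If ModelMax also exposes the OpenAI-compatible `/v1/models` endpoint (an assumption here; this page does not confirm it), the same client can enumerate the available model IDs:

```python
# Assumes ModelMax implements the standard /v1/models listing endpoint.
for model in client.models.list():
    print(model.id)
```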
