Chat Completions
The chat completions endpoint is OpenAI-compatible. If you have used the OpenAI API, you already know how to use ModelMax.
Basic request
```bash
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.modelmax.io/v1",
)

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
print(response.choices[0].message.content)
```
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://api.modelmax.io/v1",
});

const response = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
});
console.log(response.choices[0].message.content);
```
Streaming
Set stream: true to receive tokens incrementally via Server-Sent Events. This is useful for chat interfaces, where text can be displayed in real time as it is generated.
```bash
curl -X POST https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MODELMAX_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      { "role": "user", "content": "Tell me a short story." }
    ],
    "stream": true
  }'
```
```python
stream = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()
```
```typescript
const stream = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "Tell me a short story." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
```
Each line of the SSE response has the following format:
```
data: {"id":"...","choices":[{"delta":{"content":"Once"},"index":0}]}
data: {"id":"...","choices":[{"delta":{"content":" upon"},"index":0}]}
...
data: [DONE]
```
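If you are not using an SDK, the stream can also be consumed directly. The sketch below uses the `requests` library to parse each `data:` line until the `[DONE]` sentinel; it reuses the endpoint and payload from the curl example, and the parsing assumes exactly the line format shown above.

```python
import json
import os

import requests

# Minimal sketch of consuming the SSE stream without an SDK.
# Assumes the same endpoint and payload as the curl example above.
resp = requests.post(
    "https://api.modelmax.io/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MODELMAX_API_KEY']}"},
    json={
        "model": "deepseek-v3.2",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True,
    },
    stream=True,  # keep the HTTP connection open and read incrementally
)
for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue  # skip keep-alive blank lines
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":  # end-of-stream sentinel
        break
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
print()
```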
Multi-turn conversations
Include previous messages to preserve context across turns:
```python
response = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {"role": "system", "content": "You are a math tutor."},
        {"role": "user", "content": "What is 2+2?"},
        {"role": "assistant", "content": "2 + 2 = 4."},
        {"role": "user", "content": "And if you multiply that by 3?"},
    ],
)
```
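The API itself is stateless, so the client keeps the history. As a minimal illustration (the `history` list and the hard-coded user turns are ours, not part of the API), each assistant reply is appended before the next user message:

```python
# Illustrative chat loop: the history list lives client-side;
# the API only ever sees the messages you send it.
history = [{"role": "system", "content": "You are a math tutor."}]

for user_input in ["What is 2+2?", "And if you multiply that by 3?"]:
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="gemini-3-flash-preview",
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    print(reply)
```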
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | — | **Required.** Model ID (e.g. `deepseek-v3.2`) |
| `messages` | array | — | **Required.** Array of conversation messages |
| `stream` | boolean | `false` | Enable SSE streaming |
| `temperature` | number | model default | Sampling temperature (0–2) |
| `top_p` | number | model default | Nucleus sampling threshold |
| `max_tokens` | integer | model default | Maximum number of tokens to generate |
| `stop` | string \| array | `null` | Stop sequences |
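The optional sampling parameters map directly onto keyword arguments in the SDK. For example, with illustrative values:

```python
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    temperature=0.7,   # higher values produce more varied output
    top_p=0.9,         # sample only from the top 90% probability mass
    max_tokens=100,    # cap the length of the completion
    stop=["\n\n"],     # stop generating at the first blank line
)
```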
Switching models
Change the model parameter to switch between providers. The API shape stays the same:
```python
# AWS Bedrock model
client.chat.completions.create(model="deepseek-v3.2", messages=[...])

# Google Gemini model
client.chat.completions.create(model="gemini-3-flash-preview", messages=[...])

# Same API, different providers; no code changes needed.
```
See Supported Models for the full list.
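If ModelMax also exposes the OpenAI-compatible `/v1/models` endpoint (an assumption here; this page does not confirm it), the same client can enumerate the available model IDs:

```python
# Assumes ModelMax implements the standard /v1/models listing endpoint.
for model in client.models.list():
    print(model.id)
```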
