Billing
ModelMax charges based on actual usage, with no subscriptions and no minimums. You pay only for what you use, with zero markup on inference costs.
How billing works
- Top up your account balance via the Dashboard.
- Use the API — each request deducts from your balance.
- Monitor usage in the Dashboard under Usage.
Pricing model
Chat completions
Charged per token, matching the upstream provider's pricing. Billing happens:
- Non-streaming: After the full response is generated.
- Streaming: After the stream completes (final chunk with usage data).
The usage object in the response shows the exact token breakdown:
{
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 89,
    "total_tokens": 239
  }
}
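As a rough illustration, the token counts in the usage object can be turned into a cost estimate. This is a minimal sketch: `estimate_chat_cost` is a hypothetical helper, and the per-million-token prices passed to it are placeholders, not actual ModelMax or provider rates.

```python
from decimal import Decimal


def estimate_chat_cost(usage, input_price_per_million, output_price_per_million):
    """Estimate the cost of one chat completion from its usage object.

    Prices are per one million tokens and are supplied by the caller;
    the values used below are placeholders, not real rates.
    """
    million = Decimal(1_000_000)
    prompt_cost = Decimal(usage["prompt_tokens"]) * Decimal(input_price_per_million) / million
    completion_cost = Decimal(usage["completion_tokens"]) * Decimal(output_price_per_million) / million
    return prompt_cost + completion_cost


# The usage object from the response above.
usage = {"prompt_tokens": 150, "completion_tokens": 89, "total_tokens": 239}

# Placeholder prices: 3.00 per million input tokens, 15.00 per million output tokens.
cost = estimate_chat_cost(usage, "3.00", "15.00")
print(cost)
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request amounts are summed.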
Image generation
Charged per image generated, based on the provider's pricing for the requested size and quality.
Video generation
Charged based on the generated video's properties:
| Factor | Description |
|---|---|
| Duration | Number of seconds (e.g., 8s) |
| Resolution | 720p, 1080p, or 4k |
| Audio | Whether an audio track is included |
Billing is triggered only when you call the Queue Result endpoint and the status is COMPLETED. The lightweight status endpoint does not trigger billing.
{
  "usage": {
    "video_seconds": 8,
    "video_resolution": "720p",
    "video_has_audio": true
  }
}
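Because only a Queue Result call on a COMPLETED job triggers billing, a client can poll the lightweight status endpoint freely and fetch the result once. The sketch below assumes two hypothetical callables, `get_status` and `get_result`, that wrap those two endpoints; their URLs and response shapes are not specified here, so they are left to the caller.

```python
import time


def wait_for_video(get_status, get_result, poll_interval=5.0, max_polls=120):
    """Poll the status endpoint, then fetch the result exactly once.

    get_status: hypothetical callable returning the job status string
                (wraps the lightweight status endpoint, which is not billed).
    get_result: hypothetical callable returning the Queue Result payload
                (billing is triggered by this call once status is COMPLETED).
    """
    for _ in range(max_polls):
        if get_status() == "COMPLETED":
            return get_result()
        time.sleep(poll_interval)
    raise TimeoutError("video job did not complete within the polling budget")
```

A production client would also stop early on whatever failure statuses the API reports; those are omitted here because they are not listed in this section.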
Balance check
Before every request, ModelMax checks your balance. If the balance is insufficient, the request is rejected with HTTP 402 Payment Required:
{
  "error": {
    "message": "insufficient balance",
    "type": "insufficient_balance"
  }
}
No charges are incurred for rejected requests.
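On the client side, this rejection can be detected from the status code and the error type. The following is a minimal sketch that works on a raw status code and body string, so it is independent of any particular HTTP library; `InsufficientBalanceError` and `check_balance_error` are hypothetical names, not part of a ModelMax SDK.

```python
import json


class InsufficientBalanceError(Exception):
    """Raised when a request is rejected for insufficient balance."""


def check_balance_error(status_code, body_text):
    """Raise InsufficientBalanceError for a 402 insufficient-balance response.

    Returns None for any other response, including a 402 whose body
    does not carry the expected error type.
    """
    if status_code != 402:
        return
    try:
        payload = json.loads(body_text)
    except ValueError:
        # Body was not valid JSON; leave handling to the caller.
        return
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict) and error.get("type") == "insufficient_balance":
        raise InsufficientBalanceError(error.get("message", "insufficient balance"))
```

A caller might catch `InsufficientBalanceError` to prompt for a top-up instead of retrying, since retrying will fail until the balance is replenished.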
Monitoring usage
View your usage breakdown in the Dashboard:
- Dashboard → Usage — Daily usage by model
- Dashboard → Stats — Spending summary and balance
Usage data is aggregated daily per model.
