Billing

ModelMax charges for actual usage, with no subscriptions and no minimums. You pay only for what you use, with zero markup on inference costs.

How billing works

1. Top up your account balance via the Dashboard.
2. Use the API. Each request deducts from your balance.
3. Monitor usage in the Dashboard under Usage.

Pricing model

Chat completions

Charged per token, matching the upstream provider's pricing. The charge is applied as follows:

Non-streaming: After the full response is generated.
Streaming: After the stream completes (final chunk with usage data).
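
For streaming requests, a client that wants to track its own spend can pick up the usage data from the final chunk. The following is a minimal sketch in Python, assuming the stream has already been parsed into dictionaries and that only the final chunk carries a non-empty `usage` key. The chunk shape and the helper name `usage_from_stream` are assumptions for illustration, not a documented ModelMax format.

```python
from typing import Iterable, Optional


def usage_from_stream(chunks: Iterable[dict]) -> Optional[dict]:
    """Return the usage object from the last chunk that carries one, if any."""
    usage = None
    for chunk in chunks:
        # Assumed shape: intermediate chunks omit "usage" or set it to None.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Simulated stream: two content chunks, then a final chunk with usage data.
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}], "usage": None},
    {
        "choices": [],
        "usage": {"prompt_tokens": 150, "completion_tokens": 89, "total_tokens": 239},
    },
]
print(usage_from_stream(chunks))
```

If the stream is cut off before the final chunk arrives, this helper returns `None`, so the caller can tell that no usage data was received.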

The usage object in the response shows the exact token breakdown:

{
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 89,
    "total_tokens": 239
  }
}
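
To see how these token counts translate into a charge, here is a minimal sketch in Python. The per-million-token prices are placeholders chosen for illustration, not ModelMax or provider rates, and `estimate_cost` is a hypothetical helper. Real prices depend on the upstream provider and model.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens (illustration only).
PROMPT_PRICE_PER_M = Decimal("3.00")
COMPLETION_PRICE_PER_M = Decimal("15.00")


def estimate_cost(usage: dict) -> Decimal:
    """Estimate the charge for one request from its usage object."""
    million = Decimal(1_000_000)
    prompt_cost = Decimal(usage["prompt_tokens"]) * PROMPT_PRICE_PER_M / million
    completion_cost = (
        Decimal(usage["completion_tokens"]) * COMPLETION_PRICE_PER_M / million
    )
    return prompt_cost + completion_cost


usage = {"prompt_tokens": 150, "completion_tokens": 89, "total_tokens": 239}
print(estimate_cost(usage))
```

`Decimal` is used instead of floating point so that very small per-request amounts add up without rounding drift when summed over many requests.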

Image generation

Charged per image generated, based on the provider's pricing for the requested size and quality.

Video generation

Charged based on the generated video's properties:

Factor	Description
Duration	Number of seconds (e.g., 8s)
Resolution	`720p`, `1080p`, or `4k`
Audio	Whether audio track is included

Billing is triggered only when you call the Queue Result endpoint and the status is COMPLETED. The lightweight status endpoint does not trigger billing.

{
  "usage": {
    "video_seconds": 8,
    "video_resolution": "720p",
    "video_has_audio": true
  }
}
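
Because billing is tied to the Queue Result call, a client would typically poll the status endpoint and fetch the result only once the job reports COMPLETED. The sketch below shows that flow in Python with the two HTTP calls passed in as functions, so no endpoint paths are assumed. The helper name `wait_for_video`, the `FAILED` status, and the intermediate status names in the example are assumptions. Only `COMPLETED` comes from the text above.

```python
import time
from typing import Callable


def wait_for_video(
    get_status: Callable[[], str],
    get_result: Callable[[], dict],
    poll_interval: float = 5.0,
    max_polls: int = 120,
) -> dict:
    """Poll the status call, then fetch the result once the job is COMPLETED."""
    for _ in range(max_polls):
        status = get_status()  # status checks do not trigger billing
        if status == "COMPLETED":
            return get_result()  # billing is triggered by this call
        if status == "FAILED":  # assumed name for a terminal failure status
            raise RuntimeError("video generation failed")
        time.sleep(poll_interval)
    raise TimeoutError("video generation did not complete in time")


# Example with stand-in functions instead of real HTTP calls.
statuses = iter(["IN_QUEUE", "IN_PROGRESS", "COMPLETED"])  # assumed status names
result = wait_for_video(
    get_status=lambda: next(statuses),
    get_result=lambda: {
        "usage": {
            "video_seconds": 8,
            "video_resolution": "720p",
            "video_has_audio": True,
        }
    },
    poll_interval=0,
)
print(result["usage"])
```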

Balance check

Before every request, ModelMax checks your balance. If the balance is insufficient, the request is rejected with:

HTTP 402 Payment Required

{
  "error": {
    "message": "insufficient balance",
    "type": "insufficient_balance"
  }
}

No charges are incurred for rejected requests.
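
A client can detect this case and prompt for a top-up instead of retrying. The following is a minimal sketch in Python that works on an already-received status code and parsed JSON body. The exception class and the helper `check_billing_error` are hypothetical names, not part of any ModelMax SDK.

```python
class InsufficientBalanceError(Exception):
    """Raised when a request is rejected because the balance is too low."""


def check_billing_error(status_code: int, body: dict) -> None:
    """Raise InsufficientBalanceError for an HTTP 402 insufficient-balance response."""
    error = body.get("error", {}) if isinstance(body, dict) else {}
    if status_code == 402 and error.get("type") == "insufficient_balance":
        raise InsufficientBalanceError(error.get("message", "insufficient balance"))


# Example using the error body shown above.
body = {"error": {"message": "insufficient balance", "type": "insufficient_balance"}}
try:
    check_billing_error(402, body)
except InsufficientBalanceError as exc:
    print(f"Top up required: {exc}")
```

Checking the `type` field as well as the status code keeps the handler from misreading an unrelated 402 that a proxy or gateway might return.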

Monitoring usage

View your usage breakdown in the Dashboard:

Dashboard → Usage: daily usage by model
Dashboard → Stats: spending summary and balance

Usage data is aggregated daily per model.