Billing Explanation
AI inference is billed by the number of input and output tokens. Subscribing to a Plan gives you more favorable prices.
To learn what a token is, see our documentation "What is token".
Pricing Details
🎉 The AI inference service is currently in a trial period. The DeepSeek R1 and V3 models share a monthly free quota of 1 million tokens, which resets at the end of each month. Usage beyond this quota is billed at the prices listed in the table below.
| Model ID | Input Price (USD/1K Tokens) | Output Price (USD/1K Tokens) |
|---|---|---|
| deepseek-r1 | 0.00058 | 0.00229 |
| deepseek-v3 | 0.00029 | 0.00115 |
| deepseek-r1-32b | 0.00022 | 0.00086 |
| deepseek-r1-distill-32b | 0.00022 | 0.00086 |
| qwq-plus | 0.00029 | 0.00086 |
| qwq-32b | 0.00029 | 0.00086 |
| qwen-max-2025-01-25 | 0.00035 | 0.00138 |
| qwen2.5-72b-instruct | 0.00058 | 0.00172 |
| qwen2-72b-instruct | 0.00058 | 0.00172 |
| qwen2-vl-72b-instruct | 0.00229 | 0.00686 |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | Time-limited Free | Time-limited Free |
| nvidia/llama-3.3-nemotron-super-49b-v1 | Time-limited Free | Time-limited Free |
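
Because prices are quoted per 1K tokens, the charge for a request is roughly `input_tokens / 1000 × input price + output_tokens / 1000 × output price`. The sketch below illustrates that arithmetic for a few rows of the table. The names `PRICES_PER_1K` and `estimate_cost` are hypothetical helpers, not part of any official SDK, and the sketch deliberately ignores the shared free quota and any Plan discount, since how those are applied is not specified here.

```python
from decimal import Decimal

# USD per 1K tokens as (input price, output price), copied from a few rows
# of the table above. This dictionary is an illustrative helper, not an API.
PRICES_PER_1K = {
    "deepseek-r1": (Decimal("0.00058"), Decimal("0.00229")),
    "deepseek-v3": (Decimal("0.00029"), Decimal("0.00115")),
    "qwq-32b": (Decimal("0.00029"), Decimal("0.00086")),
    "qwen2-vl-72b-instruct": (Decimal("0.00229"), Decimal("0.00686")),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the list-price cost in USD for one request.

    Assumption: no free quota and no Plan discount are applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_price, output_price = PRICES_PER_1K[model_id]
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    # Prices are per 1K tokens, so scale the total down by 1000.
    return total / Decimal(1000)


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens on deepseek-r1.
    print(estimate_cost("deepseek-r1", 10_000, 2_000))
```

For example, 10,000 input tokens and 2,000 output tokens on deepseek-r1 come to about 0.01038 USD at list price (0.0058 for input plus 0.00458 for output). `Decimal` is used instead of floats so that small per-token prices add up without rounding drift.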