Billing Explanation
AI inference is billed based on the number of input and output tokens, and you can subscribe to our Plan to get more favorable prices.
To learn more about what a token is, please refer to our documentation, What is token.
Pricing Details
🎉 The AI inference service is currently in its trial period. The DeepSeek R1 and V3 models share a monthly free quota of 1 million tokens, which resets at the end of each month. Any usage beyond this quota is billed at the prices listed in the table below.
Model ID | Input Price (USD/1K Tokens) | Output Price (USD/1K Tokens) |
---|---|---|
deepseek-r1 | 0.00058 | 0.00229 |
deepseek-v3 | 0.00029 | 0.00115 |
deepseek-r1-32b | 0.00022 | 0.00086 |
deepseek-r1-distill-32b | 0.00022 | 0.00086 |
qwq-plus | 0.00029 | 0.00086 |
qwq-32b | 0.00029 | 0.00086 |
qwen-max-2025-01-25 | 0.00035 | 0.00138 |
qwen2.5-72b-instruct | 0.00058 | 0.00172 |
qwen2-72b-instruct | 0.00058 | 0.00172 |
qwen2-vl-72b-instruct | 0.00229 | 0.00686 |
nvidia/llama-3.1-nemotron-ultra-253b-v1 | Time-limited Free | Time-limited Free |
nvidia/llama-3.3-nemotron-super-49b-v1 | Time-limited Free | Time-limited Free |
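
As a rough illustration of how the per-1K-token prices in the table above translate into a charge, the following Python sketch multiplies input and output token counts by their respective prices. The function name `estimate_cost` and the `PRICES_PER_1K` dictionary are hypothetical helpers, not part of any official SDK. The sketch covers only a few of the listed models and, as a simplifying assumption, ignores the free quota and any Plan discounts.

```python
from decimal import Decimal

# Prices in USD per 1K tokens, copied from the pricing table (subset only).
# Each entry is (input price, output price).
PRICES_PER_1K = {
    "deepseek-r1": (Decimal("0.00058"), Decimal("0.00229")),
    "deepseek-v3": (Decimal("0.00029"), Decimal("0.00115")),
    "qwq-32b": (Decimal("0.00029"), Decimal("0.00086")),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the charge in USD for one request.

    Assumption: no free quota and no Plan discount are applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_price, output_price = PRICES_PER_1K[model_id]
    # Prices are quoted per 1,000 tokens, so divide the total by 1000.
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / Decimal(1000)
```

For example, `estimate_cost("deepseek-r1", 10_000, 2_000)` evaluates to 0.01038 USD: 10 × 0.00058 for the input plus 2 × 0.00229 for the output. `Decimal` is used instead of `float` to avoid rounding drift on such small per-token amounts.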