Billing Explanation

AI inference is billed by the number of input and output tokens. Subscribing to a Plan gives you lower prices. To learn what a token is, see our documentation, What is token.

Pricing Details

🎉 The AI inference service is currently in a trial period. The DeepSeek R1 and V3 models share a monthly free quota of 1 million tokens, which resets at the end of each month. Usage beyond this quota is billed at the prices listed in the table below.

Model ID	Input Price (USD/1K Tokens)	Output Price (USD/1K Tokens)
deepseek-r1	0.00058	0.00229
deepseek-v3	0.00029	0.00115
deepseek-r1-32b	0.00022	0.00086
deepseek-r1-distill-32b	0.00022	0.00086
qwq-plus	0.00029	0.00086
qwq-32b	0.00029	0.00086
qwen-max-2025-01-25	0.00035	0.00138
qwen2.5-72b-instruct	0.00058	0.00172
qwen2-72b-instruct	0.00058	0.00172
qwen2-vl-72b-instruct	0.00229	0.00686
nvidia/llama-3.1-nemotron-ultra-253b-v1	Time-limited Free	Time-limited Free
nvidia/llama-3.3-nemotron-super-49b-v1	Time-limited Free	Time-limited Free
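
As a rough illustration of how the per-1K-token prices turn into a charge, the sketch below multiplies token counts by the listed rates. The function name estimate_cost and the dictionary PRICES_PER_1K are hypothetical, only a few models from the table are included, and the calculation deliberately ignores the free quota, any Plan discount, and whatever rounding the actual invoice applies.

```python
from decimal import Decimal

# Prices in USD per 1K tokens as (input, output), copied from the table above.
# Only a subset of models is listed here for brevity.
PRICES_PER_1K = {
    "deepseek-r1": (Decimal("0.00058"), Decimal("0.00229")),
    "deepseek-v3": (Decimal("0.00029"), Decimal("0.00115")),
    "qwq-32b": (Decimal("0.00029"), Decimal("0.00086")),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the charge in USD for one model.

    Assumption: no free quota, Plan discount, or invoice rounding is applied.
    """
    input_price, output_price = PRICES_PER_1K[model_id]
    # Prices are quoted per 1,000 tokens, so divide the total by 1000.
    return (input_price * input_tokens + output_price * output_tokens) / 1000


if __name__ == "__main__":
    cost = estimate_cost("deepseek-v3", input_tokens=200_000, output_tokens=50_000)
    print(f"Estimated cost: {cost} USD")
```

Under these assumptions, 200,000 input tokens and 50,000 output tokens on deepseek-v3 come to 0.058 + 0.0575 = 0.1155 USD. Decimal is used instead of float so that the small per-token prices do not pick up binary rounding error.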