Quick Start
This guide will help you configure and use Sufy's AI inference service in just a few minutes. By following a few simple steps, you can obtain an API Token and start sending inference requests.
Prerequisites
Before you begin, ensure that:
- You have registered and logged into your Sufy account
- You have obtained your AccessKey
Using the API Token Service
Step 1: Obtain Your API Token
In this step, you exchange the AccessKey obtained in the prerequisites for an API Token via an API call. The API Token is used for authentication in inference API requests. Keep it secure: do not share your API Token with others or expose it in browsers or other client-side code.
curl https://api.qnaigc.com/api/llmapikey -H "Authorization: <your access key>"
# output
{"api_key":"sk-xxxxx","old_key":"sk-xxxxx","status":true}%
Step 2: Perform Inference Testing
The following examples use Python code for demonstration.
Streaming Call
from openai import OpenAI

url = 'https://api.qnaigc.com/v1/'
llm_api_key = 'your llm_api_key'

client = OpenAI(
    base_url=url,
    api_key=llm_api_key
)

# Send a request with streaming output
content = ""
messages = [
    {"role": "user", "content": "What scenarios can Sufy GPU cloud products be used for?"}
]
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=messages,
    stream=True,  # Enable streaming output
    max_tokens=4096
)

# Receive the response chunk by chunk and accumulate it
for chunk in response:
    if chunk.choices[0].delta.content:
        content += chunk.choices[0].delta.content
print(content)

# Round 2: append the assistant's reply, then ask a follow-up question
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Continue"})
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=messages,
    stream=True
)

content = ""  # Reset the accumulator so only the second reply is printed
for chunk in response:
    if chunk.choices[0].delta.content:
        content += chunk.choices[0].delta.content
print(content)
Non-Streaming Call
from openai import OpenAI

url = 'https://api.qnaigc.com/v1/'
llm_api_key = 'your llm_api_key'

client = OpenAI(
    base_url=url,
    api_key=llm_api_key
)

# Send a request for non-streaming output
messages = [
    {"role": "user", "content": "What scenarios can Sufy GPU cloud products be used for?"}
]
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=messages,
    stream=False,
    max_tokens=4096
)
content = response.choices[0].message.content
print(content)

# Round 2: append the assistant's reply, then ask a follow-up question
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Continue"})
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=messages,
    stream=False
)
content = response.choices[0].message.content
print(content)
Step 3: Try the Internet Access Feature
The API supports internet access. To maintain compatibility with the OpenAI interface, the internet access feature is enabled by appending ?search to the model name. For example:
# Chat completion request with internet access enabled
export LLM_API_KEY="<your LLM API KEY>"
curl https://api.qnaigc.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $LLM_API_KEY" \
-d '{
"messages": [{"role": "user", "content": "What scenarios can Sufy GPU cloud products be used for?"}],
"model": "deepseek-v3?search"
}'
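The same request can be made with the OpenAI Python SDK from Step 2; only the model name changes. A minimal sketch, reusing the client configuration shown earlier and assuming the ?search suffix is passed through to the API unchanged:

from openai import OpenAI

client = OpenAI(
    base_url='https://api.qnaigc.com/v1/',
    api_key='your llm_api_key'
)

# Appending ?search to the model name enables internet access
response = client.chat.completions.create(
    model="deepseek-v3?search",
    messages=[
        {"role": "user", "content": "What scenarios can Sufy GPU cloud products be used for?"}
    ]
)
print(response.choices[0].message.content)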
Step 4: Add Image Recognition
The API now supports image content recognition. Example code is as follows:
# Chat completion request with an image input
export LLM_API_KEY="<your LLM API KEY>"
curl https://api.qnaigc.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $LLM_API_KEY" \
-d '{
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Please help me organize the content in the image and prepare a report for me."
},
{
"type": "image_url",
"image_url": {
"url": "https://your_image_url"
}
}
]
}
],
"model": "deepseek-v3"
}'
- Note: The file that image_url points to must not exceed 8 MB.
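As in the previous step, this curl request maps directly onto the OpenAI Python SDK. A minimal sketch, assuming the same client configuration as in Step 2 and a placeholder image URL:

from openai import OpenAI

client = OpenAI(
    base_url='https://api.qnaigc.com/v1/',
    api_key='your llm_api_key'
)

# Mix text and image parts in a single user message
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Please help me organize the content in the image and prepare a report for me."
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://your_image_url"}  # file must not exceed 8 MB
                }
            ]
        }
    ]
)
print(response.choices[0].message.content)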