AI on demand: Qwen/Qwen3.5-35B-A3B-FP8
Calling the model
# Set your personal key:
STONEY_KEY=sk-...
# Set the desired model:
MODEL=Qwen/Qwen3.5-35B-A3B-FP8
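# Set your prompt (it is pasted into the JSON body verbatim, so avoid unescaped double quotes, backslashes, and newlines):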
PROMPT='Hello.'
# Set the maximum number of tokens to generate:
MAX_TOKENS=100
curl https://llm.stoney-cloud.com/v1/chat/completions \
  --silent --fail --show-error \
  --header "Authorization: Bearer $STONEY_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "'"$MODEL"'",
    "messages": [
      {"role": "user", "content": "'"$PROMPT"'"}
    ],
    "max_tokens": '"$MAX_TOKENS"'
  }' \
  | jq
Example output (the reply is cut off at the 100-token limit, as "finish_reason": "length" indicates):
{
  "id": "chatcmpl-a8634d242bc04923",
  "object": "chat.completion",
  "created": 1778156754,
  "model": "Qwen/Qwen3.5-35B-A3B-FP8",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Thinking Process:\n\n1. **Analyze the Input:**\n * Input: \"Hello.\"\n * Context: This is a greeting.\n * Intent: The user is initiating a conversation.\n * Tone: Friendly, polite, neutral.\n\n2. **Determine the Appropriate Response:**\n * Acknowledge the greeting.\n Offer assistance.\n Keep it friendly and open-ended.\n\n3. **Drafting Options:**\n",
        "refusal": null,
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": [],
        "reasoning": null
      },
      "logprobs": null,
      "finish_reason": "length",
      "stop_reason": null,
      "token_ids": null
    }
  ],
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "prompt_tokens": 12,
    "total_tokens": 112,
    "completion_tokens": 100,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null,
  "prompt_token_ids": null,
  "kv_transfer_params": null
}
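Reading the response:
In a script you usually want only a few fields of this response rather than the whole document. The following is a minimal sketch, assuming jq is installed and the response has the shape shown in the example output. RESPONSE is a hand-abbreviated copy of that output, and the variable names CONTENT, FINISH_REASON and COMPLETION_TOKENS are illustrative choices, not part of the service.

```shell
# Abbreviated copy of the example output (only the fields used below).
# In practice this would be the body returned by the curl call.
RESPONSE='{
  "choices": [
    {
      "message": {"role": "assistant", "content": "Thinking Process:\n\n1. **Analyze the Input:**"},
      "finish_reason": "length"
    }
  ],
  "usage": {"prompt_tokens": 12, "total_tokens": 112, "completion_tokens": 100}
}'

# Extract the reply text, the reason generation stopped, and the tokens generated.
CONTENT=$(printf '%s' "$RESPONSE" | jq --raw-output '.choices[0].message.content')
FINISH_REASON=$(printf '%s' "$RESPONSE" | jq --raw-output '.choices[0].finish_reason')
COMPLETION_TOKENS=$(printf '%s' "$RESPONSE" | jq '.usage.completion_tokens')

# Print the reply itself.
printf '%s\n' "$CONTENT"

# "length" means the reply hit the max_tokens limit and is incomplete.
if [ "$FINISH_REASON" = "length" ]; then
  echo "Reply was cut off after $COMPLETION_TOKENS tokens; consider raising MAX_TOKENS." >&2
fi
```

With the example output, the check fires because all 100 allowed tokens were spent on the model's thinking text before it reached an answer, so a larger MAX_TOKENS value is likely needed for this model.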