curl api.using.ai
Docs
One endpoint. One key. Every model -- including thinking and agentic variants.
Quickstart
- Grab your API key from the dashboard at app.using.ai/keys.
- Export it as an environment variable: export USING_AI_KEY=sk-...
- Send your first request to https://api.using.ai/v1/chat/completions.
That's it -- no SDK install required. Official JS and Python clients are optional wrappers around the same REST endpoint.
export USING_AI_KEY=sk-live-...
curl https://api.using.ai/v1/chat/completions \
-H "Authorization: Bearer $USING_AI_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "haiku-4.5-agentic",
"messages": [{"role": "user", "content": "Say hello in one word."}]
}'
Authentication
Every request needs an Authorization: Bearer header carrying your API key. Generate keys from the dashboard at app.using.ai/keys -- each key is scoped to your account's tier, so a Standard key can't call SuperMaxes models.
Keys don't expire, but you can revoke and rotate them anytime. Never ship a key inside client-side code -- proxy requests through your own backend.
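One way to follow the advice above is a small server-side function that attaches the Authorization header before forwarding the client's request body. This is only a sketch: it assumes a runtime with the standard fetch API (Node 18+ or similar), and buildUpstreamInit and proxyChat are hypothetical names, not part of any official client.

```javascript
// Hypothetical helper names; only the endpoint URL and the header format
// come from the docs above.
const UPSTREAM_URL = "https://api.using.ai/v1/chat/completions";

// Build the fetch options for the upstream call. The key is attached here,
// on the server, so it never reaches the browser.
function buildUpstreamInit(bodyText, apiKey) {
  if (!apiKey) {
    throw new Error("USING_AI_KEY is not set");
  }
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: bodyText,
  };
}

// Forward a client's JSON body upstream and return the upstream response as-is.
// `fetchImpl` is injectable so the function can be exercised without network access.
async function proxyChat(bodyText, apiKey = process.env.USING_AI_KEY, fetchImpl = fetch) {
  return fetchImpl(UPSTREAM_URL, buildUpstreamInit(bodyText, apiKey));
}
```

Your own route handler would call proxyChat with the body it received from the browser. In a real deployment it would likely also validate that body and authenticate the caller first.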
curl https://api.using.ai/v1/chat/completions \
-H "Authorization: Bearer sk-live-xxxxxxxx" \
-H "Content-Type: application/json"
Chat completions
All 15 models share one request shape. POST to /v1/chat/completions with a model field and a messages array -- swap the model name, everything else stays the same.
Responses return a choices array with the generated message, plus usage token counts for billing.
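Because the request shape is shared, a pair of small helpers can cover every model. The sketch below uses hypothetical names (buildChatBody, readChatResponse) and assumes the reply text sits at choices[0].message.content. The docs above only confirm that choices holds the generated message, so treat the exact path as an assumption.

```javascript
// Hypothetical helpers; the response layout beyond `choices` and `usage`
// is an assumption.

// Build a request body. Only `model` changes between models; extra options
// such as temperature are merged in unchanged.
function buildChatBody(model, messages, options = {}) {
  return JSON.stringify({ model, messages, ...options });
}

// Pull the generated text and the usage counts out of a parsed response.
// Assumes the first choice carries the reply as `message.content`.
function readChatResponse(response) {
  const choice = response.choices && response.choices[0];
  if (!choice || !choice.message) {
    throw new Error("response has no choices[0].message");
  }
  return { text: choice.message.content, usage: response.usage ?? null };
}
```

Switching models then means changing only the first argument to buildChatBody, with the messages and options left untouched.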
{
"model": "deepseek-r2",
"messages": [
{ "role": "system", "content": "You are a terse assistant." },
{ "role": "user", "content": "What's the time complexity of quicksort?" }
],
"temperature": 0.3
}
Streaming
Set "stream": true to get server-sent events instead of a single JSON blob. Each event is a partial delta -- concatenate delta.content chunks as they arrive to render tokens as they're generated.
The stream ends with a data: [DONE] event.
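The parsing side of a stream can be sketched as a pure function over the text decoded so far. consumeSseBuffer is a hypothetical name, and the function assumes events are separated by a blank line and that each data: payload carries its partial text at choices[0].delta.content. The docs above name delta.content but not the full path.

```javascript
// Hypothetical helper. Assumes each event is a "data: {...}" line whose JSON
// carries the partial text at choices[0].delta.content.

// Consume complete events from `buffer`; return the text they produced, the
// unparsed remainder (an event may be split across chunks), and a done flag.
function consumeSseBuffer(buffer) {
  const parts = buffer.split("\n\n");
  const rest = parts.pop(); // last piece is incomplete until its "\n\n" arrives
  let text = "";
  let done = false;
  for (const part of parts) {
    for (const line of part.split("\n")) {
      if (!line.startsWith("data:")) continue;
      const payload = line.slice(5).trim();
      if (payload === "[DONE]") {
        done = true;
        break;
      }
      const delta = JSON.parse(payload).choices?.[0]?.delta;
      text += delta?.content ?? "";
    }
    if (done) break;
  }
  return { text, rest, done };
}
```

In a read loop, each decoded chunk would be appended to the previous rest before calling the function again, stopping once done is true.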
const res = await fetch("https://api.using.ai/v1/chat/completions", {
method: "POST",
headers: {
Authorization: `Bearer ${process.env.USING_AI_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "sonnet-4.5-thinking",
messages: [{ role: "user", content: "Stream me a haiku." }],
stream: true,
}),
});
const reader = res.body.getReader();
// read chunks, split on \n\n, parse each "data: {...}" line
Rate limits
Rate limits are tied to your subscription tier: 60 req/min on Standard, 120 on Donator, 300 on SuperMaxes. Every response includes X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers so you can back off before hitting a 429.
See /pricing for the full breakdown of limits per tier.
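A client can read these headers after each response and pause once the window is used up. The sketch below uses a hypothetical waitFromRateLimitHeaders helper and assumes X-RateLimit-Reset is the number of seconds until the window resets. The docs above do not state the unit, so check it before relying on this.

```javascript
// Hypothetical helper. Assumes X-RateLimit-Reset is the number of seconds
// until the window resets; the docs above do not state the unit.

// Return how many milliseconds to wait before the next request: 0 while
// requests remain in the window, otherwise the time until the reset.
// `headers` is anything with a get(name) method, such as fetch's res.headers.
function waitFromRateLimitHeaders(headers) {
  const rawRemaining = headers.get("X-RateLimit-Remaining");
  const rawReset = headers.get("X-RateLimit-Reset");
  if (rawRemaining == null || rawReset == null) {
    return 0; // headers missing: do not throttle
  }
  const remaining = Number(rawRemaining);
  const resetSeconds = Number(rawReset);
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSeconds)) {
    return 0; // headers unparseable: do not throttle
  }
  return remaining > 0 ? 0 : Math.max(0, resetSeconds) * 1000;
}
```

Passing res.headers from a fetch response works directly, since its get method matches header names case-insensitively.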
< HTTP/1.1 200 OK
< X-RateLimit-Limit: 120
< X-RateLimit-Remaining: 117
< X-RateLimit-Reset: 42
Errors
Errors return a standard JSON body with an error.type and error.message. Common codes: 401 invalid or missing key, 403 model not included in your tier, 429 rate limit exceeded, 500 upstream model provider error.
Retry 429s and 500s with exponential backoff. Treat 401/403 as non-retryable.
{
"error": {
"type": "tier_forbidden",
"message": "opus-4.5 requires the SuperMaxes tier. Upgrade at using.ai/pricing."
}
}
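The retry policy above can be sketched as follows. The helper names, the 500 ms base delay, the 8 second cap, and the four-attempt limit are all illustrative assumptions rather than values the API prescribes.

```javascript
// Hypothetical helpers illustrating the retry policy described above.

// 429 and 500 are worth retrying; 401 and 403 will fail the same way again.
function isRetryable(status) {
  return status === 429 || status === 500;
}

// Exponential backoff: base, 2x base, 4x base, ... capped at `capMs`.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Call `send` (any function returning a fetch-style response) until it
// returns a non-retryable status or the attempts run out.
async function sendWithRetry(send, maxAttempts = 4) {
  let res;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    res = await send();
    if (!isRetryable(res.status) || attempt === maxAttempts - 1) break;
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
  return res;
}
```

The caller still inspects the returned response, since a 401, a 403, or a final failed retry is handed back rather than thrown. Network failures that make send throw are not retried in this sketch, and adding random jitter to the delay is a common refinement.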