using.ai

curl api.using.ai

Docs

One endpoint. One key. Every model -- including thinking and agentic variants.

Quickstart

  1. Grab your API key from the dashboard at app.using.ai/keys.
  2. Export it as an environment variable (export USING_AI_KEY=sk-live-...).
  3. Send your first request to https://api.using.ai/v1/chat/completions.

That's it -- no SDK install required. Official JS and Python clients are optional wrappers around the same REST endpoint.

bash
export USING_AI_KEY=sk-live-...

curl https://api.using.ai/v1/chat/completions \
  -H "Authorization: Bearer $USING_AI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "haiku-4.5-agentic",
    "messages": [{"role": "user", "content": "Say hello in one word."}]
  }'
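If you'd rather make that first call from Node than from curl, the same request could be sketched like this. It assumes Node 18+ (for the global fetch); buildChatRequest and sendChat are illustrative names, not part of the official JS client.

```javascript
// Illustrative helper (not the official client): builds the same request as the curl example.
function buildChatRequest(apiKey, model, messages) {
  return {
    url: "https://api.using.ai/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Sends the request and resolves with the parsed JSON body.
// Assumes Node 18+ so that fetch is available globally.
async function sendChat(apiKey, model, messages) {
  const { url, options } = buildChatRequest(apiKey, model, messages);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}
```

To try it, call sendChat(process.env.USING_AI_KEY, "haiku-4.5-agentic", [{ role: "user", content: "Say hello in one word." }]) and log the resolved value.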

Authentication

Every request needs an Authorization: Bearer header carrying your API key. Generate keys from the dashboard at app.using.ai/keys -- each key is scoped to your account's tier, so a Standard key can't call SuperMaxes models.

Keys don't expire, but you can revoke and rotate them anytime. Never ship a key inside client-side code -- proxy requests through your own backend.

bash
curl https://api.using.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-live-xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"model": "haiku-4.5-agentic", "messages": [{"role": "user", "content": "ping"}]}'
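As a rough sketch of the "proxy through your own backend" advice above: the browser sends its chat payload to your server, and your server attaches the key before forwarding. The function names and the field allowlist here are assumptions for illustration, and the code assumes Node 18+ for the global fetch.

```javascript
// Fields the browser is allowed to set; everything else is dropped. (Illustrative allowlist.)
const ALLOWED_FIELDS = ["model", "messages", "temperature", "stream"];

// Builds the upstream request on the server, where the key lives.
function buildUpstreamRequest(clientBody, apiKey) {
  const body = {};
  for (const field of ALLOWED_FIELDS) {
    if (clientBody[field] !== undefined) body[field] = clientBody[field];
  }
  return {
    url: "https://api.using.ai/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Call this from your own route handler; the key is read server-side only.
async function proxyChat(clientBody) {
  const { url, options } = buildUpstreamRequest(clientBody, process.env.USING_AI_KEY);
  const upstream = await fetch(url, options);
  return { status: upstream.status, body: await upstream.json() };
}
```

The allowlist means a client can't smuggle extra fields into the upstream call. Adjust it to whatever parameters you actually want to expose.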

Chat completions

All 15 models share one request shape. POST to /v1/chat/completions with a model field and a messages array -- swap the model name, everything else stays the same.

Responses return a choices array with the generated message, plus usage token counts for billing.

json
{
  "model": "deepseek-r2",
  "messages": [
    { "role": "system", "content": "You are a terse assistant." },
    { "role": "user", "content": "What's the time complexity of quicksort?" }
  ],
  "temperature": 0.3
}
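Reading the reply back out might look like the sketch below. The exact response fields aren't spelled out here, so this assumes an OpenAI-style layout where the text sits at choices[0].message.content; extractReply and the sample object are hypothetical.

```javascript
// Pulls the reply text and usage counts out of a chat completion response.
// Assumption: choices[0].message.content holds the text (OpenAI-style layout).
function extractReply(response) {
  const first = response.choices && response.choices[0];
  if (!first || !first.message) {
    throw new Error("Response has no choices[0].message");
  }
  return { text: first.message.content, usage: response.usage ?? null };
}

// Hypothetical response, shaped per the assumption above.
const sample = {
  choices: [{ message: { role: "assistant", content: "O(n log n) on average." } }],
  usage: { total_tokens: 42 },
};
console.log(extractReply(sample).text); // prints "O(n log n) on average."
```

The usage object is passed through untouched, so whatever token counts the API reports are available for your own billing checks.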

Streaming

Set "stream": true to get server-sent events instead of a single JSON blob. Each event is a partial delta -- concatenate delta.content chunks as they arrive to render tokens as they're generated.

The stream ends with a data: [DONE] event.

javascript
const res = await fetch("https://api.using.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.USING_AI_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "sonnet-4.5-thinking",
    messages: [{ role: "user", content: "Stream me a haiku." }],
    stream: true,
  }),
});

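const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Events are separated by a blank line; keep the incomplete tail buffered.
  const events = buffer.split("\n\n");
  buffer = events.pop();
  for (const event of events) {
    if (!event.startsWith("data: ") || event === "data: [DONE]") continue;
    // Assumes each chunk carries its text at choices[0].delta.content.
    const chunk = JSON.parse(event.slice("data: ".length));
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }
}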

Rate limits

Rate limits are tied to your subscription tier: 60 req/min on Standard, 120 on Donator, 300 on SuperMaxes. Every response includes X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers so you can back off before hitting a 429.

See /pricing for the full breakdown of limits per tier.

http
< HTTP/1.1 200 OK
< X-RateLimit-Limit: 120
< X-RateLimit-Remaining: 117
< X-RateLimit-Reset: 42
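One way to use those headers to back off before a 429 is sketched below. The unit of X-RateLimit-Reset isn't stated here, so the code assumes it means seconds until the window resets; msToWait is a hypothetical helper name.

```javascript
// Returns how long to pause (in ms) before the next request, or 0 if quota remains.
// Assumption: X-RateLimit-Reset is the number of seconds until the window resets.
// Works with fetch's res.headers, or anything else exposing get(name).
function msToWait(headers) {
  const remaining = headers.get("X-RateLimit-Remaining");
  const reset = headers.get("X-RateLimit-Reset");
  if (remaining == null || reset == null) return 0; // headers missing: nothing to go on
  return Number(remaining) > 0 ? 0 : Number(reset) * 1000;
}

// Values from the example response above: 117 requests left, so no pause is needed.
const example = new Map([
  ["X-RateLimit-Remaining", "117"],
  ["X-RateLimit-Reset", "42"],
]);
console.log(msToWait(example)); // prints 0
```

After each response, check msToWait(res.headers) and sleep for that long before sending the next request.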

Errors

Errors return a standard JSON body with an error.type and error.message. Common codes: 401 (invalid or missing key), 403 (model not included in your tier), 429 (rate limit exceeded), and 500 (upstream model provider error).

Retry 429s and 500s with exponential backoff. Treat 401/403 as non-retryable.

json
{
  "error": {
    "type": "tier_forbidden",
    "message": "opus-4.5 requires the SuperMaxes tier. Upgrade at using.ai/pricing."
  }
}
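The retry guidance above could be sketched as follows. The delays, the cap, and the retry count are illustrative values rather than documented recommendations, the helper names are hypothetical, and the code assumes Node 18+ for the global fetch.

```javascript
// 429 and 500 are worth retrying; 401 and 403 will not succeed on a retry.
function isRetryable(status) {
  return status === 429 || status === 500;
}

// Exponential backoff: 500 ms, 1 s, 2 s, ... capped at 30 s. (Values are illustrative.)
function backoffMs(attempt, baseMs = 500, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retries a request up to maxRetries times, then returns the last response as-is.
async function fetchWithRetry(url, options, maxRetries = 4) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, options);
    if (res.ok || !isRetryable(res.status) || attempt >= maxRetries) return res;
    await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
  }
}
```

If many clients share a key, consider adding random jitter to each delay so they don't all retry in lockstep.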

Need a key? Start a free trial →
