Rate Limits
Rate limits protect shared capacity and scale with account balance and model type.
Headers
When a request is rate-limited, check:

    Retry-After: 10

Typical behavior

| Layer | Behavior |
|---|---|
| IP | Basic unauthenticated protection |
| API key | Per-key request rate |
| Tenant | Account-level budget and concurrency |
| Local GPU models | Model-specific queue slots |
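
A client can honor the `Retry-After` header from the Headers section by waiting the advertised number of seconds before retrying. The following is a minimal sketch, not the service's official client. It assumes that rate-limited responses use HTTP status 429 (the page does not state the status code) and that `Retry-After` carries a number of seconds, as in the `Retry-After: 10` example. The names `parse_retry_after` and `send_with_retry`, and the `send` callable that returns `(status, headers, body)`, are hypothetical helpers introduced for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], bytes]


def parse_retry_after(
    headers: Mapping[str, str],
    default: float = 1.0,
    max_delay: float = 60.0,
) -> float:
    """Return the Retry-After delay in seconds, capped at max_delay.

    Falls back to `default` when the header is missing or is not a plain
    number (for example, when it is an HTTP date).
    """
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return min(max(0.0, float(value)), max_delay)
            except ValueError:
                return default
    return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it is not rate-limited or attempts run out.

    ASSUMPTION: a rate-limited response is signalled by HTTP 429.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        sleep(parse_retry_after(headers))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays. The `max_delay` cap is a defensive choice so that an unexpectedly large header value cannot stall the caller indefinitely.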
Queue endpoint
    GET /v1/queue/status

    curl https://api.parel.cloud/v1/queue/status \
      -H "Authorization: Bearer pk-dev-YOUR_KEY"

The dashboard polls this route to show GPU readiness and queue depth.
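
The same request can be issued from Python with only the standard library. This is a hedged sketch that mirrors the curl call above. The helper names `build_queue_status_request` and `fetch_queue_status` are hypothetical, and the assumption that the response body is JSON is not confirmed by this page, which does not document the response schema.

```python
import json
import urllib.request

BASE_URL = "https://api.parel.cloud"


def build_queue_status_request(
    api_key: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the GET /v1/queue/status request with a Bearer token."""
    return urllib.request.Request(
        f"{base_url}/v1/queue/status",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_queue_status(api_key: str, timeout: float = 10.0):
    """Send the request and decode the body.

    ASSUMPTION: the endpoint returns JSON; the field names are not
    documented here, so the decoded value is returned unchanged.
    """
    request = build_queue_status_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_queue_status("pk-dev-YOUR_KEY")` with a real key in place of the placeholder would perform one poll. A caller that polls repeatedly should space out requests, since this route is presumably subject to the same per-key limits as other routes.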