Skip to content

Rate Limits

Rate limits protect shared capacity and scale with account balance and model type.

When a request is rate-limited, check:

Retry-After: 10
LayerBehavior
IPBasic unauthenticated protection
API keyPer-key request rate
TenantAccount-level budget and concurrency
Local GPU modelsModel-specific queue slots

GET /v1/queue/status

Terminal window
curl https://api.parel.cloud/v1/queue/status \
-H "Authorization: Bearer pk-dev-YOUR_KEY"

The dashboard polls this route to show GPU readiness and queue depth.