Rate Limits
Rate limits protect the API from abuse and ensure fair usage across all tenants. Every endpoint enforces a limit on how many requests a single user can make within a given time window.
Strategies
The API uses three different rate limiting strategies depending on the endpoint:
Sliding Window
Used by most endpoints. A sliding window tracks requests over a rolling time period.
- If the limit is 10 requests per 60 seconds, each request is timestamped.
- When a new request arrives, the system counts how many requests were made in the last 60 seconds.
- If the count exceeds the limit, the request is rejected with a 429 status.
This provides a smoother experience compared to fixed windows — there is no "reset cliff" where all capacity is restored at once.
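The counting steps above can be sketched as a small sliding-window log. This is an illustration only, not the API's actual implementation: the class name `SlidingWindowLimiter`, the injected `now` argument, and the choice not to record rejected requests are all assumptions.

```python
from collections import deque


class SlidingWindowLimiter:
    """Sliding-window log: keeps one timestamp per accepted request."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps = deque()  # accepted request times, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have slid out of the rolling window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        # Assumption: rejected requests are not counted against the limit.
        if len(self._timestamps) >= self.limit:
            return False  # the API would answer with a 429 here
        self._timestamps.append(now)
        return True


# 10 requests per 60 seconds, with 11 requests sent one second apart.
limiter = SlidingWindowLimiter(limit=10, window_seconds=60)
results = [limiter.allow(now=float(t)) for t in range(11)]
```

In this sketch the first ten calls are accepted and the eleventh is rejected. Capacity comes back one request at a time as the oldest timestamps leave the window, which is why there is no reset cliff.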
Token Bucket
Used by the telemetry endpoint for GPS data ingestion. The token bucket allows short bursts while enforcing a sustained rate.
| Parameter | Global | Per-IMEI |
|---|---|---|
| Bucket capacity | 180 tokens | 90 tokens |
| Refill rate | 30 tokens/sec | 15 tokens/sec |
| Cost | 1 token per 20 GPS points | 1 token per 20 GPS points |
In practice this means:
- Sustained throughput: ~600 GPS points/second globally, ~300 points/second per device
- Burst capacity: up to 3,600 GPS points in a single request (180 tokens × 20 points)
- A batch of 100 GPS points costs 5 tokens. If the bucket is empty, the request is rejected until tokens refill
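The arithmetic above can be checked with a minimal token bucket sketch. It uses the global parameters from the table; a per-IMEI bucket would be a second instance with 90 tokens and 15 tokens/sec. The names `TokenBucket` and `batch_cost`, the full starting bucket, and rounding partial groups of 20 points up to a whole token are assumptions, not documented behavior.

```python
import math


class TokenBucket:
    """Token bucket that refills continuously up to a fixed capacity."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # assumption: the bucket starts full
        self.updated_at = now

    def try_consume(self, cost: float, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if cost > self.tokens:
            return False  # not enough tokens: the request would be rejected
        self.tokens -= cost
        return True


def batch_cost(gps_points: int, points_per_token: int = 20) -> int:
    # Assumption: a partial group of points still costs a whole token.
    return math.ceil(gps_points / points_per_token)


global_bucket = TokenBucket(capacity=180, refill_per_sec=30)
burst_ok = global_bucket.try_consume(batch_cost(3600), now=0.0)        # drains the bucket
rejected = not global_bucket.try_consume(batch_cost(100), now=0.0)     # bucket is empty
after_refill = global_bucket.try_consume(batch_cost(100), now=1.0)     # 30 tokens refilled
```

Here a 3,600-point burst empties the bucket, an immediate 100-point batch is refused, and the same batch succeeds one second later once 30 tokens have refilled.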
Login Brute-Force
Used exclusively by the login endpoint to prevent credential stuffing attacks.
- Limit: 5 attempts per 30-second window
- Block duration: 60 seconds after exceeding the limit
- During the block period, all login attempts from the same source are rejected immediately
After 5 failed login attempts within 30 seconds, your account is blocked for 60 seconds. There is no way to bypass this block — you must wait for it to expire.
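A rough sketch of this window-plus-block behavior follows. It is illustrative only: the `LoginGuard` class, the string `source` key, and the choice to count every attempt (rather than only failed ones) are assumptions, since the exact counting rules are not specified here.

```python
class LoginGuard:
    """Allows a few attempts per window, then blocks the source for a while."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 30.0,
                 block_seconds: float = 60.0) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self._attempts = {}       # source -> recent attempt times
        self._blocked_until = {}  # source -> time at which the block expires

    def allow_attempt(self, source: str, now: float) -> bool:
        # While a block is active, reject immediately without counting.
        if now < self._blocked_until.get(source, 0.0):
            return False
        # Keep only the attempts that fall inside the current window.
        recent = [t for t in self._attempts.get(source, [])
                  if now - t < self.window_seconds]
        if len(recent) >= self.max_attempts:
            # Limit exceeded: start the block and reset the counter.
            self._blocked_until[source] = now + self.block_seconds
            self._attempts[source] = []
            return False
        recent.append(now)
        self._attempts[source] = recent
        return True


guard = LoginGuard()
first_five = [guard.allow_attempt("203.0.113.7", now=float(t)) for t in range(5)]
sixth = guard.allow_attempt("203.0.113.7", now=5.0)        # triggers the 60-second block
during_block = guard.allow_attempt("203.0.113.7", now=64.0)
after_block = guard.allow_attempt("203.0.113.7", now=65.0)
```

In this sketch the sixth attempt inside the window starts the block, attempts during the block are refused, and the first attempt after it expires is accepted again.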
Rate Limit Policies by Endpoint Group
| Group | Strategy | Limit | Window |
|---|---|---|---|
| Login | login-bruteforce | 5 attempts | 30 sec window, 60 sec block |
| Telemetry | token-bucket | 180 tokens capacity (global) | Refill 30 tokens/sec (global) |
| Fleet — Devices | sliding-window | 30 req | 60 sec |
| Fleet — Drivers | sliding-window | 30 req | 60 sec |
| Reports — AVL | sliding-window | 10 req | 60 sec |
| Reports — GT Analytics | sliding-window | 10 req | 60 sec |
| Reports — GT Operations | sliding-window | 10 req | 60 sec |
| Reports — General | sliding-window | 10 req | 60 sec |
| Reports — CPM | sliding-window | 10 req | 60 sec |
| Accounts | sliding-window | 10 req | 60 sec |
| Clients | sliding-window | 10 req | 60 sec |
| Workflow | sliding-window | 30 req | 60 sec |
| Kanban | sliding-window | 30 req | 60 sec |
| Billing | sliding-window | 20 req | 60 sec |
| Portal Proveedores | sliding-window | 20 req | 60 sec |
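One way a client might use the sliding-window rows of this table is to pace its requests evenly. The sketch below derives a spacing of window divided by limit for each group. The dictionary keys are hypothetical slugs chosen for this example, not identifiers used by the API, and the helper `min_interval_seconds` is likewise an assumption.

```python
# (limit, window in seconds) for the sliding-window groups in the table above.
# The keys are hypothetical slugs, not identifiers used by the API.
SLIDING_WINDOW_POLICIES = {
    "fleet-devices": (30, 60),
    "fleet-drivers": (30, 60),
    "reports-avl": (10, 60),
    "reports-gt-analytics": (10, 60),
    "reports-gt-operations": (10, 60),
    "reports-general": (10, 60),
    "reports-cpm": (10, 60),
    "accounts": (10, 60),
    "clients": (10, 60),
    "workflow": (30, 60),
    "kanban": (30, 60),
    "billing": (20, 60),
    "portal-proveedores": (20, 60),
}


def min_interval_seconds(group: str) -> float:
    """Even spacing that matches the group's limit; add a margin in practice."""
    limit, window_seconds = SLIDING_WINDOW_POLICIES[group]
    return window_seconds / limit


fleet_spacing = min_interval_seconds("fleet-devices")   # 60 / 30
reports_spacing = min_interval_seconds("reports-avl")   # 60 / 10
billing_spacing = min_interval_seconds("billing")       # 60 / 20
```

Pacing at exactly this interval sits right on the limit, so a client would probably want to wait slightly longer between requests and still watch the response headers.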
Response Headers
Every API response includes rate limit information in the headers:
| Header | Description | Present on |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window | All responses |
| X-RateLimit-Remaining | Requests remaining in the current window | All responses |
| X-RateLimit-Reset | Unix timestamp (seconds) when the window resets | All responses |
| Retry-After | Seconds to wait before retrying | 429 responses only |
Reading headers proactively
Don't wait for a 429 to react — monitor X-RateLimit-Remaining on every response and throttle your requests before hitting the limit:
JavaScript:

// TENANT and headers are assumed to be defined by the caller.
const response = await fetch(
  `https://${TENANT}/apidev/v1/fleet/devices?limit=25`,
  { headers }
);
// A missing header parses to NaN, which fails the check below and skips the pause.
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
const resetAt = parseInt(response.headers.get('X-RateLimit-Reset'), 10);
if (remaining <= 2) {
  // Clamp at zero in case the reset time has already passed.
  const waitMs = Math.max(0, resetAt - Math.floor(Date.now() / 1000)) * 1000;
  console.warn(`Rate limit almost exhausted. ${remaining} left. Pausing ${waitMs}ms...`);
  await new Promise((resolve) => setTimeout(resolve, waitMs));
}

Python:

import time

import requests

# TENANT and headers are assumed to be defined by the caller.
response = requests.get(
    f"https://{TENANT}/apidev/v1/fleet/devices",
    headers=headers,
    params={"limit": 25},
)
# Fall back to permissive defaults when the headers are missing.
remaining = int(response.headers.get("X-RateLimit-Remaining", 999))
reset_at = int(response.headers.get("X-RateLimit-Reset", 0))
if remaining <= 2:
    # Clamp at zero in case the reset time has already passed.
    wait = max(0, reset_at - int(time.time()))
    print(f"Rate limit almost exhausted. {remaining} left. Pausing {wait}s...")
    time.sleep(wait)
Handling 429 Responses
When you receive a 429 Too Many Requests response, wait the number of seconds given in the Retry-After header before retrying. For a full exponential backoff implementation that handles both 429 and 500 errors, see the Retry Strategy in Error Handling.
Quick inline example:
# Check headers on any response
curl -s -D - \
-H "Authorization: Bearer $TOKEN" \
-H "X-API-Key: $APIKEY" \
-H "tenant: $TENANT" \
"https://$TENANT/apidev/v1/fleet/devices?limit=5" \
| grep -i "x-ratelimit\|retry-after"
# X-RateLimit-Limit: 30
# X-RateLimit-Remaining: 28
# X-RateLimit-Reset: 1743782520
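For the 429 case itself, a minimal retry loop might look like the sketch below. It is not the Retry Strategy from Error Handling and has no exponential backoff. The helper names `retry_delay` and `call_with_retry`, the injected `send`, `sleep`, and `clock` callables, and the fallback to X-RateLimit-Reset when Retry-After is missing are assumptions made for illustration. It also uses a plain dict with exact header names, whereas a real HTTP client may expose case-insensitive headers.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def retry_delay(headers: Mapping[str, str], now: float, default: float = 1.0) -> float:
    """Seconds to wait before retrying, preferring the Retry-After header."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # unparseable value: fall through to the next source
    # Assumption: fall back to the window reset time if Retry-After is absent.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            return max(0.0, float(reset) - now)
        except ValueError:
            pass
    return default


def call_with_retry(send: Callable[[], Response], max_retries: int = 3,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.time) -> Response:
    """Calls send() and retries on 429 after waiting the indicated time."""
    status, headers = send()
    for _ in range(max_retries):
        if status != 429:
            break
        sleep(retry_delay(headers, now=clock()))
        status, headers = send()
    return status, headers


# Simulated exchange: one 429 with Retry-After: 2, then a success.
replies = iter([(429, {"Retry-After": "2"}), (200, {})])
waits = []
final_status, _ = call_with_retry(lambda: next(replies), sleep=waits.append)
```

In the simulated exchange the loop waits once for two seconds and then returns the successful response. Injecting `sleep` and `clock` keeps the sketch testable without real delays.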
Scope and Concurrency
Rate limits are scoped per tenant, per user. Key rules:
| Question | Answer |
|---|---|
| Do different users share a limit? | No. Each user has independent limits. |
| Can I use multiple API keys to bypass limits? | No. Limits are tied to the user, not the key. |
| Can I use multiple user accounts to parallelize? | Technically yes, but this is considered abuse and may result in tenant-level throttling. If you need higher throughput, contact your account manager to discuss a custom rate limit plan. |
| Do read and write operations share a limit? | Yes. GET and POST/PUT/DELETE to the same endpoint group share the same window. |