Rate Limits

Rate limits protect the API from abuse and ensure fair usage across all tenants. Every endpoint enforces a limit on how many requests a single user can make within a given time window.

Strategies

The API uses three different rate limiting strategies depending on the endpoint:

Sliding Window

Used by most endpoints. A sliding window tracks requests over a rolling time period.

If the limit is 10 requests per 60 seconds, each request is timestamped.
When a new request arrives, the system counts how many requests were made in the last 60 seconds.
If that count has already reached the limit, the new request is rejected with a 429 status.

This provides a smoother experience compared to fixed windows — there is no "reset cliff" where all capacity is restored at once.
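The counting steps above can be modeled on the client side with a short sketch. This is an illustration, not the server's implementation: the class name `SlidingWindowLimiter` is hypothetical, and recording only accepted requests is an assumption, since the text does not say whether rejected requests also count toward the window.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Illustrative model of a sliding-window limit (hypothetical helper)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Forget requests that have slid out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False  # the API would answer 429 at this point
        self._timestamps.append(now)  # assumption: only accepted requests count
        return True


# 10 requests per 60 seconds, with 11 requests sent one second apart.
limiter = SlidingWindowLimiter(limit=10, window_seconds=60)
results = [limiter.allow(now=t) for t in range(11)]
```

In this model, capacity comes back one request at a time as the oldest timestamps expire, which is why there is no reset cliff.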

Token Bucket

Used by the telemetry endpoint for GPS data ingestion. The token bucket allows short bursts while enforcing a sustained rate.

Parameter	Global	Per-IMEI
Bucket capacity	180 tokens	90 tokens
Refill rate	30 tokens/sec	15 tokens/sec
Cost	1 token per 20 GPS points	1 token per 20 GPS points

In practice this means:

Sustained throughput: ~600 GPS points/second globally, ~300 points/second per device
Burst capacity: up to 3,600 GPS points in a single request (180 tokens × 20 points)
A batch of 100 GPS points costs 5 tokens (100 ÷ 20). If the bucket is empty, the request is rejected until tokens refill.
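The per-IMEI figures from the table can be checked with a minimal token bucket sketch. The names `TokenBucket` and `batch_cost` are hypothetical, and rounding a partial group of 20 points up to a whole token is an assumption; the text only gives the cost for exact multiples of 20.

```python
import math


class TokenBucket:
    """Illustrative token bucket (hypothetical helper, not the server code)."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # the bucket starts full
        self.updated_at = now

    def try_consume(self, cost, now):
        # Refill for the elapsed time, never above capacity.
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if cost > self.tokens:
            return False  # not enough tokens: the request would be rejected
        self.tokens -= cost
        return True


def batch_cost(gps_points, points_per_token=20):
    # Assumption: a partial group of 20 points is rounded up to one token.
    return math.ceil(gps_points / points_per_token)


# Per-IMEI bucket from the table: 90 tokens capacity, 15 tokens/sec refill.
device = TokenBucket(capacity=90, refill_per_sec=15)
burst_ok = device.try_consume(batch_cost(1800), now=0.0)  # 90 tokens, drains the bucket
next_ok = device.try_consume(batch_cost(100), now=0.0)    # 5 tokens, none left yet
later_ok = device.try_consume(batch_cost(100), now=1.0)   # 15 tokens refilled after 1 s
```

Under these assumptions a device can burst once to a full bucket and then has to fall back to the sustained refill rate.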

Login Brute-Force

Used exclusively by the login endpoint to prevent credential stuffing attacks.

Limit: 5 attempts per 30-second window
Block duration: 60 seconds after exceeding the limit
During the block period, all login attempts from the same source are rejected immediately

Danger: After 5 failed login attempts within 30 seconds, your account is blocked for 60 seconds. There is no way to bypass this block; you must wait for it to expire.
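The window-plus-block behavior can be sketched as follows. This is a rough model under stated assumptions, not the actual server logic: the class name `LoginBruteForceGuard` is hypothetical, every attempt is counted (the text mentions both "attempts" and "failed attempts"), and the 60-second block is assumed to start at the first attempt that goes over the limit.

```python
class LoginBruteForceGuard:
    """Illustrative model of the login limit (hypothetical helper)."""

    def __init__(self, max_attempts=5, window_seconds=30, block_seconds=60):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self._attempts = {}       # source -> times of recent attempts
        self._blocked_until = {}  # source -> time at which the block expires

    def allow(self, source, now):
        # While blocked, reject immediately without counting the attempt.
        if now < self._blocked_until.get(source, float("-inf")):
            return False
        recent = [t for t in self._attempts.get(source, [])
                  if now - t < self.window_seconds]
        if len(recent) >= self.max_attempts:
            # Assumption: the block starts at the attempt that exceeds the limit.
            self._blocked_until[source] = now + self.block_seconds
            self._attempts[source] = []
            return False
        recent.append(now)
        self._attempts[source] = recent
        return True


guard = LoginBruteForceGuard()
first_five = [guard.allow("203.0.113.7", now=t) for t in range(5)]
sixth = guard.allow("203.0.113.7", now=5)        # exceeds the limit, block begins
during_block = guard.allow("203.0.113.7", now=64)
after_block = guard.allow("203.0.113.7", now=65)
```

A client that hits the block gains nothing by retrying early, so the practical response is to wait out the full 60 seconds.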

Rate Limit Policies by Endpoint Group

Group	Strategy	Limit	Window
Login	login-bruteforce	5 attempts	30s window, 60s block
Telemetry	token-bucket	180 capacity	30 refill/sec
Fleet — Devices	sliding-window	30 req	60 sec
Fleet — Drivers	sliding-window	30 req	60 sec
Reports — AVL	sliding-window	10 req	60 sec
Reports — GT Analytics	sliding-window	10 req	60 sec
Reports — GT Operations	sliding-window	10 req	60 sec
Reports — General	sliding-window	10 req	60 sec
Reports — CPM	sliding-window	10 req	60 sec
Accounts	sliding-window	10 req	60 sec
Clients	sliding-window	10 req	60 sec
Workflow	sliding-window	30 req	60 sec
Kanban	sliding-window	30 req	60 sec
Billing	sliding-window	20 req	60 sec
Portal Proveedores	sliding-window	20 req	60 sec

Response Headers

Every API response includes rate limit information in the headers:

Header	Description	Present on
`X-RateLimit-Limit`	Maximum requests allowed in the current window	All responses
`X-RateLimit-Remaining`	Requests remaining in the current window	All responses
`X-RateLimit-Reset`	Unix timestamp (seconds) when the window resets	All responses
`Retry-After`	Seconds to wait before retrying	`429` responses only

Reading headers proactively

Don't wait for a 429 to react — monitor X-RateLimit-Remaining on every response and throttle your requests before hitting the limit:

JavaScript

const response = await fetch(
  `https://${TENANT}/apidev/v1/fleet/devices?limit=25`,
  { headers }
);

// Header values arrive as strings; a missing header parses to NaN.
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
const resetAt = parseInt(response.headers.get('X-RateLimit-Reset'), 10);

if (Number.isFinite(remaining) && Number.isFinite(resetAt) && remaining <= 2) {
  // Clamp to zero in case the reset time has already passed.
  const waitMs = Math.max(0, resetAt * 1000 - Date.now());
  console.warn(`Rate limit almost exhausted. ${remaining} left. Pausing ${waitMs}ms...`);
  await new Promise((resolve) => setTimeout(resolve, waitMs));
}

Python

import time

import requests

response = requests.get(
    f"https://{TENANT}/apidev/v1/fleet/devices",
    headers=headers,
    params={"limit": 25},
)

# Fall back to "plenty left" and "no wait" if a header is missing.
remaining = int(response.headers.get("X-RateLimit-Remaining", 999))
reset_at = int(response.headers.get("X-RateLimit-Reset", 0))

if remaining <= 2:
    # Clamp to zero in case the reset time has already passed.
    wait = max(0, reset_at - int(time.time()))
    print(f"Rate limit almost exhausted. {remaining} left. Pausing {wait}s...")
    time.sleep(wait)

Handling 429 Responses

When you receive a 429 Too Many Requests, use the Retry-After header to wait the exact required time. For a full exponential backoff implementation that handles both 429 and 500 errors, see the Retry Strategy in Error Handling.

Quick inline example for inspecting the rate limit headers with curl:

# Check headers on any response
curl -s -D - \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-API-Key: $APIKEY" \
  -H "tenant: $TENANT" \
  "https://$TENANT/apidev/v1/fleet/devices?limit=5" \
  | grep -i "x-ratelimit\|retry-after"

# X-RateLimit-Limit: 30
# X-RateLimit-Remaining: 28
# X-RateLimit-Reset: 1743782520
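Waiting on Retry-After could look like the sketch below. It is a minimal illustration, not the full backoff strategy from Error Handling: `retry_after_seconds` and `call_with_retry` are hypothetical helpers, `send` stands for any zero-argument callable that performs the request and returns `(status_code, headers, body)`, and the 1-second fallback for a missing or unparsable header is an assumption.

```python
import time


def retry_after_seconds(headers, default=1.0):
    """Return the Retry-After wait in seconds, or `default` if absent or invalid."""
    raw = headers.get("Retry-After")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        return default  # assumption: fall back to a short fixed wait


def call_with_retry(send, max_retries=3, sleep=time.sleep):
    """Call send() and retry on 429, sleeping for the time the server asks for.

    `send` is a hypothetical callable returning (status_code, headers, body).
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        # Return on any non-429 status, or when the retry budget is used up.
        if status != 429 or attempt == max_retries:
            return status, headers, body
        sleep(retry_after_seconds(headers))


# Simulated exchange: one 429 with Retry-After: 2, then a success.
responses = iter([(429, {"Retry-After": "2"}, None), (200, {}, "ok")])
waits = []
result = call_with_retry(lambda: next(responses), sleep=waits.append)
```

Passing `sleep` as a parameter keeps the helper testable without real delays. With a plain `dict` the header lookup is case-sensitive, whereas the header mapping returned by `requests` is not.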

Scope and Concurrency

Rate limits are scoped per tenant, per user. Key rules:

Question	Answer
Do different users share a limit?	No. Each user has independent limits.
Can I use multiple API keys to bypass limits?	No. Limits are tied to the user, not the key.
Can I use multiple user accounts to parallelize?	Technically yes, but this is considered abuse and may result in tenant-level throttling. If you need higher throughput, contact your account manager to discuss a custom rate limit plan.
Do read and write operations share a limit?	Yes. GET and POST/PUT/DELETE to the same endpoint group share the same window.
