Rate Limits
Rekall enforces rate limits per API key based on your plan tier. Limits cover requests per minute, search queries per minute, concurrency, and memories created per month. All limits are tracked per key, not per account.
Limits by Tier
| Tier | Requests/min | Memories/month | Search queries/min | Concurrent |
|---|---|---|---|---|
| Free | 100 | 1,000 | 50 | 5 |
| Pro | 1,000 | 50,000 | 500 | 25 |
| Team | 5,000 | 500,000 | 2,500 | 100 |
| Enterprise | Custom | Unlimited | Custom | Custom |
| Sandbox | 50 | 100 | 25 | 2 |
Sandbox limits
Sandbox mode (rk_test_ keys) has the most restrictive limits. These are designed for testing, not production workloads. See the sandbox mode page for details.
Rate Limit Headers
Every API response includes rate limit information in the response headers. Use these to monitor your usage and avoid hitting limits.
    HTTP/1.1 200 OK
    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 847
    X-RateLimit-Reset: 1705312800
    X-RateLimit-Policy: 1000;w=60
    Content-Type: application/json
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window. |
| X-RateLimit-Remaining | Requests remaining in the current window. |
| X-RateLimit-Reset | Unix timestamp when the current window resets. |
| Retry-After | Seconds to wait before retrying (only on 429 responses). |
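As a rough illustration of reading these headers on the client, the sketch below converts them into a plain object. The function name `parseRateLimitHeaders` and the returned field names are hypothetical, and the input is assumed to be either a Fetch API `Headers` instance or a plain object with lower-case keys.

```javascript
// Reads the rate limit headers described above into a plain object.
// `headers` may be a Fetch API Headers instance or a plain object with
// lower-case keys (that input shape is an assumption of this sketch).
function parseRateLimitHeaders(headers) {
  const get = (name) =>
    typeof headers.get === 'function' ? headers.get(name) : headers[name.toLowerCase()];

  // Missing or malformed values become null instead of NaN.
  const toInt = (value) => {
    const n = Number.parseInt(value, 10);
    return Number.isNaN(n) ? null : n;
  };

  const reset = toInt(get('X-RateLimit-Reset'));
  return {
    limit: toInt(get('X-RateLimit-Limit')),
    remaining: toInt(get('X-RateLimit-Remaining')),
    // X-RateLimit-Reset is a Unix timestamp in seconds; Date expects milliseconds.
    resetAt: reset === null ? null : new Date(reset * 1000),
    retryAfterSeconds: toInt(get('Retry-After')),
  };
}
```

A caller could, for example, pause new requests when `remaining` reaches 0 and resume at `resetAt`.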
Burst Allowances
Each tier allows a short burst above the sustained rate limit. Bursts are measured over a 15-second sliding window, allowing temporary spikes without triggering 429 errors.
| Tier | Sustained Limit | Burst Limit | Burst Window |
|---|---|---|---|
| Free | 100 req/min | 150 req/min | 15 seconds |
| Pro | 1,000 req/min | 1,500 req/min | 15 seconds |
| Team | 5,000 req/min | 7,500 req/min | 15 seconds |
| Enterprise | Custom | Custom | 15 seconds |
| Sandbox | 50 req/min | 75 req/min | 15 seconds |
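One way a client might keep itself under a windowed limit is to track its own send times. The sketch below is a generic sliding-window log, not a description of how Rekall measures bursts on the server. The class name `SlidingWindowLimiter` is hypothetical, and the budget of 375 requests assumes the Pro burst limit of 1,500 req/min is prorated over the 15-second window, which the table does not state explicitly.

```javascript
// Client-side sliding-window log: allows at most `maxRequests` calls in any
// `windowMs` span. Generic technique, not Rekall's server-side algorithm.
class SlidingWindowLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.timestamps = []; // send times (ms) that are still inside the window
  }

  // Returns true and records the request if there is room in the window.
  tryAcquire(nowMs = Date.now()) {
    const cutoff = nowMs - this.windowMs;
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.maxRequests) {
      return false;
    }
    this.timestamps.push(nowMs);
    return true;
  }
}

// Assumed reading of the Pro row: 1,500 req/min prorated over 15 seconds.
const proBurstBudget = Math.floor(1500 * (15 / 60)); // 375 requests
const limiter = new SlidingWindowLimiter(proBurstBudget, 15000);
```

Calling `limiter.tryAcquire()` before each request and delaying when it returns `false` keeps the client inside the assumed budget.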
Handling 429 Responses
When you exceed the rate limit, the API returns a 429 status with a Retry-After header indicating how long to wait.
    HTTP/1.1 429 Too Many Requests
    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 0
    X-RateLimit-Reset: 1705312800
    Retry-After: 23
    Content-Type: application/json

    {
      "error": {
        "code": "RATE_LIMIT_EXCEEDED",
        "message": "Rate limit exceeded. Retry after 23 seconds.",
        "status": 429,
        "details": {
          "limit": 1000,
          "window": "60s",
          "retry_after": 23
        },
        "request_id": "req_abc123def456"
      }
    }
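The 429 response carries the wait time in more than one place. The sketch below shows one possible way to pick a delay from it. The function name `retryDelayMs`, the order of preference, and the assumption that headers arrive as a plain object with lower-case keys are all illustrative choices, not documented behavior.

```javascript
// Works out how long to wait after a 429, in milliseconds.
// Order of preference (an assumption of this sketch): Retry-After header,
// then error.details.retry_after from the body, then X-RateLimit-Reset.
function retryDelayMs(headers, body, nowMs = Date.now()) {
  const headerSeconds = Number.parseInt(headers['retry-after'], 10);
  if (Number.isFinite(headerSeconds)) return headerSeconds * 1000;

  const bodySeconds = body?.error?.details?.retry_after;
  if (Number.isFinite(bodySeconds)) return bodySeconds * 1000;

  // X-RateLimit-Reset is a Unix timestamp in seconds.
  const resetSeconds = Number.parseInt(headers['x-ratelimit-reset'], 10);
  if (Number.isFinite(resetSeconds)) return Math.max(0, resetSeconds * 1000 - nowMs);

  return null; // nothing usable; the caller should fall back to backoff
}
```

Returning `null` rather than a default keeps the fallback policy in the hands of the caller.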
Exponential Backoff
Always implement exponential backoff for retries. Use the Retry-After header when available, and fall back to exponential delays.
    async function withRetry(fn, maxRetries = 5) {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          return await fn();
        } catch (error) {
          // Rethrow anything that is not a rate limit error, and give up
          // after the final attempt.
          if (error.status !== 429 || attempt === maxRetries - 1) {
            throw error;
          }

          // Use the Retry-After header (seconds) if available; otherwise
          // back off exponentially: 1s, 2s, 4s, ... capped at 30s.
          const retryAfter = Number.parseInt(error.headers?.['retry-after'], 10);
          const delay = Number.isFinite(retryAfter)
            ? retryAfter * 1000
            : Math.min(1000 * Math.pow(2, attempt), 30000);

          await new Promise(resolve => setTimeout(resolve, delay));
        }
      }
    }
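To make the fallback schedule concrete, the sketch below computes the capped exponential delay used when no Retry-After header is present, plus an optional "full jitter" variant. Jitter is a common addition that spreads out retries from many clients; it is not something the example above does. The names `backoffDelay` and `jitteredDelay` are hypothetical.

```javascript
// Capped exponential delay for a zero-based attempt number, using the same
// constants as the fallback above: 1s base, doubling, capped at 30s.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// "Full jitter" variant: a random delay between 0 and the capped value.
// `random` is injectable so the function can be tested deterministically.
function jitteredDelay(attempt, random = Math.random) {
  return Math.floor(random() * backoffDelay(attempt));
}

const schedule = [0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n));
console.log(schedule.join(', ')); // 1000, 2000, 4000, 8000, 16000, 30000
```

The sixth value would be 32,000 ms without the cap, so it is clamped to 30,000 ms.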
SDKs handle retries automatically
The official SDKs implement exponential backoff with configurable retries out of the box. Set maxRetries in the client options to control the behavior.
Memory Quotas
Monthly memory quotas limit the total number of memories created per billing cycle. Quotas reset on the first of each month at midnight UTC.
Monitoring usage
Track your memory usage in real time via the dashboard or the GET /v1/usage endpoint.
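A minimal sketch of calling the usage endpoint follows. Only the path `GET /v1/usage` comes from this page. The base URL is a placeholder, sending the key as a Bearer token is an assumption, and the response fields are not specified here, so the function simply returns the parsed JSON. The name `getUsage` and its options are hypothetical.

```javascript
// Fetches usage from GET /v1/usage. Assumptions in this sketch: the base URL
// is a placeholder, the key is sent as a Bearer token, and the response body
// is JSON whose fields are not specified here.
async function getUsage(apiKey, { baseUrl = 'https://api.example.com', fetchImpl = fetch } = {}) {
  const response = await fetchImpl(`${baseUrl}/v1/usage`, {
    method: 'GET',
    headers: { Authorization: `Bearer ${apiKey}`, Accept: 'application/json' },
  });
  if (!response.ok) {
    throw new Error(`Usage request failed with status ${response.status}`);
  }
  return response.json();
}
```

Passing `fetchImpl` makes it possible to swap in a stub for tests or a wrapper that adds retries.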
Quota alerts
Configure usage alerts in the dashboard to receive notifications at 80% and 95% of your monthly quota. Available on Pro and above.
Deleted memories do not count
Memories that are deleted during the billing cycle are subtracted from your usage count. Short-term memories that expire naturally are also not counted.
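The quota arithmetic can be sketched as follows: deleted memories are subtracted from the count, and the result is compared against the 80% and 95% alert thresholds. The function name `quotaStatus`, the alert labels, and the use of "greater than or equal" at each threshold are assumptions for illustration. Expired short-term memories are assumed to be excluded from `created` already.

```javascript
// Net usage: memories created minus memories deleted in the same cycle.
// Expired short-term memories are assumed to be excluded from `created`.
function quotaStatus(created, deleted, monthlyQuota) {
  const used = Math.max(0, created - deleted);
  const fraction = used / monthlyQuota;

  let alert = 'none'; // hypothetical labels, not Rekall's alert names
  if (fraction >= 0.95) {
    alert = 'critical'; // 95% threshold
  } else if (fraction >= 0.8) {
    alert = 'warning'; // 80% threshold
  }

  return { used, remaining: Math.max(0, monthlyQuota - used), fraction, alert };
}

// Pro tier quota: 50,000 memories/month. 42,000 created and 1,500 deleted
// leaves 40,500 used, which is 81% of the quota.
console.log(quotaStatus(42000, 1500, 50000).alert); // prints: warning
```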
