Rate Limits

Rekall enforces rate limits per API key based on your plan tier. Limits cover requests per minute, search queries per minute, concurrency, and memories created per month. All limits are tracked per key, not per account.

Limits by Tier

Tier	Requests/min	Memories/month	Search queries/min	Concurrent
Free	100	1,000	50	5
Pro	1,000	50,000	500	25
Team	5,000	500,000	2,500	100
Enterprise	Custom	Unlimited	Custom	Custom
Sandbox	50	100	25	2

Sandbox limits

Sandbox mode (rk_test_ keys) has the most restrictive limits. These are designed for testing, not production workloads. See the sandbox mode page for details.

Rate Limit Headers

Every API response includes rate limit information in the response headers. Use these to monitor your usage and avoid hitting limits.

Response Headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1705312800
X-RateLimit-Policy: 1000;w=60
Content-Type: application/json

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed in the current window.
`X-RateLimit-Remaining`	Requests remaining in the current window.
`X-RateLimit-Reset`	Unix timestamp when the current window resets.
`X-RateLimit-Policy`	Rate limit policy in effect, written as limit;w=window in seconds (for example, 1000;w=60 means 1,000 requests per 60-second window).
`Retry-After`	Seconds to wait before retrying (only on 429 responses).
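
As an illustration of how these headers might be consumed, the sketch below reads them into a small object. The function name readRateLimit and the returned field names are hypothetical; it assumes a headers object with a get(name) method, such as the Headers instance returned by fetch.

```javascript
// Hypothetical helper: reads the rate limit headers into plain values.
// `headers` is anything with a get(name) method (fetch Headers, or a Map
// keyed by lowercase header names).
function readRateLimit(headers) {
  const toInt = (name) => {
    const value = headers.get(name);
    return value == null ? null : Number.parseInt(value, 10);
  };
  const reset = toInt('x-ratelimit-reset');
  return {
    limit: toInt('x-ratelimit-limit'),
    remaining: toInt('x-ratelimit-remaining'),
    // X-RateLimit-Reset is a Unix timestamp in seconds; Date expects milliseconds.
    resetAt: reset === null ? null : new Date(reset * 1000),
  };
}
```

A caller could, for example, pause new requests when remaining reaches 0 and resume at resetAt.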

Burst Allowances

Each tier allows a short burst above the sustained rate limit. Bursts are measured over a 15-second sliding window, allowing temporary spikes without triggering 429 errors.

Tier	Sustained Limit	Burst Limit	Burst Window
Free	100 req/min	150 req/min	15 seconds
Pro	1,000 req/min	1,500 req/min	15 seconds
Team	5,000 req/min	7,500 req/min	15 seconds
Enterprise	Custom	Custom	15 seconds
Sandbox	50 req/min	75 req/min	15 seconds
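
Rather than relying on burst headroom, a client can pace itself to the sustained limit. The following is a minimal sketch of a client-side sliding-window limiter; the class name and its methods are hypothetical, and it does not model how the server accounts for bursts.

```javascript
// Hypothetical client-side limiter: allows at most `limit` calls per `windowMs`.
// `now` is injectable so the class can be exercised without real waiting.
class SlidingWindowLimiter {
  constructor(limit, windowMs, now = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
    this.timestamps = []; // send times of calls still inside the window
  }

  // Returns 0 if a request may be sent now (and records it),
  // otherwise the number of milliseconds to wait before trying again.
  tryAcquire() {
    const t = this.now();
    // Drop timestamps that have left the window.
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(t);
      return 0;
    }
    return this.timestamps[0] + this.windowMs - t;
  }
}
```

For a Pro key this might be constructed as new SlidingWindowLimiter(1000, 60000), calling tryAcquire() before each request and sleeping for the returned number of milliseconds when it is nonzero.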

Handling 429 Responses

When you exceed the rate limit, the API returns a 429 status with a Retry-After header indicating how long to wait.

429 Response

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1705312800
Retry-After: 23
Content-Type: application/json

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded. Retry after 23 seconds.",
    "status": 429,
    "details": {
      "limit": 1000,
      "window": "60s",
      "retry_after": 23
    },
    "request_id": "req_abc123def456"
  }
}
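
The wait time appears both in the Retry-After header and in error.details.retry_after in the body shown above. A minimal sketch of extracting it follows; the function name rateLimitWaitMs is hypothetical, and preferring the header over the body is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: returns how long to wait in milliseconds after a 429,
// or null if the response is not a 429 or carries no usable wait time.
// `headers` is anything with a get(name) method; `body` is the parsed JSON body.
function rateLimitWaitMs(status, headers, body) {
  if (status !== 429) return null;

  // Prefer the Retry-After header (seconds).
  const header = headers.get('retry-after');
  const fromHeader = header == null ? NaN : Number(header);
  if (Number.isFinite(fromHeader) && fromHeader >= 0) return fromHeader * 1000;

  // Fall back to the retry_after field of the error body.
  const fromBody = body?.error?.details?.retry_after;
  if (typeof fromBody === 'number' && fromBody >= 0) return fromBody * 1000;

  return null;
}
```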

Exponential Backoff

Always retry 429 responses with a delay. Use the Retry-After header when it is present, and fall back to exponentially increasing delays (1s, 2s, 4s, and so on, capped at 30 seconds) when it is not.

Retry with Backoff

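// Retries fn on 429 responses, making up to maxRetries attempts in total.
// Assumes errors thrown by fn expose `status` and `headers`.
async function withRetry(fn, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Rethrow non-rate-limit errors, and give up after the final attempt
      if (error.status !== 429 || attempt === maxRetries - 1) {
        throw error;
      }

      // Prefer the Retry-After value (in seconds) when it is a valid number;
      // headers may be a plain object or a Headers instance
      const headers = error.headers;
      const retryAfter = typeof headers?.get === 'function'
        ? headers.get('retry-after')
        : headers?.['retry-after'];
      const seconds = Number.parseInt(retryAfter, 10);
      const delay = Number.isNaN(seconds)
        ? Math.min(1000 * 2 ** attempt, 30000) // 1s, 2s, 4s, ... capped at 30s
        : seconds * 1000;

      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}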
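
When many clients share a key, fixed exponential delays can cause them all to retry at the same moment. A common refinement, not part of the snippet above, is to add random jitter. The sketch below uses the "full jitter" variant; the function name backoffDelayMs and its defaults are hypothetical, with the 1-second base and 30-second cap borrowed from the retry example.

```javascript
// Hypothetical helper: exponential backoff with full jitter.
// Picks a random delay between 0 and min(baseMs * 2^attempt, capMs).
// `random` is injectable so the result can be made deterministic.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const ceiling = Math.min(baseMs * 2 ** attempt, capMs);
  return Math.floor(random() * ceiling);
}
```

This value could stand in for the fixed fallback delay whenever no Retry-After header is present.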

SDK handles retries automatically

The official SDKs implement exponential backoff with configurable retries out of the box. Set maxRetries in the client options to control the behavior.

Memory Quotas

Monthly memory quotas limit the total number of memories created per billing cycle. Quotas reset on the first of each month at midnight UTC.
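
To make the reset rule concrete, the next reset time can be computed as the first day of the following month at 00:00 UTC. The function name nextQuotaReset is hypothetical; this is a sketch of the stated rule, not an official API.

```javascript
// Hypothetical helper: returns the next quota reset as a Date,
// i.e. the first of the following month at midnight UTC.
function nextQuotaReset(now = new Date()) {
  // Date.UTC normalizes a month index of 12 to January of the next year.
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1));
}
```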

Monitoring usage

Track your memory usage in real time via the dashboard or the GET /v1/usage endpoint.
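
A request to this endpoint might look like the sketch below. Only the GET /v1/usage path comes from this page: the function name getUsage, the Bearer authorization scheme, and the base URL passed in by the caller are assumptions, and no particular response shape is assumed.

```javascript
// Hypothetical helper: fetches current usage for an API key.
// `baseUrl` is supplied by the caller; the Bearer scheme is an assumption.
// `fetchImpl` is injectable so the function can be tested without a network.
async function getUsage(baseUrl, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(`${baseUrl}/v1/usage`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    const error = new Error(`GET /v1/usage failed with status ${response.status}`);
    error.status = response.status;
    throw error;
  }
  // Return the parsed JSON body as-is.
  return response.json();
}
```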

Quota alerts

Configure usage alerts in the dashboard to receive notifications at 80% and 95% of your monthly quota. Available on Pro and above.
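
The same thresholds can be checked client-side, for example when polling usage yourself. The sketch below is illustrative only; the function name quotaAlertLevel and its inputs are hypothetical.

```javascript
// Hypothetical helper: returns 95 or 80 when usage has reached that percentage
// of the monthly quota, or null when below 80% or when the quota is not a
// positive finite number (for example, an unlimited plan).
function quotaAlertLevel(used, quota) {
  if (!Number.isFinite(quota) || quota <= 0) return null;
  // Integer comparison avoids floating-point rounding at the exact thresholds.
  if (used * 100 >= quota * 95) return 95;
  if (used * 100 >= quota * 80) return 80;
  return null;
}
```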

Deleted memories do not count

Memories that are deleted during the billing cycle are subtracted from your usage count. Short-term memories that expire naturally are also not counted.