Rate Limits

API usage quotas, context windows, and system constraints.

To keep the Addis AI platform reliable and stable for all users, we limit how many requests and tokens you can send per minute and per day, and how many requests can run concurrently.

Tier Quotas

Limits are applied based on your organization's billing plan.

| Feature | Free / Sandbox | Pro / Team | Enterprise |
|---|---|---|---|
| RPM (Requests Per Minute) | 60 | 500 | Custom |
| RPD (Requests Per Day) | 1,000 | Unlimited | Unlimited |
| TPM (Tokens Per Minute) | 40,000 | 250,000 | Custom |
| Concurrency | 3 Requests | 50 Requests | Custom |
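
One way to avoid hitting the RPM quota in the first place is to throttle on the client. The sketch below is a minimal sliding-window guard; the class name `SlidingWindowLimiter` and its methods are hypothetical, and the server may count its window differently (for example, fixed one-minute buckets), so treat it as a conservative approximation and not as the platform's algorithm.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard allowing at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self.sent = deque()     # timestamps of recently sent requests

    def wait_needed(self):
        """Return the seconds to wait before the next request is allowed."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # Wait until the oldest request in the window expires.
        return self.window - (now - self.sent[0])

    def record(self):
        """Record that a request was just sent."""
        self.sent.append(self.clock())


# 60 RPM is the Free / Sandbox value from the quota table.
free_tier = SlidingWindowLimiter(limit=60, window=60.0)
```

Before each request, sleep for `free_tier.wait_needed()` seconds, then call `free_tier.record()`. This guard covers RPM only; RPD, TPM, and concurrency would each need their own counter.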

Hitting Limits?

If you consistently hit these limits, please contact Sales to discuss an Enterprise plan with dedicated throughput.


Model Constraints

Apart from rate limits, each model has technical constraints regarding input size and duration.

Text Generation

Context Window: 8,192 Tokens
Max Output: 4,096 Tokens
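
These two limits interact if the context window covers the prompt and the generated output together. That is an assumption here, not something this page states. Under that assumption, the largest output you can request for a given prompt could be computed as in this sketch; `output_budget` is a hypothetical helper, and the prompt token count must come from whatever tokenizer or estimate you already use.

```python
CONTEXT_WINDOW = 8192  # tokens, from the Text Generation limits
MAX_OUTPUT = 4096      # tokens, from the Text Generation limits


def output_budget(prompt_tokens):
    """Return the largest output size that can be requested for this prompt.

    Assumes (unconfirmed) that prompt and output share the context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never exceed the output cap, and never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))
```

For example, a 5,000-token prompt would leave room for 3,192 output tokens under this assumption.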

Audio (TTS & STT)

Max Audio Size: 10 MB
Max Duration: 60 Seconds

Vision

Max Image Size: 10 MB
Formats: JPG, PNG, WEBP

Documents

Max PDF Size: 10 MB
Page Limit: ~20 Pages
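
Checking files against these constraints before uploading avoids a wasted request. The following is a rough pre-flight sketch: `check_upload` and its `kind` labels are hypothetical, it treats 10 MB as 10 × 1024 × 1024 bytes (the service may count 10,000,000), and it assumes `.jpeg` is accepted wherever JPG is. It does not check audio duration or PDF page count, which require parsing the file.

```python
import os

MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" means 10 MiB
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}  # .jpeg assumed same as JPG


def check_upload(filename, size_bytes, kind):
    """Return a list of problems; an empty list means no problem was found.

    kind is one of "audio", "image", or "pdf" (hypothetical labels).
    """
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"{filename}: {size_bytes} bytes exceeds the 10 MB limit")
    ext = os.path.splitext(filename)[1].lower()
    if kind == "image" and ext not in IMAGE_EXTENSIONS:
        problems.append(f"{filename}: unsupported image format {ext or '(none)'}")
    if kind == "pdf" and ext != ".pdf":
        problems.append(f"{filename}: expected a .pdf file")
    return problems
```

In practice `size_bytes` would come from `os.path.getsize(path)` for a file on disk.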

Response Headers

Every API response includes HTTP headers that report your current request quota: the limit, how much of it remains, and when it resets.

| Header | Description |
|---|---|
| x-ratelimit-limit-requests | The maximum number of requests allowed in the current window. |
| x-ratelimit-remaining-requests | The number of requests remaining in the current window. |
| x-ratelimit-reset-requests | The time (in seconds) until the window resets. |
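
Reading these headers on every response lets a client slow down before it receives a 429. The sketch below collects them into a small record; `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names, and the code assumes the values are plain numbers, returning `None` for any header that is missing or not numeric.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateLimitStatus:
    limit: Optional[int]
    remaining: Optional[int]
    reset_seconds: Optional[float]


def _to_number(value, cast):
    """Convert a header value with `cast`, or return None if that fails."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers):
    """Build a RateLimitStatus from a mapping of response headers."""
    return RateLimitStatus(
        limit=_to_number(headers.get("x-ratelimit-limit-requests"), int),
        remaining=_to_number(headers.get("x-ratelimit-remaining-requests"), int),
        reset_seconds=_to_number(headers.get("x-ratelimit-reset-requests"), float),
    )
```

With the `requests` library this would be called as `parse_rate_limit_headers(response.headers)`. A client could then pause for `reset_seconds` whenever `remaining` reaches 0.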

Handling Rate Limits (429)

If you exceed a limit, the API will return a 429 Too Many Requests status. Your application should handle this gracefully using Exponential Backoff.

Do not retry immediately in a tight loop. Wait, then retry with increasing delays.

JavaScript:

async function fetchWithBackoff(url, options, retries = 3, delay = 1000) {
  const response = await fetch(url, options);

  // Return anything that is not a 429, or the final 429 once retries run out
  if (response.status !== 429 || retries <= 0) {
    return response;
  }

  // The reset header is in seconds; ignore it if it is missing or not numeric
  const resetSeconds = Number.parseFloat(response.headers.get('x-ratelimit-reset-requests'));
  const resetMs = Number.isFinite(resetSeconds) ? resetSeconds * 1000 : 0;
  // Wait for whichever is longer: the backoff delay or the server's reset time
  const waitTime = Math.max(delay, resetMs);

  console.warn(`Rate limited. Retrying in ${waitTime}ms...`);
  await new Promise(resolve => setTimeout(resolve, waitTime));

  // Retry with double the delay (exponential backoff)
  return fetchWithBackoff(url, options, retries - 1, delay * 2);
}

Python:

import time
import requests

def request_with_backoff(url, headers, json_data, retries=3, delay=1):
    for attempt in range(retries + 1):
        response = requests.post(url, headers=headers, json=json_data)

        # Return anything that is not a 429, or the final 429 once retries run out
        if response.status_code != 429 or attempt == retries:
            return response

        # The reset header is in seconds; ignore it if it is missing or not numeric
        try:
            reset = float(response.headers.get('x-ratelimit-reset-requests', 0))
        except ValueError:
            reset = 0
        # Wait for whichever is longer: the backoff delay or the server's reset time
        wait_time = max(delay, reset)
        print(f"Rate limit hit. Retrying in {wait_time}s...")

        time.sleep(wait_time)
        delay *= 2  # Exponential backoff
