Plans & Rate Limits

General Compute offers four plans to match your usage needs. All plans include access to every model.

Pay As You Go

No monthly fee. Add a card to enable auto-reload.
| Limit | Value |
| --- | --- |
| Requests per minute | 60 |
| Input tokens per minute | 100,000 |
| Output tokens per minute | 10,000 |
| Tokens per day | 1,000,000 |
| Max concurrent requests | 200 |
| Max requests per day | 1,000 |

Developer — $50/mo

For growing applications with higher throughput needs.
| Limit | Value |
| --- | --- |
| Requests per minute | 300 |
| Input tokens per minute | 500,000 |
| Output tokens per minute | 100,000 |
| Tokens per day | 10,000,000 |
| Max concurrent requests | 500 |
| Max requests per day | 5,000 |

Scale — $1,000/mo

For production workloads with high throughput.
| Limit | Value |
| --- | --- |
| Requests per minute | 1,000 |
| Input tokens per minute | 2,000,000 |
| Output tokens per minute | 500,000 |
| Tokens per day | 50,000,000 |
| Max concurrent requests | 10,000 |
| Max requests per day | 100,000 |

Enterprise — Custom

For organizations that need custom limits, dedicated infrastructure, or SLAs. Contact us to discuss your needs.
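
To compare the plans programmatically, the limits listed above can be written down as data. The following is a minimal sketch, not part of the SDK: the `PlanLimits` type, the `PLANS` array, and the `smallestPlanFor` helper are hypothetical names introduced here for illustration, and the numbers are copied from the plan tables on this page.

```typescript
// Hypothetical representation of the published plan limits (not an SDK type).
interface PlanLimits {
  name: string;
  requestsPerMinute: number;
  inputTokensPerMinute: number;
  outputTokensPerMinute: number;
  tokensPerDay: number;
  maxConcurrentRequests: number;
  maxRequestsPerDay: number;
}

// Ordered from cheapest to most expensive; values taken from the tables above.
const PLANS: PlanLimits[] = [
  {
    name: "Pay As You Go",
    requestsPerMinute: 60,
    inputTokensPerMinute: 100000,
    outputTokensPerMinute: 10000,
    tokensPerDay: 1000000,
    maxConcurrentRequests: 200,
    maxRequestsPerDay: 1000,
  },
  {
    name: "Developer",
    requestsPerMinute: 300,
    inputTokensPerMinute: 500000,
    outputTokensPerMinute: 100000,
    tokensPerDay: 10000000,
    maxConcurrentRequests: 500,
    maxRequestsPerDay: 5000,
  },
  {
    name: "Scale",
    requestsPerMinute: 1000,
    inputTokensPerMinute: 2000000,
    outputTokensPerMinute: 500000,
    tokensPerDay: 50000000,
    maxConcurrentRequests: 10000,
    maxRequestsPerDay: 100000,
  },
];

// Expected peak usage; any limit left out is treated as "no requirement".
type Usage = Partial<Omit<PlanLimits, "name">>;

// Returns the first (cheapest) plan whose limits cover the expected usage,
// or "Enterprise" when none of the fixed plans is large enough.
function smallestPlanFor(usage: Usage): string {
  const keys = Object.keys(usage) as (keyof Usage)[];
  const fit = PLANS.find((plan) =>
    keys.every((key) => (usage[key] ?? 0) <= plan[key]),
  );
  return fit ? fit.name : "Enterprise";
}
```

For example, `smallestPlanFor({ requestsPerMinute: 100 })` selects Developer, because 100 requests per minute exceeds the Pay As You Go limit of 60 but fits within Developer's 300.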

Rate Limit Headers

Every API response includes headers to help you track your usage:
x-ratelimit-limit-requests: 300
x-ratelimit-remaining-requests: 299
x-ratelimit-reset-requests: 2024-01-01T00:00:00Z
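
One way to use these headers is to read them after each response and pause once the request quota is exhausted. The sketch below is an illustration rather than SDK code: `parseRequestQuota` and `msUntilNextRequest` are hypothetical helpers, and it assumes the reset header always carries an ISO 8601 timestamp, as in the example above.

```typescript
// Minimal shape shared by the Fetch API `Headers` class and simple test doubles.
interface HeaderReader {
  get(name: string): string | null | undefined;
}

interface RequestQuota {
  limit: number; // from x-ratelimit-limit-requests
  remaining: number; // from x-ratelimit-remaining-requests
  resetAt: Date; // from x-ratelimit-reset-requests (assumed ISO 8601)
}

// Parses the three request-quota headers.
// Returns null when any of them is missing or malformed.
function parseRequestQuota(headers: HeaderReader): RequestQuota | null {
  const limitRaw = headers.get("x-ratelimit-limit-requests");
  const remainingRaw = headers.get("x-ratelimit-remaining-requests");
  const resetRaw = headers.get("x-ratelimit-reset-requests");
  if (!limitRaw || !remainingRaw || !resetRaw) return null;

  const limit = Number.parseInt(limitRaw, 10);
  const remaining = Number.parseInt(remainingRaw, 10);
  const resetAt = new Date(resetRaw);
  if (
    Number.isNaN(limit) ||
    Number.isNaN(remaining) ||
    Number.isNaN(resetAt.getTime())
  ) {
    return null;
  }
  return { limit, remaining, resetAt };
}

// Milliseconds to wait before sending the next request: zero while quota
// remains, otherwise the time left until the reset timestamp.
function msUntilNextRequest(quota: RequestQuota, now: Date = new Date()): number {
  if (quota.remaining > 0) return 0;
  return Math.max(0, quota.resetAt.getTime() - now.getTime());
}
```

With `fetch`, the response's `headers` object can be passed directly to `parseRequestQuota`, since `Headers.get` matches the `HeaderReader` shape.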

Handling Rate Limits

When you exceed a rate limit, the API returns a 429 Too Many Requests response. We recommend retrying with exponential backoff. The SDK does this automatically, and you can configure the number of retries:
import GeneralCompute from "@generalcompute/sdk";

// The SDK automatically retries on 429 errors with exponential backoff.
// To change the retry behavior, pass maxRetries when creating the client:
const client = new GeneralCompute({
  maxRetries: 3, // default is 2
});
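
If you call the API without the SDK, or want to see what exponential backoff involves, the idea can be sketched by hand. This is an illustrative sketch under stated assumptions, not the SDK's actual retry logic: `withBackoff` and `backoffDelayMs` are hypothetical names, the 500 ms base delay and 30 s cap are arbitrary choices, and the thrown error is assumed to expose the HTTP status as a `status` field.

```typescript
// Assumed error shape: anything that carries the HTTP status code.
interface HttpError {
  status?: number;
}

// Delay before retry number `attempt` (0-based): baseMs, 2x, 4x, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `fn`, retrying only on 429 responses, up to `maxRetries` retries.
// `sleep` is injectable so the wait can be replaced in tests.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as HttpError | undefined)?.status;
      // Rethrow anything that is not a rate-limit error, or once retries run out.
      if (status !== 429 || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

In production code it is common to add random jitter to each delay so that many clients do not retry at the same instant; that is left out here to keep the delays easy to follow.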

Need higher limits? Upgrade your plan or contact us for Enterprise pricing.