Rate Limits - General Compute

Plans & Rate Limits

General Compute offers four plans to match your usage needs. All plans include access to every model.

Pay As You Go

No monthly fee. Add a card to enable auto-reload.

Limit	Value
Requests per minute	100
Tokens per minute	200,000
Requests per day	50,000
Tokens per day	10,000,000

Developer — $50/mo

For growing applications with higher throughput needs.

Limit	Value
Requests per minute	500
Tokens per minute	1,000,000
Requests per day	250,000
Tokens per day	100,000,000

Scale — $1,000/mo

For production workloads with high throughput.

Limit	Value
Requests per minute	2,000
Tokens per minute	5,000,000
Requests per day	1,000,000
Tokens per day	500,000,000

Enterprise — Custom

For organizations that need custom limits, dedicated infrastructure, or SLAs. Contact us to discuss your needs.
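
The per-minute and per-day limits interact: sending at the full per-minute rate all day would exceed the daily cap on every published plan. The sketch below works out the sustained rate each plan supports over 24 hours. The numbers are copied from the tables above; the `PLANS` object and `sustainedPerMinute` helper are illustrative names, not part of the SDK, and the calculation assumes the daily limit applies over a 24-hour window, which this page does not specify.

```typescript
// Limits copied from the plan tables on this page.
interface PlanLimits {
  requestsPerMinute: number;
  tokensPerMinute: number;
  requestsPerDay: number;
  tokensPerDay: number;
}

const PLANS: Record<string, PlanLimits> = {
  payAsYouGo: { requestsPerMinute: 100, tokensPerMinute: 200_000, requestsPerDay: 50_000, tokensPerDay: 10_000_000 },
  developer: { requestsPerMinute: 500, tokensPerMinute: 1_000_000, requestsPerDay: 250_000, tokensPerDay: 100_000_000 },
  scale: { requestsPerMinute: 2_000, tokensPerMinute: 5_000_000, requestsPerDay: 1_000_000, tokensPerDay: 500_000_000 },
};

const MINUTES_PER_DAY = 24 * 60;

// Highest steady per-minute rate that stays under both the per-minute
// limit and the per-day limit (assumption: the day is a 24-hour window).
function sustainedPerMinute(perMinute: number, perDay: number): number {
  return Math.min(perMinute, perDay / MINUTES_PER_DAY);
}

for (const [name, plan] of Object.entries(PLANS)) {
  const requests = sustainedPerMinute(plan.requestsPerMinute, plan.requestsPerDay);
  const tokens = sustainedPerMinute(plan.tokensPerMinute, plan.tokensPerDay);
  console.log(`${name}: ~${Math.floor(requests)} requests/min, ~${Math.floor(tokens)} tokens/min sustained`);
}
```

Under that assumption, the per-minute limits describe burst capacity, while the daily limits bound steady, round-the-clock traffic.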

Rate Limit Headers

Every API response includes headers to help you track your usage:

x-ratelimit-limit-requests: 500
x-ratelimit-remaining-requests: 299
x-ratelimit-reset-requests: 2024-01-01T00:00:00Z
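
As a minimal sketch, the three headers above could be read into a typed object like this. `parseRequestRateLimit` and the `HeaderReader` interface are hypothetical helpers, not part of the SDK, and the code assumes the reset header is always an ISO 8601 timestamp as in the example.

```typescript
// Anything with a get(name) method works here, including the standard
// Headers object on a fetch() Response.
interface HeaderReader {
  get(name: string): string | null;
}

interface RequestRateLimit {
  limit: number;     // x-ratelimit-limit-requests
  remaining: number; // x-ratelimit-remaining-requests
  resetAt: Date;     // x-ratelimit-reset-requests (assumed ISO 8601)
}

// Hypothetical helper: returns null if any header is missing or malformed.
function parseRequestRateLimit(headers: HeaderReader): RequestRateLimit | null {
  const limit = headers.get("x-ratelimit-limit-requests");
  const remaining = headers.get("x-ratelimit-remaining-requests");
  const reset = headers.get("x-ratelimit-reset-requests");
  if (limit === null || remaining === null || reset === null) {
    return null;
  }

  const parsed: RequestRateLimit = {
    limit: Number(limit),
    remaining: Number(remaining),
    resetAt: new Date(reset),
  };

  const malformed =
    Number.isNaN(parsed.limit) ||
    Number.isNaN(parsed.remaining) ||
    Number.isNaN(parsed.resetAt.getTime());
  return malformed ? null : parsed;
}
```

With a `fetch` response, this would be called as `parseRequestRateLimit(response.headers)`. A client could then pause until `resetAt` once `remaining` reaches zero rather than waiting for a 429.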

Handling Rate Limits

When you exceed a rate limit, the API returns a 429 Too Many Requests response. We recommend retrying with exponential backoff, which the SDK handles automatically:

import GeneralCompute from "@generalcompute/sdk";

// The SDK automatically retries on 429 errors with exponential backoff.
// By default it retries up to 2 times; set maxRetries to change that.
const client = new GeneralCompute({
  maxRetries: 3, // default is 2
});
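
For code that calls the API without the SDK, exponential backoff on 429 responses can be sketched roughly as follows. This is not the SDK's internal implementation. `withBackoff`, `backoffCeilingMs`, and `isRateLimitError` are hypothetical names, the delay values are arbitrary, and the sketch assumes a failed call throws an error object carrying a numeric `status` property.

```typescript
interface BackoffOptions {
  maxRetries?: number;  // retries after the first attempt
  baseDelayMs?: number; // delay ceiling for the first retry
  maxDelayMs?: number;  // upper bound on any single delay
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

// Assumption: rate-limit failures surface as errors with status === 429.
function isRateLimitError(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { status?: unknown }).status === 429;
}

// Delay ceiling doubles with each attempt, capped at maxDelayMs.
function backoffCeilingMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

async function withBackoff<T>(fn: () => Promise<T>, opts: BackoffOptions = {}): Promise<T> {
  const {
    maxRetries = 2,
    baseDelayMs = 500,
    maxDelayMs = 30000,
    sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
  } = opts;

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Rethrow anything that is not a 429, or once retries are exhausted.
      if (!isRateLimitError(err) || attempt >= maxRetries) {
        throw err;
      }
      // Full jitter: wait a random time between 0 and the current ceiling
      // so that many clients do not retry in lockstep.
      await sleep(Math.random() * backoffCeilingMs(attempt, baseDelayMs, maxDelayMs));
    }
  }
}
```

A call would be wrapped as `await withBackoff(() => sendRequest())`, where `sendRequest` stands in for whatever function performs the HTTP call.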

Need higher limits? Upgrade your plan or contact us for Enterprise pricing.
