Documentation Index
Fetch the complete documentation index at: https://docs.generalcompute.com/llms.txt
Use this file to discover all available pages before exploring further.
Available Models
General Compute offers a wide range of open-source and open-weight models. All models are served via our OpenAI-compatible API.

Recommended

These are our top picks for the best balance of quality, speed, and cost.

MiniMax M2.7
Our best general-purpose model. Exceptional quality with a massive 196k context window at an unbeatable price. Perfect for production workloads.
Model ID: minimax-m2.7

DeepSeek V3.2
State-of-the-art reasoning model. Best-in-class performance on complex tasks with built-in chain-of-thought reasoning.
Model ID: deepseek-v3.2

All Models
| Model | Model ID | Context | Input / 1M tokens | Output / 1M tokens | Capabilities |
|---|---|---|---|---|---|
| MiniMax M2.7 | minimax-m2.7 | 196k | $0.40 | $2.34 | |
| MiniMax M2.5 | minimax-m2.5 | 160k | $0.20 | $1.17 | |
| DeepSeek V3.2 | deepseek-v3.2 | 32k | $3.00 | $4.50 | Reasoning |
| DeepSeek V3.1 | deepseek-v3.1 | 128k | $3.00 | $4.50 | Reasoning |
| DeepSeek V3.1 CB | deepseek-v3.1-cb | 128k | $0.15 | $0.75 | Reasoning |
| Llama 3.3 70B | llama-3.3-70b | 128k | $0.60 | $1.20 | |
| Llama 4 Maverick 17B | llama-4-maverick-17b | 128k | $0.63 | $1.80 | Vision |
| GPT-OSS 120B | gpt-oss-120b | 128k | $0.21 | $0.79 | |
| Gemma 3 12B | gemma-3-12b-it | 131k | $0.04 | $0.13 | Vision |
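Because every model in the table is served through the same OpenAI-compatible API, requests differ only in the model ID. The sketch below builds a chat-completions request with Python's standard library; the base URL and API key are placeholders, not values confirmed by this page.

```python
import json
import urllib.request

# Placeholder values -- substitute the real General Compute endpoint and key.
BASE_URL = "https://api.generalcompute.com/v1"
API_KEY = "YOUR_API_KEY"


def chat_completion_request(model: str, messages: list[dict]) -> urllib.request.Request:
    """Build a standard OpenAI-compatible /chat/completions request."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = chat_completion_request(
    "minimax-m2.7",
    [{"role": "user", "content": "Summarize the OpenAI chat message format."}],
)
# urllib.request.urlopen(req) would send it; the response body is standard
# OpenAI chat-completion JSON with a `choices` list.
```

Any OpenAI client library works the same way: point its base URL at the endpoint and pass one of the model IDs above.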
Model Capabilities
- Reasoning — Models with built-in chain-of-thought reasoning. These models think step-by-step before producing a final answer, leading to significantly better results on complex tasks like math, code, and analysis.
- Vision — Models that accept image inputs alongside text. Pass images via the standard OpenAI-compatible image_url content type.
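Concretely, a multimodal message for a vision-capable model replaces the plain string content with a list of typed parts, mixing text and image_url entries. This is the standard OpenAI chat shape; the image URL below is a placeholder.

```python
# An OpenAI-compatible multimodal message: "content" becomes a list of typed
# parts instead of a single string.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image in one sentence."},
        {
            "type": "image_url",
            # Placeholder image URL for illustration.
            "image_url": {"url": "https://example.com/photo.jpg"},
        },
    ],
}

# Send it to a vision-capable model from the table, e.g. llama-4-maverick-17b.
payload = {"model": "llama-4-maverick-17b", "messages": [message]}
```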
Choosing a Model
| Use Case | Recommended Model | Why |
|---|---|---|
| General-purpose chat & generation | minimax-m2.7 | Best quality-to-cost ratio, 196k context |
| Complex reasoning & analysis | deepseek-v3.2 | State-of-the-art reasoning capabilities |
| Budget-friendly reasoning | deepseek-v3.1-cb | Strong reasoning at $0.15/M input |
| Fast & cheap | gemma-3-12b-it | $0.04/M input, supports vision too |
| Vision tasks | llama-4-maverick-17b | Multimodal with strong image understanding |
| Large context windows | minimax-m2.7 | 196k context at low cost |
| Cost-optimized bulk processing | gemma-3-12b-it | $0.04/M input, supports vision too |
Custom checkpoints
Bring your own LoRA, GGUF, or fully fine-tuned checkpoints and run them on the same ultra-low-latency infrastructure:

- Share the model artifact (S3, Hugging Face, or direct upload) with the General Compute team.
- We containerize the checkpoint, attach accelerators in us-west-2, and expose it behind a private model ID (for example acme/my-custom-model).
- Your private IDs behave exactly like any other model parameter in the OpenAI-compatible API, including streaming, tool calling, and function execution.
- Enterprise plans layer on SLAs, dedicated pools, and per-org allow lists.
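Since a private ID is just another value for the model parameter, switching from a public model to a custom checkpoint is a one-field change in the request body. The ID below is the hypothetical example from the steps above.

```python
# Same OpenAI-compatible request body as for any public model; only the
# model field changes to the private checkpoint ID.
payload = {
    "model": "acme/my-custom-model",  # hypothetical private model ID
    "messages": [{"role": "user", "content": "Ping my fine-tuned model."}],
    "stream": True,  # streaming works for private IDs as well
}
```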
All prices are in USD. Pricing is based on token usage with no minimum commitment on the Pay As You Go plan.

