Available Models
General Compute offers a wide range of open-source and open-weight models. All models are served via our OpenAI-compatible API.Recommended
These are our top picks for the best balance of quality, speed, and cost.MiniMax M2.7
Our best general-purpose model. Exceptional quality with a 192k context window at an unbeatable price. Perfect for production workloads.Model ID:
minimax-m2.7DeepSeek V3.2
State-of-the-art reasoning model. Best-in-class performance on complex tasks with built-in chain-of-thought reasoning.Model ID:
deepseek-v3.2All Models
| Model | Model ID | Context | Input / 1M tokens | Output / 1M tokens | Capabilities |
|---|---|---|---|---|---|
| MiniMax M2.7 | minimax-m2.7 | 192k | $0.28 | $1.20 | |
| DeepSeek V3.2 | deepseek-v3.2 | 32k | $0.25 | $0.38 | Reasoning |
| DeepSeek V3.1 | deepseek-v3.1 | 128k | $0.21 | $0.79 | Reasoning |
| GPT-OSS 120B | gpt-oss-120b | 128k | $0.21 | $0.79 |
Model Capabilities
- Reasoning — Models with built-in chain-of-thought reasoning. These models think step-by-step before producing a final answer, leading to significantly better results on complex tasks like math, code, and analysis.
Choosing a Model
| Use Case | Recommended Model | Why |
|---|---|---|
| General-purpose chat & generation | minimax-m2.7 | Best quality-to-cost ratio, 192k context |
| Complex reasoning & analysis | deepseek-v3.2 | State-of-the-art reasoning capabilities |
| Longer-context reasoning | deepseek-v3.1 | 128k context window for longer reasoning tasks |
| Large context windows | minimax-m2.7 | 192k context at low cost |
Custom checkpoints
Bring your own LoRA, GGUF, or full-finetuned checkpoints and run them on the same ultra-low latency infrastructure:- Share the model artifact (S3, Hugging Face, or direct upload) with the GeneralCompute team.
- We containerize the checkpoint, attach accelerators in
us-west-2, and expose it behind a private model ID (for exampleacme/my-custom-model). - Your private IDs behave exactly like any other
modelparameter in the OpenAI-compatible API, including streaming, tool calling, and function execution. - Enterprise plans layer on SLAs, dedicated pools, and per-org allow lists.
All prices are in USD. Pricing is based on token usage with no minimum commitment on the Pay As You Go plan.

