Cerebras API Pricing 2026
Cerebras runs inference on its wafer-scale chips and positions its API around high-speed, low-latency responses.
Showing 7 models from Cerebras. Prices are per 1 million tokens. Data sourced from official pricing pages via LiteLLM.
Models: 7
Cheapest Input: $0.10 /1M tokens
Cheapest Output: $0.10 /1M tokens
Max Context: 131K tokens
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Max Output (tokens) | Features |
|---|---|---|---|---|---|
| llama3.1-8b | $0.100 | $0.100 | 128K | 128K | |
| gpt-oss-120b | $0.350 | $0.750 | 131.1K | 32.8K | |
| qwen-3-32b | $0.400 | $0.800 | 128K | 128K | |
| llama3.1-70b | $0.600 | $0.600 | 128K | 128K | |
| llama-3.3-70b | $0.850 | $1.20 | 128K | 128K | |
| zai-glm-4.6 | $2.25 | $2.75 | 128K | 128K | |
| zai-glm-4.7 | $2.25 | $2.75 | 128K | 128K | |
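
Because prices are quoted per 1 million tokens, the cost of a single request is the token count times the listed price, divided by 1,000,000, summed over input and output. The following is a minimal sketch of that arithmetic using the prices from the table above. The names `PRICES_PER_MILLION` and `estimate_cost` are hypothetical helpers written for illustration and are not part of any Cerebras or LiteLLM API. The sketch assumes prices are in USD and that input and output tokens are billed separately at the listed rates.

```python
from decimal import Decimal

# USD per 1 million tokens as (input, output), copied from the table above.
# This dict is an illustrative snapshot, not a live price feed.
PRICES_PER_MILLION = {
    "llama3.1-8b": (Decimal("0.10"), Decimal("0.10")),
    "gpt-oss-120b": (Decimal("0.35"), Decimal("0.75")),
    "qwen-3-32b": (Decimal("0.40"), Decimal("0.80")),
    "llama3.1-70b": (Decimal("0.60"), Decimal("0.60")),
    "llama-3.3-70b": (Decimal("0.85"), Decimal("1.20")),
    "zai-glm-4.6": (Decimal("2.25"), Decimal("2.75")),
    "zai-glm-4.7": (Decimal("2.25"), Decimal("2.75")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Example: 10,000 input tokens and 2,000 output tokens on gpt-oss-120b.
    print(estimate_cost("gpt-oss-120b", 10_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when many requests are summed.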