Groq API Pricing 2026

Groq provides ultra-fast LLM inference on custom LPU hardware and advertises some of the highest tokens-per-second throughput in the industry.

Showing 11 models from Groq. Prices are per 1 million tokens. Data sourced from official pricing pages via LiteLLM.

Models

Cheapest Input: $0.05 / 1M tokens

Cheapest Output: $0.08 / 1M tokens

Max Context: 262K tokens

11 models


Model	Input ($/1M tokens)	Output ($/1M tokens)	Context (tokens)	Max Output (tokens)
gemma-7b-it	$0.050	$0.080	8.2K	8.2K
llama-3.1-8b-instant	$0.050	$0.080	128K	8.2K
openai/gpt-oss-20b	$0.075	$0.300	131.1K	32.8K
openai/gpt-oss-safeguard-20b	$0.075	$0.300	131.1K	65.5K
meta-llama/llama-4-scout-17b-16e-instruct	$0.110	$0.340	131.1K	8.2K
openai/gpt-oss-120b	$0.150	$0.600	131.1K	32.8K
meta-llama/llama-4-maverick-17b-128e-instruct	$0.200	$0.600	131.1K	8.2K
meta-llama/llama-guard-4-12b	$0.200	$0.200	8.2K	8.2K
qwen/qwen3-32b	$0.290	$0.590	131K	131K
llama-3.3-70b-versatile	$0.590	$0.790	128K	32.8K
moonshotai/kimi-k2-instruct-0905	$1.00	$3.00	262.1K	16.4K
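
Because prices are quoted per 1 million tokens, the cost of a single request is the input and output token counts, each multiplied by its rate and divided by one million. The following is a minimal sketch of that arithmetic, using three rows copied from the table above. The `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced for illustration and are not part of any Groq or LiteLLM API.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
# PRICES and estimate_cost are illustrative names, not a Groq or LiteLLM API.
PRICES = {
    "llama-3.1-8b-instant": (0.05, 0.08),
    "openai/gpt-oss-120b": (0.15, 0.60),
    "moonshotai/kimi-k2-instruct-0905": (1.00, 3.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    # Rates are per 1,000,000 tokens, so scale the weighted sum down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example: a 2,000-token prompt with a 500-token completion.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000, 500):.6f}")
```

For a 2,000-token prompt and a 500-token completion, this works out to roughly $0.00014 on llama-3.1-8b-instant and $0.0035 on moonshotai/kimi-k2-instruct-0905. Actual billing may differ from this estimate, for example if cached or batch pricing applies.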
