The chips powering AI and today’s most capable models.

NVIDIA (H100/B200)
The dominant AI training and inference GPU. CUDA ecosystem makes it the default for nearly all research and production.
Paid

AMD (MI300X)
NVIDIA’s closest competitor — ROCm stack, strong for inference, growing adoption.
Paid

Google TPUs
Tensor Processing Units — Google’s custom ASICs powering Gemini and Google Cloud AI.
Paid

AWS Trainium & Inferentia
Amazon’s custom training and inference chips for cost-efficient ML at scale.
Paid

Groq LPU
Language Processing Unit — among the fastest LLM inference chips available (700+ tokens/sec).
Paid

Apple Silicon (M-series)
M1/M2/M3 unified memory architecture — excellent for running 7B–70B models locally.
Paid

Cerebras WSE
Wafer-Scale Engine — single-chip AI training that eliminates inter-chip communication bottlenecks.
Paid

Claude 3.5 / 4 Sonnet
Anthropic’s flagship: 200K context, excellent coding & reasoning, safety-focused.
Paid

GPT-4o
OpenAI’s multimodal flagship — text, images, audio in one model.
Freemium

o1 / o3 / o4-mini
OpenAI’s reasoning models — slow deliberate thinking for math, science, code.
Paid

Gemini 1.5 Pro / 2.0
Google’s long-context powerhouse — 1M+ token context, multimodal, integrated with Google’s ecosystem.
Freemium

DeepSeek R1
Open-weight reasoning model from China — rivals OpenAI’s o1 at a fraction of the cost.
Free, Open Source

Mistral Large / 7B
French AI lab producing efficient open-weight models; Mistral 7B is excellent for local use.
Recommended, Open Source

Gemma 2 / 3
Google’s lightweight open models designed for on-device and edge inference.
Open Source

Phi-4
Microsoft’s small but surprisingly capable open model — punches above its weight.
Open Source

Llama 3.3
Meta’s open-weight family — 8B to 405B parameters across versions. The most widely deployed open models.
Recommended, Open Source
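
Several entries above describe running 7B–70B open-weight models locally (Apple Silicon, Mistral 7B, Llama). A quick back-of-envelope estimate of weight memory at common quantization levels shows why that size range is practical on consumer hardware. This is a rough sketch: the formula ignores KV cache and runtime overhead, and the sizes and bit widths chosen are illustrative assumptions, not vendor figures.

```python
# Rough weight-memory estimate: bytes = parameters * bits_per_weight / 8.
# Ignores KV cache, activations, and framework overhead (assumption).
def weight_gb(params_billions: float, bits: int) -> float:
    """Approximate GB needed just for the model weights."""
    return params_billions * 1e9 * bits / 8 / 1e9

for params in (7, 70):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit ~= {weight_gb(params, bits):.1f} GB")
```

At 4-bit quantization a 70B model needs roughly 35 GB for weights alone, which is why a machine with 64 GB of unified memory can hold it, while a 7B model (about 3.5 GB at 4-bit) runs almost anywhere.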