Compare AI Models
Sort by any column. Click a model name to see full details and pricing.
| Model | Tier | Context | Input / 1M | Output / 1M | MMLU | Multimodal | Link |
|---|---|---|---|---|---|---|---|
OpenAI o1 OpenAI | frontier | 200K | $15.00 | $60.00 | 92.3 | Yes | Try |
Claude Opus 4.7 Anthropic | frontier | 200K | $15.00 | $75.00 | 91.8 | Yes | Try |
Grok 4 xAI | frontier | 256K | $3.00 | $15.00 | 91 | Yes | Try |
DeepSeek R1 DeepSeek | frontier | 128K | $0.550 | $2.19 | 90.8 | No | Try |
Gemini 2.5 Pro Google | frontier | 1.0M | $1.25 | $10.00 | 90 | Yes | Try |
Claude Sonnet 4.6 Anthropic | frontier | 200K | $3.00 | $15.00 | 88.7 | Yes | Try |
GPT-4o OpenAI | frontier | 128K | $2.50 | $10.00 | 88.7 | Yes | Try |
DeepSeek V4 Pro DeepSeek | frontier | 128K | $0.435 | $0.870 | 88.5 | No | Try |
Llama 4 Maverick Meta | frontier | 1.0M | $0.150 | $0.600 | 87 | Yes | Try |
Perplexity Pro Perplexity | frontier | 127K | $3.00 | $15.00 | 87 | Yes | Try |
Mistral Large 2 Mistral | frontier | 128K | $2.00 | $6.00 | 84 | No | Try |
GPT-4o mini OpenAI | budget | 128K | $0.150 | $0.600 | 82 | Yes | Try |
Gemini 2.5 Flash Google | budget | 1.0M | $0.300 | $2.50 | 82 | Yes | Try |
Gemini 2.0 Flash Google | budget | 1.0M | $0.100 | $0.400 | 76 | Yes | Try |
Claude Haiku 4.5 Anthropic | budget | 200K | $0.800 | $4.00 | 73.8 | Yes | Try |
Head-to-Head Comparisons
In-depth breakdowns across 10 real-world categories, with a clear winner for each.
The two most popular AI assistants compared across coding, writing, math, and more.
The 9x cheaper disruptor vs the world's most popular AI — price, privacy, and performance.
Google vs OpenAI: 2M context window and lower price vs better coding and ecosystem.
Anthropic vs Google — instruction following, context window, and cost compared.
OpenAI vs Google head-to-head: vision, video, pricing, and ecosystem integrations.
The developer favourite vs the enterprise standard — which fits your workflow?
The two best reasoning models compared. One is 27x cheaper and fully open-source.
Anthropic's safety-first model vs xAI's real-time X/Twitter-connected AI.
AI search engine vs AI assistant — fundamentally different tools for different jobs.
Open-source vs proprietary: Llama wins on cost and privacy, GPT-4o wins on features.
Microsoft 365 vs Google Workspace — the answer depends entirely on your stack.
Anthropic vs Europe's best AI — Claude leads on benchmarks, Mistral wins on EU compliance and cost.
xAI's X-native model vs Google's Gemini — real-time social data vs a 2M context window.
Both run on GPT-4o — but Copilot adds M365 integration and ChatGPT adds DALL-E and plugins.
Europe's best vs Meta's open-source giant — both open-weight, different priorities.
X's AI vs OpenAI's flagship — Grok has Twitter data, ChatGPT wins on features and benchmarks.
OpenAI's two flagship models compared — fast & affordable vs slow & super-smart for hard reasoning.
Should you upgrade? GPT-4o is faster, 4x cheaper, and scores higher on every benchmark.
Flash is 16x cheaper and faster. Pro has a 2M context window and better quality. Choose by task.
Quality and safety vs 11x lower cost and open-source. Claude wins on capability, DeepSeek on price.
AI search with live citations vs AI assistant for writing and coding. Fundamentally different tools.
Quick Picks
Top benchmark scores with excellent coding and instruction following. Strong choice for most tasks.
Incredibly cheap at $0.075/1M input tokens with a massive 1M context window. Perfect for high-volume tasks.
Self-hostable and competitive with frontier models. Best for privacy-sensitive workloads.