Compare AI Models

Sort by any column. Click a model name to see full details and pricing.

Model TierContext Input / 1M Output / 1MMMLU MultimodalLink
OpenAI o1
OpenAI
frontier200K$15.00$60.0092.3YesTry
Claude Opus 4.7
Anthropic
frontier200K$15.00$75.0091.8YesTry
Grok 4
xAI
frontier256K$3.00$15.0091YesTry
DeepSeek R1
DeepSeek
frontier128K$0.550$2.1990.8NoTry
Gemini 2.5 Pro
Google
frontier1.0M$1.25$10.0090YesTry
Claude Sonnet 4.6
Anthropic
frontier200K$3.00$15.0088.7YesTry
GPT-4o
OpenAI
frontier128K$2.50$10.0088.7YesTry
DeepSeek V4 Pro
DeepSeek
frontier128K$0.435$0.87088.5NoTry
Llama 4 Maverick
Meta
frontier1.0M$0.150$0.60087YesTry
Perplexity Pro
Perplexity
frontier127K$3.00$15.0087YesTry
Mistral Large 2
Mistral
frontier128K$2.00$6.0084NoTry
GPT-4o mini
OpenAI
budget128K$0.150$0.60082YesTry
Gemini 2.5 Flash
Google
budget1.0M$0.300$2.5082YesTry
Gemini 2.0 Flash
Google
budget1.0M$0.100$0.40076YesTry
Claude Haiku 4.5
Anthropic
budget200K$0.800$4.0073.8YesTry

Head-to-Head Comparisons

In-depth breakdowns across 10 real-world categories, with a clear winner for each.

ChatGPT vs Claude
GPT-4o vs Claude Sonnet 4.6
Claude wins

The two most popular AI assistants compared across coding, writing, math, and more.

Read comparison →
DeepSeek vs ChatGPT
DeepSeek V3 vs GPT-4o
Depends wins

The 9x cheaper disruptor vs the world's most popular AI — price, privacy, and performance.

Read comparison →
Gemini vs ChatGPT
Gemini 1.5 Pro vs GPT-4o
ChatGPT wins

Google vs OpenAI: 2M context window and lower price vs better coding and ecosystem.

Read comparison →
Claude vs Gemini
Claude Sonnet 4.6 vs Gemini 1.5 Pro
Claude wins

Anthropic vs Google — instruction following, context window, and cost compared.

Read comparison →
GPT-4o vs Gemini
GPT-4o vs Gemini 1.5 Pro
GPT-4o wins

OpenAI vs Google head-to-head: vision, video, pricing, and ecosystem integrations.

Read comparison →
Claude vs Copilot
Claude Sonnet 4.6 vs Microsoft Copilot
Depends wins

The developer favourite vs the enterprise standard — which fits your workflow?

Read comparison →
DeepSeek R1 vs o1
DeepSeek R1 vs OpenAI o1
R1 (cost) wins

The two best reasoning models compared. One is 27x cheaper and fully open-source.

Read comparison →
Claude vs Grok
Claude Sonnet 4.6 vs Grok 2
Claude wins

Anthropic's safety-first model vs xAI's real-time X/Twitter-connected AI.

Read comparison →
Perplexity vs ChatGPT
Perplexity Pro vs GPT-4o
Depends wins

AI search engine vs AI assistant — fundamentally different tools for different jobs.

Read comparison →
Llama vs ChatGPT
Llama 3.1 405B vs GPT-4o
Depends wins

Open-source vs proprietary: Llama wins on cost and privacy, GPT-4o wins on features.

Read comparison →
Copilot vs Gemini
Microsoft Copilot vs Gemini 1.5 Pro
Depends wins

Microsoft 365 vs Google Workspace — the answer depends entirely on your stack.

Read comparison →
Claude vs Mistral
Claude Sonnet 4.6 vs Mistral Large 2
Claude wins

Anthropic vs Europe's best AI — Claude leads on benchmarks, Mistral wins on EU compliance and cost.

Read comparison →
Grok vs Gemini
Grok 2 vs Gemini 1.5 Pro
Gemini wins

xAI's X-native model vs Google's Gemini — real-time social data vs a 2M context window.

Read comparison →
ChatGPT vs Copilot
GPT-4o (ChatGPT) vs Microsoft Copilot
ChatGPT wins

Both run on GPT-4o — but Copilot adds M365 integration and ChatGPT adds DALL-E and plugins.

Read comparison →
Mistral vs Llama
Mistral Large 2 vs Llama 3.1 405B
Depends wins

Europe's best vs Meta's open-source giant — both open-weight, different priorities.

Read comparison →
Grok vs ChatGPT
Grok 2 vs GPT-4o
ChatGPT wins

X's AI vs OpenAI's flagship — Grok has Twitter data, ChatGPT wins on features and benchmarks.

Read comparison →
GPT-4o vs o1
GPT-4o vs OpenAI o1
GPT-4o wins

OpenAI's two flagship models compared — fast & affordable vs slow & super-smart for hard reasoning.

Read comparison →
GPT-4 vs GPT-4o
Legacy GPT-4 vs GPT-4o
GPT-4o wins

Should you upgrade? GPT-4o is faster, 4x cheaper, and scores higher on every benchmark.

Read comparison →
Gemini Flash vs Pro
Gemini 1.5 Flash vs Gemini 1.5 Pro
Depends wins

Flash is 16x cheaper and faster. Pro has a 2M context window and better quality. Choose by task.

Read comparison →
Claude vs DeepSeek
Claude Sonnet 4.6 vs DeepSeek V3
Claude wins

Quality and safety vs 11x lower cost and open-source. Claude wins on capability, DeepSeek on price.

Read comparison →
Perplexity vs Claude
Perplexity Pro vs Claude Sonnet 4.6
Depends wins

AI search with live citations vs AI assistant for writing and coding. Fundamentally different tools.

Read comparison →

Quick Picks

Best Overall
Claude Sonnet 4.6

Top benchmark scores with excellent coding and instruction following. Strong choice for most tasks.

Best Budget
Gemini 1.5 Flash

Incredibly cheap at $0.075/1M input tokens with a massive 1M context window. Perfect for high-volume tasks.

Best Open Source
Llama 3.1 405B

Self-hostable and competitive with frontier models. Best for privacy-sensitive workloads.