Models Overview
Visual analysis of intelligence, pricing, and context windows across frontier AI models.
QUALITY vs PRICE
Quality score plotted against blended price ($/1M tokens). Higher quality at a lower price means better value.

Model               Provider    Quality   Price ($/1M tokens, blended)
GPT-4o              OpenAI      88.7      $7.50
GPT-4o mini         OpenAI      82.0      $0.26
o1                  OpenAI      92.3      $26.25
o3-mini             OpenAI      89.1      $1.93
Claude 3.5 Sonnet   Anthropic   88.3      $6.00
Claude 3.5 Haiku    Anthropic   81.5      $1.60
Claude 3 Opus       Anthropic   86.8      $30.00
Gemini 1.5 Pro      Google      85.9      $5.25
Gemini 1.5 Flash    Google      78.9      $0.13
Gemini 2.0 Flash    Google      84.1      $0.18
DeepSeek-V3         DeepSeek    88.5      $0.48
DeepSeek-R1         DeepSeek    91.8      $0.96
Llama 3.1 405B      Meta        88.6      $2.70
Llama 3.3 70B       Meta        86.2      $0.70
Llama 3.1 8B        Meta        73.0      $0.10
Mistral Large 2     Mistral     84.0      $3.00
Grok 2              xAI         87.5      $4.00
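The quality-vs-price comparison above can be reduced to a single value heuristic: quality points per dollar. The sketch below ranks the listed models this way; the metric itself is an assumption for illustration, not a standard benchmark score.

```python
# Quality score and blended price ($/1M tokens) taken from the chart data above.
models = {
    "GPT-4o": (88.7, 7.50),
    "GPT-4o mini": (82.0, 0.26),
    "o1": (92.3, 26.25),
    "o3-mini": (89.1, 1.93),
    "Claude 3.5 Sonnet": (88.3, 6.00),
    "Claude 3.5 Haiku": (81.5, 1.60),
    "Claude 3 Opus": (86.8, 30.00),
    "Gemini 1.5 Pro": (85.9, 5.25),
    "Gemini 1.5 Flash": (78.9, 0.13),
    "Gemini 2.0 Flash": (84.1, 0.18),
    "DeepSeek-V3": (88.5, 0.48),
    "DeepSeek-R1": (91.8, 0.96),
    "Llama 3.1 405B": (88.6, 2.70),
    "Llama 3.3 70B": (86.2, 0.70),
    "Llama 3.1 8B": (73.0, 0.10),
    "Mistral Large 2": (84.0, 3.00),
    "Grok 2": (87.5, 4.00),
}

# Rank models by quality points per dollar (higher = better value).
# This is a simple linear heuristic; it ignores that quality gains
# near the top of the scale are usually worth disproportionately more.
ranked = sorted(models.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)

for name, (quality, price) in ranked[:5]:
    print(f"{name}: {quality / price:.0f} quality points per $")
```

Under this metric the small, cheap models dominate, which is why the chart frames value as a two-dimensional trade-off rather than a single ranking.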
Context Window Comparison
Maximum context length, in tokens, for each of the models above.