Live Leaderboard

Compare over 100 AI Models

Ranking the performance of top LLMs from OpenAI, Google, DeepSeek, and others across intelligence, price, and speed.

HIGHLIGHTS

🏆 #1 Quality: o1 (OpenAI), 92.3 score
⚡ Fastest: Llama 3.1 8B (Meta), 200 t/s
💰 Most Affordable: Llama 3.1 8B (Meta), $0.10/1M tokens
🔓 Best Open Source: DeepSeek-R1 (DeepSeek), 91.8 score

QUALITY vs PRICE

Higher quality, lower price = better value

[Scatter chart: Quality Score vs Price ($/1M tokens, blended), one point per model, colored by provider (OpenAI, Anthropic, Google, DeepSeek, Meta, Mistral, xAI). Per-model quality and blended price values appear in the leaderboard table below.]

Filter by Provider, Type, or Openness · 17 models
# | Model | Provider | Type | Quality | Speed | Latency | Price (blended $/1M) | Output $/1M | Context
1 | o1 | OpenAI | Reasoning | 92.3 | 45 t/s | 1.20s | $26.25 | $60.00 | 128k
2 | DeepSeek-R1 | DeepSeek | Reasoning, Open | 91.8 | 65 t/s | 0.90s | $0.96 | $2.19 | 64k
3 | o3-mini | OpenAI | Reasoning | 89.1 | 70 t/s | 0.80s | $1.93 | $4.40 | 200k
4 | GPT-4o | OpenAI | - | 88.7 | 105 t/s | 0.35s | $7.50 | $15.00 | 128k
5 | Llama 3.1 405B | Meta | Open | 88.6 | 55 t/s | 0.60s | $2.70 | $2.70 | 128k
6 | DeepSeek-V3 | DeepSeek | Open | 88.5 | 110 t/s | 0.40s | $0.48 | $1.10 | 64k
7 | Claude 3.5 Sonnet | Anthropic | - | 88.3 | 85 t/s | 0.42s | $6.00 | $15.00 | 200k
8 | Grok 2 | xAI | - | 87.5 | 85 t/s | 0.45s | $4.00 | $10.00 | 128k
9 | Claude 3 Opus | Anthropic | - | 86.8 | 40 t/s | 0.85s | $30.00 | $75.00 | 200k
10 | Llama 3.3 70B | Meta | Open | 86.2 | 80 t/s | 0.45s | $0.70 | $0.70 | 128k
11 | Gemini 1.5 Pro | Google | - | 85.9 | 90 t/s | 0.55s | $5.25 | $10.50 | 2000k
12 | Gemini 2.0 Flash | Google | - | 84.1 | 175 t/s | 0.28s | $0.18 | $0.40 | 1000k
13 | Mistral Large 2 | Mistral | - | 84.0 | 75 t/s | 0.50s | $3.00 | $6.00 | 128k
14 | GPT-4o mini | OpenAI | - | 82.0 | 150 t/s | 0.25s | $0.26 | $0.60 | 128k
15 | Claude 3.5 Haiku | Anthropic | - | 81.5 | 130 t/s | 0.30s | $1.60 | $4.00 | 200k
16 | Gemini 1.5 Flash | Google | - | 78.9 | 160 t/s | 0.35s | $0.13 | $0.30 | 1000k
17 | Llama 3.1 8B | Meta | Open | 73.0 | 200 t/s | 0.20s | $0.10 | $0.10 | 128k
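One way to combine the table's quality and blended-price columns is a simple quality-points-per-dollar ratio. This heuristic is not a metric the leaderboard itself publishes; the sketch below uses four rows from the table above:

```python
models = {
    # model: (quality score, blended price in $/1M tokens), from the table
    "o1": (92.3, 26.25),
    "DeepSeek-R1": (91.8, 0.96),
    "DeepSeek-V3": (88.5, 0.48),
    "Llama 3.1 8B": (73.0, 0.10),
}

# Rank by quality points per blended dollar (higher = better value).
ranked = sorted(models.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
for name, (quality, price) in ranked:
    print(f"{name}: {quality / price:.1f} quality points per $")
```

By this measure the cheap models dominate: Llama 3.1 8B tops the ranking despite its lower quality score, while o1 lands last. Ratios like this reward price far more than quality, which is why the scatter-chart view above is often more informative.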

KEY DEFINITIONS

Context Window

Maximum number of combined input and output tokens a model can handle in one request. Output tokens commonly have a significantly lower limit, which varies by model.
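In practice the limit matters when budgeting a request: input tokens plus the requested maximum output must fit inside the window together. A minimal sketch (the function name and token counts are illustrative, not part of any API):

```python
def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Check whether a request fits a model's combined context window."""
    return input_tokens + max_output_tokens <= context_window

# A 120k-token prompt plus 16k requested output overflows a 128k window.
print(fits_context(120_000, 16_000, 128_000))  # False
```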

Output Speed

Tokens per second received while the model is generating output, i.e. measured after the first chunk has been received from the API (for models that support streaming).

Latency (TTFT)

Time to first token received, in seconds, after the API request is sent. For reasoning models, this is measured to the first reasoning token.
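The two speed metrics above can be related with a small measurement sketch. This is a minimal illustration over a simulated token stream; `measure_stream` and the chunk interface are assumptions, not a real client API:

```python
import time

def measure_stream(chunks):
    """Return (latency_to_first_token, tokens_per_second) for a token stream.

    `chunks` stands in for a streaming API response: each item is the number
    of tokens carried by one received chunk (a hypothetical interface).
    """
    start = time.perf_counter()
    ttft = None
    total_tokens = 0
    for n_tokens in chunks:
        if ttft is None:
            ttft = time.perf_counter() - start  # latency (TTFT)
        total_tokens += n_tokens
    if ttft is None:
        return None, 0.0  # empty stream
    generating = (time.perf_counter() - start) - ttft  # time after first chunk
    speed = total_tokens / generating if generating > 0 else float("inf")
    return ttft, speed

def fake_stream():
    """Simulate five chunks of 8 tokens, each arriving after a short delay."""
    for _ in range(5):
        time.sleep(0.01)
        yield 8

latency, speed = measure_stream(fake_stream())
```

Note that output speed is computed only over the time after the first chunk, matching the definition above, so a slow-to-start model can still report a high tokens-per-second figure.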

Price

Price per token, represented as USD per million tokens. The headline price is a blend of input and output token prices at a 3:1 input-to-output ratio.
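The 3:1 blend can be made concrete with a small helper. The function name is illustrative; the example assumes Llama 3.1 405B's input price equals its $2.70 output price, which reproduces its $2.70 blended figure from the table:

```python
def blended_price(input_per_1m: float, output_per_1m: float) -> float:
    """Blend input and output prices at a 3:1 input-to-output token ratio."""
    return (3 * input_per_1m + output_per_1m) / 4

# Llama 3.1 405B (assumed input price $2.70, listed output price $2.70):
print(f"${blended_price(2.70, 2.70):.2f}")  # $2.70
```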

Output Price

Price per token generated by the model (received from the API), represented as USD per million tokens.

Input Price

Price per token included in the request/message sent to the API, represented as USD per million tokens.

FREQUENTLY ASKED QUESTIONS

Which model currently ranks #1 on quality?

Based on our quality index scoring, the top-ranked model changes as providers release updates. Currently, reasoning-focused models from OpenAI and DeepSeek score the highest on overall intelligence benchmarks.

Which models are the fastest?

Lightweight models like Llama 3.1 8B and Gemini 2.0 Flash typically achieve the highest tokens-per-second rates, often exceeding 150-200 t/s depending on the API provider.

Which models are the most affordable?

Lightweight models, including open-weights models served through competitive providers, offer the best price per token. Models like Gemini 1.5 Flash and DeepSeek-V3 provide excellent quality-to-price ratios.

What is the best open-source model?

DeepSeek-R1 currently leads among open-weights models by quality score, followed closely by Llama 3.1 405B and DeepSeek-V3.

How do I filter and sort the leaderboard?

Use the provider filter tabs above the leaderboard table to narrow results by provider. You can also sort any column by clicking its header.

How do I see more detail about a specific model?

Click on any model name in the leaderboard to visit its dedicated page with detailed benchmark results, pricing breakdowns, and speed measurements across different providers.
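If you export the leaderboard rows, the same filter-and-sort behaviour is easy to reproduce offline; the dict field names below are assumptions that mirror the column headers:

```python
rows = [
    {"model": "o1", "provider": "OpenAI", "quality": 92.3, "speed": 45},
    {"model": "DeepSeek-R1", "provider": "DeepSeek", "quality": 91.8, "speed": 65},
    {"model": "GPT-4o", "provider": "OpenAI", "quality": 88.7, "speed": 105},
]

# Filter to one provider, then sort by the chosen column, descending.
openai_by_speed = sorted(
    (r for r in rows if r["provider"] == "OpenAI"),
    key=lambda r: r["speed"],
    reverse=True,
)
print([r["model"] for r in openai_by_speed])  # ['GPT-4o', 'o1']
```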