Model Rankings

AI models ranked

Curated rankings of frontier and open-source AI models for developers. Scores are calibrated against the current frontier so older generations do not look artificially competitive.

58 models - page 1 of 5

🥇

GPT-5.6 Sol

⚡ Top Pick · Text, Code

OpenAI · 1M ctx

OpenAI's GPT-5.6 flagship tier for frontier reasoning, long-horizon coding agents, cybersecurity analysis, biology workflows, and the hardest knowledge work. Sol introduces deeper reasoning controls, including max effort and an ultra mode that can coordinate subagents for complex work.

Frontier-relative: 98

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $5 in / $30 out per 1M tokens

Best for: Frontier coding, Deep reasoning, Cybersecurity, Subagent orchestration

🥈

GPT-5.6 Terra

💻 Best Coding · Text, Code

OpenAI · 1M ctx

OpenAI's balanced GPT-5.6 tier for everyday agentic coding, product engineering, analysis, tool workflows, and production assistants. Terra is positioned near GPT-5.5 capability while cutting the token price in half, making it the default choice when Sol's deepest reasoning is not required.

Frontier-relative: 97

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $2.5 in / $15 out per 1M tokens

Best for: Everyday agents, Coding value, Tool workflows, Balanced cost

🥉

Claude Opus 4.8

💻 Best Coding · Text, Code, Image

Anthropic · 1M ctx

Anthropic's current Opus-tier model for complex reasoning, agentic coding, and high-autonomy workflows. It keeps the strong Claude coding profile with a 1M-token context window and a lower price than Fable 5.

Frontier-relative: 96

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $5 in / $25 out per 1M tokens

Best for: Agentic coding, High autonomy, 1M context, Claude Code

Claude Fable 5

⚡ Top Pick · Text, Code, Image

Anthropic · 1M ctx

Anthropic's most capable widely released Claude model, built for demanding reasoning and long-horizon agentic work. Use it when autonomy, context depth, and complex multi-step execution matter more than low latency.

Frontier-relative: 95

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $10 in / $50 out per 1M tokens

Best for: Long-horizon agents, Complex reasoning, High autonomy, 1M context

Gemini 3.5 Flash

Text, Code, Image

Google · 1M ctx (1,048,576 tokens)

Google's stable Gemini 3.5 production model for sustained frontier performance with strong coding, agentic loops, grounding, tool use, and multimodal inputs. It balances capability, 1M-token context, and production cost better than older Gemini 2.5 entries.

Frontier-relative: 95

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $1.5 in / $9 out per 1M tokens

Best for: Agent loops, 1M context, Grounding, Multimodal input

Grok 4.3

Text, Code, Image

xAI · 1M ctx

xAI's newest flagship chat model, with configurable reasoning, agentic tool calling, image input, and a 1M-token context window, positioned around low hallucination rates. Use server-side search tools when current events or live data matter.

Frontier-relative: 95

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $1.25 in / $2.5 out per 1M tokens

Best for: Agentic tools, Reasoning control, 1M context, Search workflows

GPT-5.5

⚡ Top Pick · Text, Code

OpenAI · 1M ctx

OpenAI's GPT-5.5 flagship model for agentic coding, professional knowledge work, data analysis, computer use, and long-running tool workflows. It is positioned as a step up from GPT-5.4, with stronger system understanding, better debugging behavior, and API availability for production developers.

Frontier-relative: 94

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $5 in / $30 out per 1M tokens

Best for: Agentic coding, Computer use, Tool workflows, Professional work

GPT-5.6 Luna

⚡ Fastest · Text, Code

OpenAI · 1M ctx

OpenAI's fast, lowest-cost GPT-5.6 tier for high-volume assistance, routing, extraction, quick code edits, support automation, and latency-sensitive workflows. Luna gives teams a current-generation model for scale while preserving stronger reasoning than older budget models.

Frontier-relative: 91

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $1 in / $6 out per 1M tokens

Best for: High volume, Fast assistance, Routing, Low-cost coding

Gemini 3.1 Pro Preview

Text, Code, Image

Google · 1M ctx (1,048,576 tokens)

Google's preview Pro model optimized for software engineering behavior, tool use, agentic workflows, stronger thinking, and grounded multimodal reasoning. Use it for evaluation and advanced workflows where preview volatility is acceptable.

Frontier-relative: 91

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $2 in / $12 out per 1M tokens

Best for: Vibe coding, Tool use, Grounded reasoning, 1M context

#10

DeepSeek V4 Pro

🔓 Open Source · Text, Code

DeepSeek · 1M ctx

DeepSeek's V4 Pro open-weight model for agentic coding, math, STEM reasoning, and 1M-context workflows. It supports OpenAI-compatible and Anthropic-compatible APIs with thinking and non-thinking modes.

Frontier-relative: 90

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $0.435 in / $0.87 out per 1M tokens

Best for: Open weights, 1M context, Thinking mode, Low API cost

#11

Claude Sonnet 4.6

⚡ Top Pick · Text, Code

Anthropic · 1M ctx

Anthropic's Sonnet-tier model, with state-of-the-art SWE-bench results, best-in-class instruction following, and built-in extended thinking. It is the go-to for agentic coding workflows.

Frontier-relative: 90

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $3 in / $15 out per 1M tokens

Best for: Agentic coding, SWE-bench leader, Extended thinking, Instruction accuracy

#12

GPT-5.3 Codex

💻 Best Coding · Code, Text

OpenAI · 400K ctx

OpenAI's Codex-specialized model for long-running software engineering, terminal work, refactors, frontend implementation, and defensive security review. It is currently available across paid Codex surfaces while OpenAI works toward API access.

Frontier-relative: 89

Rated on: Coding, Reasoning, Instruction, Speed, Cost eff.

Cost: $5 in / $30 out per 1M tokens

Best for: Codex agent work, Refactors, Terminal tasks, Security review
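
The per-1M-token prices on the cards above can be turned into a rough per-request cost. The following is a minimal sketch, not part of the ranking itself: the helper name `request_cost` and the 20,000-input / 2,000-output token workload are illustrative assumptions, and the prices are copied from a few of the cards on this page.

```python
# USD per 1M tokens as (input, output), copied from the cards on this page.
PRICES = {
    "GPT-5.6 Sol": (5.0, 30.0),
    "GPT-5.6 Terra": (2.5, 15.0),
    "GPT-5.5": (5.0, 30.0),
    "DeepSeek V4 Pro": (0.435, 0.87),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Ignores caching discounts, reasoning-token surcharges, and any
    tiered pricing, none of which are listed on the cards.
    """
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: 20,000 input tokens and 2,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 20_000, 2_000):.4f}")
```

With these listed rates, any workload costs exactly half as much on GPT-5.6 Terra as on GPT-5.5, which matches the "cutting the token price in half" claim in the Terra description.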