Reasoning model
Also known as: thinking model, reasoner, extended-thinking model
A class of large language model trained or configured to allocate extra compute to intermediate reasoning before producing the final answer — via inference-time scaling, chain-of-thought tokens, or self-consistency sampling. 2026 frontier examples: Claude 3.7 / 4 Opus extended thinking, OpenAI o1 / o3 / o4-mini / GPT-5 Pro, Gemini 2.5 Pro Thinking + Deep Think, DeepSeek R1.
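Of the three mechanisms named above, self-consistency sampling is the simplest to show concretely: sample several independent reasoning chains and majority-vote their final answers. A minimal sketch, with a toy deterministic stand-in for the model (the function names and the toy answer distribution are illustrative, not any vendor's API):

```python
from collections import Counter

def sample_chain(prompt: str, i: int) -> str:
    """Stand-in for one stochastic chain-of-thought sample from a model.
    Toy behaviour: two thirds of the chains reach '42', the rest derail to '17'.
    In practice this would be one temperature>0 completion per call."""
    return "42" if i % 3 != 0 else "17"

def self_consistency(prompt: str, k: int = 9) -> str:
    """Sample k independent reasoning chains, keep only each chain's final
    answer, and return the majority-vote winner."""
    votes = Counter(sample_chain(prompt, i) for i in range(k))
    return votes.most_common(1)[0][0]
```

With the toy sampler above, 6 of 9 chains agree on "42", so the vote recovers the majority answer even though a third of individual chains are wrong — the mechanism by which self-consistency trades extra inference compute for accuracy.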
Reasoning models are not a drop-in upgrade from standard models. The cost profile is different (typically 5-20× more tokens per call), the latency profile is different (seconds to minutes per response rather than sub-second), and the quality profile is different (materially better on genuinely difficult problems, not reliably better on routine ones). The procurement test that survives review: name the specific workload that needs reasoning, run that workload against both the reasoning and the standard tier of the same vendor on the deployment's eval set, and route each step to the cheapest tier that passes. Most production agent loops in 2026 should use a reasoning model for 5-20% of steps, not for every step.
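The per-step routing policy above can be sketched in a few lines. Everything here is illustrative: the prices, the `hard` flag (which in a real deployment would come from the eval set showing where the reasoning tier actually beat the standard tier), and all names are assumptions, not any vendor's API:

```python
from dataclasses import dataclass

# Hypothetical relative per-call costs; real numbers come from your
# vendor's rate card and observed token counts on your workload.
STANDARD_COST = 1.0
REASONING_COST = 10.0  # reasoning tiers often burn ~10x the tokens per call

@dataclass
class Step:
    name: str
    hard: bool  # set per step from eval results, not guessed at runtime

def route(steps: list[Step]) -> tuple[list[str], float, float]:
    """Assign each step to a model tier, and report total cost and the
    fraction of steps sent to the reasoning tier."""
    plan = ["reasoning" if s.hard else "standard" for s in steps]
    cost = sum(REASONING_COST if t == "reasoning" else STANDARD_COST
               for t in plan)
    reasoning_share = plan.count("reasoning") / len(plan)
    return plan, cost, reasoning_share
```

For a 10-step agent loop where the eval flagged 2 steps as needing reasoning, this plan costs 28 units versus 100 for routing everything to the reasoning tier, with a 20% reasoning share — the upper end of the 5-20% range the paragraph above recommends.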
Primary sources
- OpenAI. Reasoning models — o-series introduction
- Anthropic. Claude 3.7 Sonnet extended thinking