☀️ AI Morning Minute: Chain of Thought
The trick that turned AI from a guesser into a reasoner
For most of AI’s history, models gave you an answer the way a student blurts one out without working through the problem. Sometimes right, often wrong, never showing why. Then in 2022, researchers at Google noticed something strange. If you asked the model to “think step by step” before answering, it got dramatically better at hard problems. That small prompt change reshaped how the newest AI systems are built and trained.
What it means
Chain of thought is a technique where an AI generates intermediate reasoning steps before producing a final answer. Instead of jumping straight to a conclusion, the model writes out something like a rough draft of its thinking, working through the problem in plain language. The extra text gives the model room to catch its own mistakes, follow longer logical paths, and break complex problems into smaller pieces.
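In code, the whole trick can be a single extra line of instruction. Here's a minimal sketch using the OpenAI Python client; the model name and the exact wording of the nudge are illustrative assumptions, not a fixed recipe.

```python
# Minimal sketch: direct prompting vs. chain-of-thought prompting.
# Assumes OPENAI_API_KEY is set; the model name is illustrative.
from openai import OpenAI

client = OpenAI()

question = (
    "A store has 23 apples. It sells 9, then receives a delivery of 15. "
    "How many apples does it have now?"
)

# Direct prompting: the model jumps straight to an answer.
direct = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": question}],
)

# Chain-of-thought prompting: one added instruction asks for the
# intermediate reasoning steps before the final answer.
cot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": question + "\nThink step by step, then give the final answer.",
    }],
)

print("Direct:", direct.choices[0].message.content)
print("Chain of thought:", cot.choices[0].message.content)
```

The second response is longer, because the reasoning steps are themselves generated text. That's the whole mechanism, and it's also where the costs in the next section come from.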
The reasoning models from OpenAI, Anthropic, and Google are all built on this principle, just scaled up. They generate long internal chains of thought, sometimes thousands of words, before the user sees a single sentence of the answer.
Why it matters
It unlocked tasks that older AI couldn't touch. Math word problems, multi-step planning, coding challenges, and complex logic puzzles all jumped in accuracy. On GSM8K, a benchmark of grade-school math word problems, chain of thought more than doubled accuracy for the largest models in the original Google study. On graduate-level science questions, it has since pushed top models past PhD-level experts answering the same questions.
It’s why “reasoning models” exist as a category. The newest generation of AI systems isn’t just bigger. It’s trained specifically to use chain of thought as its default mode of operation. That’s a real shift, not a marketing one. The models are doing more thinking per question, and getting more right.
It costs more. All that thinking happens as generated text, which means more tokens, more compute, more time, and a bigger bill per question. Reasoning models can cost 5 to 20 times more per query than their non-reasoning counterparts. For some tasks the extra cost is worth it. For “what’s the capital of France,” it absolutely is not.
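Where does the multiplier come from? Mostly token count. A quick back-of-envelope sketch, where every number is an assumption picked only to make the arithmetic concrete, not a published price:

```python
# Back-of-envelope cost comparison. All prices and token counts are
# assumed figures for illustration, not real published rates.
PRICE_PER_1K_TOKENS = 0.01  # assumed output price, in dollars

standard_tokens = 200            # a direct answer
reasoning_tokens = 200 + 3_000   # same answer plus a long hidden chain of thought

standard_cost = standard_tokens / 1000 * PRICE_PER_1K_TOKENS
reasoning_cost = reasoning_tokens / 1000 * PRICE_PER_1K_TOKENS

print(f"standard:  ${standard_cost:.4f} per query")
print(f"reasoning: ${reasoning_cost:.4f} per query")
print(f"ratio:     {reasoning_cost / standard_cost:.0f}x")
```

With those assumed numbers, the hidden reasoning alone makes the query 16 times more expensive, and that's before any per-token price difference between model tiers.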
Simple example
A student tackles a word problem two ways. First way: stare at it, write down a number, hope it’s right. Second way: read the problem out loud, list what’s given, identify what’s being asked, set up an equation, solve it step by step, double-check the answer against the question.
The second approach takes longer and uses more paper. It also gets the right answer far more often, especially on hard problems where the first approach is basically a coin flip. Chain of thought is the AI doing the second one instead of the first. The “paper” it uses is just more text it generates, but the effect is the same. Slower, more expensive, far more reliable.

