Chain of Thought
Step-by-step reasoning for complex tasks
Chain of Thought (CoT) prompting is a technique that can substantially improve AI performance on complex, multi-step reasoning tasks by asking the model to show its work step by step.
Just like a math teacher asking students to show their work, CoT prompting asks the AI to make its reasoning visible.
The Problem CoT Solves
AI models can struggle with multi-step reasoning when asked to jump directly to an answer.
Direct Answer (Often Wrong)
Question: A store sells apples for $2 each. If you buy 5 or more, you get 20% off. How much do 7 apples cost?
Answer: $14 ❌
Chain of Thought (Correct)
Question: A store sells apples for $2 each. If you buy 5 or more, you get 20% off. How much do 7 apples cost?
Let's work through this step by step:
1. Regular price per apple: $2
2. Number of apples: 7
3. Since 7 ≥ 5, the discount applies
4. Regular total: 7 × $2 = $14
5. Discount: 20% of $14 = $2.80
6. Final price: $14 - $2.80 = $11.20
Answer: $11.20 ✓
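The arithmetic in the worked answer can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper `bulk_price` (not part of the original example) that applies the discount to the whole order once the quantity threshold is met:

```python
def bulk_price(unit_price, quantity, threshold, discount):
    """Total cost when `discount` (a fraction) applies to the whole order
    once `quantity` reaches `threshold`. Hypothetical helper for illustration."""
    total = unit_price * quantity          # regular total
    if quantity >= threshold:              # discount condition
        total -= total * discount          # subtract the discount amount
    return round(total, 2)                 # round to cents

apples = bulk_price(unit_price=2.00, quantity=7, threshold=5, discount=0.20)
print(apples)
```

Each line of the function mirrors one numbered step of the reasoning, which is what makes a step-by-step answer easy to audit.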
Basic CoT Prompts
Simple Trigger Phrases
Add a trigger phrase such as "Let's think step by step." or "Let's work through this step by step:" to your prompts.
Explicit Step Request
Ask the AI to solve a problem step by step.
Solve this problem: A car rental costs $40 per day plus $0.25 per mile. If someone rents a car for 3 days and drives 200 miles, what is the total cost?
Before giving your final answer:
1. Identify what information is given
2. Determine what we need to find
3. Plan your approach
4. Execute each step, showing work
5. Verify your answer
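If you reuse the same step list across many problems, the prompt can be assembled programmatically. This is a small sketch, not a prescribed implementation; the function name `explicit_step_prompt` and the `STEPS` constant are assumptions introduced here for illustration:

```python
# The five steps from the prompt above, kept in one place so they can be reused.
STEPS = [
    "Identify what information is given",
    "Determine what we need to find",
    "Plan your approach",
    "Execute each step, showing work",
    "Verify your answer",
]

def explicit_step_prompt(problem, steps=STEPS):
    """Return a prompt that states the problem, then lists the required steps."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"Solve this problem: {problem}\n\nBefore giving your final answer:\n{numbered}"

prompt = explicit_step_prompt(
    "A car rental costs $40 per day plus $0.25 per mile. If someone rents a car "
    "for 3 days and drives 200 miles, what is the total cost?"
)
print(prompt)
```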
Zero-Shot vs Few-Shot CoT
There are two main approaches to Chain of Thought prompting, each with different trade-offs between simplicity and control.
Zero-Shot CoT
The simplest approach—just add a trigger phrase and let the model figure out how to reason. No examples needed. This works surprisingly well for many problems, though you have less control over the reasoning format.
If John has 3 apples and gives half to Mary, then buys 4 more, how many does he have? Let's think step by step.
Example Output:
Let's think step by step.
- John starts with 3 apples
- Half of 3 is 1.5, but we can't split apples
- If we round down: John gives 1 apple, keeps 2
- After buying 4 more: 2 + 4 = 6 apples
Answer: 6 apples
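In code, zero-shot CoT amounts to appending the trigger phrase and then locating the final answer in the response. The sketch below makes no model call; `zero_shot_cot` and `extract_answer` are hypothetical helper names, and it assumes the response ends with an "Answer:" line as in the example output above:

```python
import re

TRIGGER = "Let's think step by step."

def zero_shot_cot(question, trigger=TRIGGER):
    """Append the trigger phrase to a question."""
    return f"{question.strip()} {trigger}"

def extract_answer(response):
    """Return the text after the last 'Answer:' label, or None if there is none."""
    matches = re.findall(r"^Answer:\s*(.+)$", response, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None

prompt = zero_shot_cot(
    "If John has 3 apples and gives half to Mary, then buys 4 more, how many does he have?"
)
# A stand-in for a model response, copied from the example output above.
sample_response = "- After buying 4 more: 2 + 4 = 6 apples\nAnswer: 6 apples"
print(extract_answer(sample_response))
```

Because zero-shot CoT gives you less control over the format, a response may lack the "Answer:" line entirely, which is why `extract_answer` returns None rather than raising.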
Few-Shot CoT
For more consistent results, show the model exactly how you want it to reason by providing worked examples. This gives you control over the reasoning style, format, and level of detail. The model will mimic your demonstrated pattern.
Example 1:
Q: A baker has 24 cupcakes. She puts them equally into 4 boxes. Then she eats 2 cupcakes from one box. How many cupcakes total remain?
A: Let's work through this:
- Total cupcakes: 24
- Cupcakes per box: 24 ÷ 4 = 6
- After eating 2 from one box: 6 - 2 = 4 cupcakes in that box
- Total remaining: (3 boxes × 6) + 4 = 18 + 4 = 22 cupcakes
Example 2:
Q: A train travels at 60 mph for 2 hours, then 40 mph for 3 hours. What is its average speed?
A: Let's work through this:
- Distance at 60 mph: 60 × 2 = 120 miles
- Distance at 40 mph: 40 × 3 = 120 miles
- Total distance: 120 + 120 = 240 miles
- Total time: 2 + 3 = 5 hours
- Average speed: 240 ÷ 5 = 48 mph
Now solve:
Q: A store sells notebooks for $3 each. If you buy 10 or more, you get 25% off. How much do 12 notebooks cost?
Example Output:
A: Let's work through this:
- Regular price: $3, Quantity: 12
- Since 12 ≥ 10, the 25% discount applies
- Regular total: 12 × $3 = $36
- Discount: 25% of $36 = $9
- Final price: $36 - $9 = $27
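A few-shot CoT prompt has a regular shape: numbered worked examples, each with a question and bulleted reasoning, followed by the new question. The sketch below builds that shape from data; the function name `few_shot_cot` and its parameters are assumptions for illustration, and only one of the two examples above is included to keep it short:

```python
def few_shot_cot(examples, question, opener="Let's work through this:"):
    """Build a few-shot CoT prompt from (question, steps) pairs plus a new question."""
    blocks = []
    for number, (example_question, steps) in enumerate(examples, start=1):
        bullets = "\n".join(f"- {step}" for step in steps)
        blocks.append(f"Example {number}:\nQ: {example_question}\nA: {opener}\n{bullets}")
    blocks.append(f"Now solve:\nQ: {question}")
    return "\n\n".join(blocks)

examples = [
    (
        "A train travels at 60 mph for 2 hours, then 40 mph for 3 hours. "
        "What is its average speed?",
        [
            "Distance at 60 mph: 60 x 2 = 120 miles",
            "Distance at 40 mph: 40 x 3 = 120 miles",
            "Total distance: 120 + 120 = 240 miles",
            "Total time: 2 + 3 = 5 hours",
            "Average speed: 240 / 5 = 48 mph",
        ],
    ),
]
new_question = (
    "A store sells notebooks for $3 each. If you buy 10 or more, "
    "you get 25% off. How much do 12 notebooks cost?"
)
prompt = few_shot_cot(examples, new_question)
print(prompt)
```

Keeping the opener and bullet style identical across examples is deliberate: the model tends to mimic whatever pattern the examples share.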
Structured CoT Formats
For consistent, reproducible reasoning, use structured formats with named steps. These frameworks ensure the model doesn't skip important stages and make outputs easier to parse and verify.
The BREAK Format
A memorable acronym that guides the model through a complete problem-solving cycle, from understanding to verification.
The BREAK Framework
Solve this problem using BREAK:
B - Begin by restating the problem
R - Reason about what approach to use
E - Execute the solution step by step
A - Answer clearly
K - Know by verifying/checking
Problem: A rectangle's length is twice its width. If the perimeter is 36 cm, what is the area?
The Reasoning Template
A more formal structure that separates inputs, goals, and execution. Excellent for technical problems where you need clear documentation of the solution process.
Use this template to solve the problem:
GIVEN: [List known information]
GOAL: [What we need to find]
APPROACH: [Strategy we'll use]
STEPS:
1. [First step with reasoning]
2. [Second step with reasoning]
...
VERIFICATION: [Check the answer]
ANSWER: [Final answer]
Problem: A car uses 8 liters of fuel per 100 km. If the tank holds 45 liters and fuel costs $1.50 per liter, how much does it cost to drive 300 km?
Example Output:
GIVEN: 8L/100km, tank 45L, $1.50/L, drive 300km
GOAL: Calculate fuel cost
APPROACH: Fuel needed × cost per liter
STEPS:
1. Fuel = (300 ÷ 100) × 8 = 24 liters
2. Cost = 24 × $1.50 = $36
VERIFICATION: 24L < 45L tank ✓
ANSWER: $36
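Named sections are what make this format easy to parse. The sketch below splits a response like the one above into a dictionary keyed by label; `parse_template` is a hypothetical helper, and it assumes each label starts its own line, exactly as in the template:

```python
import re

LABELS = ("GIVEN", "GOAL", "APPROACH", "STEPS", "VERIFICATION", "ANSWER")

def parse_template(text, labels=LABELS):
    """Split a response into {label: content}; content may span several lines."""
    pattern = re.compile(r"^(" + "|".join(labels) + r"):\s*(.*)$")
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = pattern.match(line)
        if match:
            # A new labeled section starts on this line.
            current = match.group(1)
            sections[current] = match.group(2)
        elif current is not None and line:
            # Continuation line (for example a numbered step) of the current section.
            sections[current] = (sections[current] + "\n" + line).strip()
    return sections

response = """GIVEN: 8L/100km, tank 45L, $1.50/L, drive 300km
GOAL: Calculate fuel cost
APPROACH: Fuel needed × cost per liter
STEPS:
1. Fuel = (300 ÷ 100) × 8 = 24 liters
2. Cost = 24 × $1.50 = $36
VERIFICATION: 24L < 45L tank ✓
ANSWER: $36"""

parsed = parse_template(response)
missing = [label for label in LABELS if label not in parsed]
print(parsed["ANSWER"], missing)
```

Checking `missing` is one simple way to detect that the model skipped a stage.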
CoT for Different Problem Types
Different problem types benefit from different CoT approaches. Here are patterns optimized for common scenarios.
Mathematical Reasoning
Math problems benefit most from CoT because each step builds on the previous one. Errors compound, so showing work helps catch mistakes early. The model should clearly label each calculation.
A company's revenue was $1.2M in Q1. It grew 15% in Q2, declined 10% in Q3, and grew 25% in Q4. What was the total annual revenue? Think through each quarter, showing your calculations clearly.
Example Output:
Q1: $1,200,000 (given)
Q2: $1,200,000 × 1.15 = $1,380,000
Q3: $1,380,000 × 0.90 = $1,242,000
Q4: $1,242,000 × 1.25 = $1,552,500
Total: $5,374,500
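Because each quarter depends on the previous one, the chain of calculations above is easy to reproduce and check. A minimal sketch using integer arithmetic (whole dollars and whole-percent changes) to avoid floating-point rounding; the variable names are illustrative only:

```python
q1 = 1_200_000
percent_changes = [15, -10, 25]  # Q2, Q3, Q4 change relative to the previous quarter

quarters = [q1]
for pct in percent_changes:
    # Each quarter compounds on the previous quarter, not on Q1.
    quarters.append(quarters[-1] * (100 + pct) // 100)

total = sum(quarters)
print(quarters)
print(total)
```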
Logical Reasoning
Logic puzzles require systematic elimination and hypothesis testing. CoT helps the model track constraints, test possibilities, and verify that all conditions are satisfied. The key is methodical exploration rather than guessing.
Three friends (Alice, Bob, and Carol) each have different pets (cat, dog, fish) and different jobs (teacher, engineer, doctor).
Clues:
1. Alice doesn't have a cat
2. The engineer has a dog
3. Bob is a teacher
4. Carol doesn't have a fish
Who has which pet and job? Reason through this systematically, testing hypotheses and verifying against all clues.
Example Output:
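From clue 3: Bob = teacher
→ Alice and Carol are engineer/doctor
If Carol = engineer:
- Carol has dog (clue 2)
- Alice = doctor, has fish (clue 1)
- Bob has cat
Verify: All 4 clues satisfied ✓
If Alice = engineer:
- Alice has dog (clue 2)
- Carol = doctor, has cat (clue 4)
- Bob has fish
Verify: All 4 clues satisfied ✓
Answer: The clues allow two solutions:
- Alice (doctor, fish), Bob (teacher, cat), Carol (engineer, dog)
- Alice (engineer, dog), Bob (teacher, fish), Carol (doctor, cat)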
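For a puzzle this small, enumerating every assignment is a cheap way to check a reasoning trace and to confirm whether an answer is unique. The sketch below encodes the four clues directly; the helper name `satisfies_clues` is an assumption for illustration. For the clues as stated, the enumeration finds two consistent assignments, one of which is Alice (doctor, fish), Bob (teacher, cat), Carol (engineer, dog), so testing every hypothesis matters:

```python
from itertools import permutations

people = ("Alice", "Bob", "Carol")

def satisfies_clues(pets, jobs):
    """pets and jobs each map a person to one pet / one job."""
    engineer = next(p for p in people if jobs[p] == "engineer")
    return (
        pets["Alice"] != "cat"          # clue 1
        and pets[engineer] == "dog"     # clue 2
        and jobs["Bob"] == "teacher"    # clue 3
        and pets["Carol"] != "fish"     # clue 4
    )

solutions = []
for pet_order in permutations(("cat", "dog", "fish")):
    for job_order in permutations(("teacher", "engineer", "doctor")):
        pets = dict(zip(people, pet_order))
        jobs = dict(zip(people, job_order))
        if satisfies_clues(pets, jobs):
            solutions.append({p: (jobs[p], pets[p]) for p in people})

for solution in solutions:
    print(solution)
```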
Code Debugging
Debugging benefits from CoT because it forces the model to trace execution rather than guess at bugs. Walking through the code with concrete values makes the actual behavior visible and exposes discrepancies with the expected behavior.
This function should return the sum of even numbers in a list, but it's returning wrong values. Debug step by step by tracing through with the input [2, 3, 4, 6]:
def sum_evens(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += 1
    return total
Trace each step, find the bug, and provide the fix.
Example Output:
Trace [2, 3, 4, 6]:
n=2: total += 1 → 1 (wait, adding 1 not n!)
n=3: skip (odd)
n=4: total = 2
n=6: total = 3
BUG: "total += 1" should be "total += n"
Expected: 12, Got: 3
Fix: Change to "total += n"
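Applying the fix from the trace gives the corrected function below, with the traced input used as a quick check:

```python
def sum_evens(numbers):
    """Return the sum of the even numbers in `numbers`."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n  # was `total += 1`, which counted evens instead of summing them
    return total

print(sum_evens([2, 3, 4, 6]))  # 2 + 4 + 6 = 12
```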
Self-Consistency with CoT
For critical problems, don't rely on a single reasoning path. Generate multiple independent solutions and compare results. If different approaches converge on the same answer, confidence is high. Disagreement signals the need for careful review.
Solve this problem 3 different ways, then give your final answer based on which solution appears most correct:
Problem: A farmer has chickens and cows. Together they have 30 heads and 74 legs. How many chickens and how many cows does the farmer have?
Approach 1: Algebra
Approach 2: Guess and check
Approach 3: Logical reasoning
Then compare and give your final answer.
Example Output:
Approach 1 (Algebra): c + w = 30, 2c + 4w = 74 → c=23, w=7
Approach 2 (Guess): Try 23/7 → 46 + 28 = 74 ✓
Approach 3 (Logic): 30 chickens = 60 legs, need 14 more → 7 cows
All agree: 23 chickens, 7 cows
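When the independent solutions come from separate runs, comparing them can be automated with a simple vote. The sketch below is one possible way to do it, not a standard API; `majority_answer` and `satisfies_constraints` are hypothetical names, and the answers are hard-coded stand-ins for parsed model outputs:

```python
from collections import Counter

def majority_answer(answers):
    """Return (answer, agreement), where agreement is the share of runs that gave it."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)

def satisfies_constraints(chickens, cows):
    """Check a candidate against the problem: 30 heads and 74 legs."""
    return chickens + cows == 30 and 2 * chickens + 4 * cows == 74

# (chickens, cows) from the three approaches in the example output above.
answers = [(23, 7), (23, 7), (23, 7)]
best, agreement = majority_answer(answers)
print(best, agreement, satisfies_constraints(*best))
```

An agreement below 1.0 is the signal for careful review; checking the winning answer against the original constraints guards against all paths sharing the same mistake.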
CoT for Decision Making
Decisions involve weighing trade-offs across multiple dimensions. CoT ensures all relevant factors are considered systematically rather than jumping to conclusions. This structured approach also documents the reasoning for future reference.
Should we adopt microservices architecture for our application?
Context:
- Monolithic application with 50,000 lines of code
- Team of 5 developers
- 100 daily active users
- Planning for 10x growth in 2 years
Think through this systematically:
1. List the current state
2. Identify factors to consider (team size, scale, velocity, future growth)
3. Weigh each factor as for/against
4. Give a recommendation with reasoning
Example Output:
FACTORS:
- Team size (5): Too small for microservices ❌
- Scale (100 DAU): No scaling need ❌
- Velocity: Monolith = faster iteration ❌
- Future growth: Uncertain timeline ⚠️
WEIGHING: 3 strong against, 1 weak for
RECOMMENDATION: Stay monolith, use clear module boundaries to ease future transition.
When to Use CoT
Use CoT For
Math problems — Reduces calculation errors
Logic puzzles — Prevents skipped steps
Complex analysis — Organizes thinking
Code debugging — Traces execution
Decision making — Weighs trade-offs
Skip CoT For
Simple Q&A — Unnecessary overhead
Creative writing — Can constrain creativity
Factual lookups — No reasoning needed
Translation — Direct task
Summarization — Usually straightforward
CoT Limitations
While powerful, Chain of Thought isn't a silver bullet. Understanding its limitations helps you apply it appropriately.
- Increased token usage — More output means higher costs
- Not always needed — Simple tasks don't benefit
- Can be verbose — May need to ask for conciseness
- Reasoning can be flawed — CoT doesn't guarantee correctness
Summary
CoT improves complex reasoning by making implicit steps explicit. Use it for math, logic, analysis, debugging, and decision making. The trade-off: better accuracy in exchange for more tokens.
In the next chapter, we'll explore few-shot learning—teaching the model through examples.