# Chain-of-Thought (CoT) Prompting

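Chain-of-thought prompting asks the model to write out intermediate reasoning before its final answer. It is the technique that unlocked LLM reasoning on math, logic, and multi-step problems.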

## The Problem

Ask directly:

```text
Q: A shop had 23 apples. It sold 7, then received a delivery
   of 12. How many apples now? Answer with a number only.
A: 18
```

Wrong: the correct total is 23 − 7 + 12 = 28. In this illustrative failure, the model "blurted" a plausible-looking number because answering immediately gives it no room to compute.

## The Fix: Ask It to Think Step by Step

```text
Q: A shop had 23 apples. It sold 7, then received a delivery
   of 12. How many apples now?
Let's think step by step.
```

The model now writes out its work:

Start: 23. After selling 7: 23 − 7 = 16. After delivery of 12: 16 + 12 = 28. Answer: 28.

By generating intermediate tokens, the model uses its own output as scratch space: each step it writes becomes context for the next one. The reasoning happens in the output, so the model must be allowed to produce it before the final answer.
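The prompt above can also be assembled in code. The following is a minimal sketch, assuming a hypothetical helper named `with_cot_trigger` (it is not part of any library); it only builds the prompt string and leaves the actual model call to whatever client you use.

```python
def with_cot_trigger(question: str, trigger: str = "Let's think step by step.") -> str:
    """Return the question followed by a reasoning trigger on its own line.

    `with_cot_trigger` is a hypothetical helper for illustration only.
    """
    return f"{question.rstrip()}\n{trigger}"


question = (
    "Q: A shop had 23 apples. It sold 7, then received a delivery\n"
    "   of 12. How many apples now?"
)
prompt = with_cot_trigger(question)
print(prompt)
```

Keeping the trigger as a parameter makes it easy to try other phrasings without touching the question text.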

## Zero-Shot CoT

Append a trigger phrase to the question; no worked examples are needed:

"Let's think step by step."
"Work through this carefully before answering."
"First, reason about the problem. Then give the final answer."

## Few-Shot CoT

Even stronger: show worked examples with reasoning, then the new question.

```text
Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many?
A: 5 + 2×3 = 5 + 6 = 11. The answer is 11.

Q: A cafe had 20 muffins, sold 13, baked 9 more. How many?
A:
```
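A few-shot prompt like the one above is usually built from a list of worked examples. Here is a rough sketch of one way to do that, assuming a hypothetical function `build_few_shot_prompt` and the same `Q:` / `A:` layout; the names are illustrative, not from any library.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Format worked (question, reasoning-plus-answer) pairs, then the new question.

    Hypothetical helper: the final "A:" is left open for the model to complete.
    """
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


examples = [
    (
        "Roger has 5 balls. He buys 2 cans of 3 balls each. How many?",
        "5 + 2×3 = 5 + 6 = 11. The answer is 11.",
    ),
]
prompt = build_few_shot_prompt(
    examples, "A cafe had 20 muffins, sold 13, baked 9 more. How many?"
)
print(prompt)
```

Because each example answer includes its reasoning, the model tends to imitate that format for the open question.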

## Separate Reasoning From the Final Answer

Ask for tagged output so your code can parse the answer cleanly:

```text
Think step by step inside <reasoning></reasoning> tags,
then give ONLY the final numeric answer inside <answer></answer> tags.
```

Then extract the <answer> content and discard the reasoning.
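The extraction step might look like the sketch below. It assumes the model followed the tag format exactly; `extract_answer` and the `sample` response are hypothetical, made up here for illustration. Taking the last `<answer>` block is a design choice, in case the tag is mentioned earlier in the reasoning.

```python
import re
from typing import Optional

# Matches the content of an <answer>...</answer> block, trimming surrounding whitespace.
ANSWER_RE = re.compile(r"<answer>\s*(.*?)\s*</answer>", re.DOTALL)


def extract_answer(response: str) -> Optional[str]:
    """Return the text inside the last <answer> block, or None if there is none."""
    matches = ANSWER_RE.findall(response)
    return matches[-1] if matches else None


# Hypothetical model response, written by hand for this example.
sample = (
    "<reasoning>23 - 7 = 16. 16 + 12 = 28.</reasoning>\n"
    "<answer>28</answer>"
)
print(extract_answer(sample))  # prints 28
```

Returning `None` when the tags are missing lets the caller decide whether to retry or fall back, rather than silently parsing the wrong text.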

## When To Use / Avoid

| Use CoT | Skip CoT |
| --- | --- |
| Math, logic, planning, multi-hop questions | Simple lookups / classification |
| "Why" and "how" analytical tasks | Latency-critical, trivial tasks |

CoT costs extra tokens and latency. It's a tool for hard problems, not every prompt.
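If prompts are generated programmatically, the table can be turned into a simple switch. This is only a sketch under stated assumptions: the task labels, the `should_use_cot` function, and the rule that a latency-critical request always skips CoT are all illustrative choices, not a fixed policy.

```python
# Hypothetical task labels, loosely mirroring the "Use CoT" column above.
COT_TASKS = {"math", "logic", "planning", "multi-hop", "analysis"}


def should_use_cot(task_type: str, latency_critical: bool = False) -> bool:
    """Decide whether to add a reasoning trigger to the prompt.

    Assumption: latency-critical requests skip CoT regardless of task type.
    Unknown task types (e.g. "lookup", "classification") default to no CoT.
    """
    if latency_critical:
        return False
    return task_type in COT_TASKS


print(should_use_cot("math"))
print(should_use_cot("classification"))
```

In practice the cutoff is a judgment call, so a switch like this is best treated as a starting point to tune against real accuracy and latency measurements.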

Key insight: a model can only reason in the tokens it generates, so the thinking has to be part of the output. Never force a model to answer a hard problem with zero tokens of reasoning.
