Some LLMs perform better on difficult tasks such as math, coding, and logic by spending more internal computation before they answer.

Summary

Karpathy’s practical advice is to use fast models by default, then switch to a thinking/reasoning model when the problem is hard or the first answer needs deeper work.
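The fast-by-default workflow can be sketched as a small escalation helper. This is a minimal illustration, not a real client: `fast_model`, `thinking_model`, and `needs_deeper_work` are hypothetical callables that the caller supplies, and the check for "needs deeper work" is assumed to be whatever judgment (human or automated) fits the task.

```python
from typing import Callable

# A model is treated here as any function from a prompt string to an answer string.
Model = Callable[[str], str]


def answer_with_escalation(
    prompt: str,
    fast_model: Model,
    thinking_model: Model,
    needs_deeper_work: Callable[[str, str], bool],
) -> str:
    """Try the fast model first; escalate only if the draft falls short.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    draft = fast_model(prompt)
    if needs_deeper_work(prompt, draft):
        # The first answer was not good enough, so pay the extra latency.
        return thinking_model(prompt)
    return draft
```

The design choice is that the thinking model is only called when the first answer is judged insufficient, so easy prompts never incur the extra latency.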

When To Use

Use thinking models for:

  • Debugging difficult code.
  • Math and logic.
  • Multi-step reasoning.
  • Ambiguous technical diagnosis.
  • High-value decisions where latency is acceptable.

Avoid them for:

  • Simple recall.
  • Basic ideation.
  • Low-stakes chat.
  • Tasks where extra latency adds no value.
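
The two lists above can be read as a simple routing rule. The sketch below is one possible encoding, under stated assumptions: the task-kind names, the `Task` fields, and the `"fast"`/`"thinking"` labels are all illustrative and do not refer to any real model or API.

```python
from dataclasses import dataclass

# Illustrative labels only; these are not real model identifiers.
FAST = "fast"
THINKING = "thinking"

# Hypothetical task kinds mirroring the "Use thinking models for" list.
THINKING_KINDS = {
    "debugging",
    "math",
    "logic",
    "multi_step_reasoning",
    "technical_diagnosis",
}


@dataclass
class Task:
    kind: str                 # e.g. "debugging", "recall", "chat"
    high_value: bool = False  # is the decision worth extra care?
    latency_ok: bool = True   # can the caller wait for a slower answer?


def choose_model(task: Task) -> str:
    """Return THINKING for hard task kinds or for high-value decisions
    where latency is acceptable; otherwise default to FAST."""
    if task.kind in THINKING_KINDS:
        return THINKING
    if task.high_value and task.latency_ok:
        return THINKING
    return FAST
```

Anything not explicitly listed, such as simple recall, basic ideation, or low-stakes chat, falls through to the fast default, which matches the advice to reach for a thinking model only when the task calls for it.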

Sources

Open Questions

  • Which tasks in this wiki deserve a thinking model?
  • Should lint passes use a thinking model by default?