Summary
Chain of Draft (CoD) is a prompting technique that makes reasoning more efficient by cutting verbosity. Unlike traditional Chain-of-Thought (CoT) prompting, which encourages long step-by-step explanations, CoD instructs the model to produce extremely concise, information-dense intermediate steps, typically limited to about five words each. This mimics how people jot down quick notes when solving a problem, capturing only the critical elements. CoD can cut token usage sharply (to as little as 7.6% of the tokens CoT uses) while maintaining comparable accuracy, which translates into faster responses and lower compute costs.
Implementation
Include a clear instruction in your prompt, such as "think step by step, but only keep a minimum draft for each step with five words at most", and tell the model to place the final answer after a clear separator so it can be parsed reliably. Few-shot examples that demonstrate the concise reasoning format help the model adapt to this minimal style.
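The following is a minimal sketch of how such a prompt could be assembled and how the response could be split at the separator. It makes several assumptions: the "####" separator, the function names, the "Q:/A:" layout, and the single few-shot example are illustrative choices rather than a prescribed format. The call to the model itself is left out because it depends on your provider.

```python
import re
from typing import List, Tuple

SEPARATOR = "####"  # assumed separator; any unambiguous marker works

COD_INSTRUCTION = (
    "Think step by step, but only keep a minimum draft for each step, "
    "with five words at most. "
    f"Return the final answer after the separator {SEPARATOR}."
)

# Illustrative few-shot example showing the terse draft style (an assumption).
FEW_SHOT = [
    (
        "Jason had 20 lollipops. He gave Denny some. Now he has 12. "
        "How many did he give Denny?",
        f"20 - x = 12; x = 8 {SEPARATOR} 8",
    ),
]


def build_cod_prompt(question: str) -> str:
    """Assemble instruction, few-shot examples, and the new question."""
    parts = [COD_INSTRUCTION, ""]
    for example_question, example_answer in FEW_SHOT:
        parts += [f"Q: {example_question}", f"A: {example_answer}", ""]
    parts += [f"Q: {question}", "A:"]
    return "\n".join(parts)


def parse_cod_response(response: str) -> Tuple[List[str], str]:
    """Split a response into draft steps and the final answer."""
    draft, sep, answer = response.rpartition(SEPARATOR)
    if not sep:
        # No separator found: treat the whole response as the answer.
        return [], response.strip()
    # Draft steps are assumed to be separated by semicolons or newlines.
    steps = [s.strip() for s in re.split(r"[;\n]", draft) if s.strip()]
    return steps, answer.strip()


def overlong_steps(steps: List[str], max_words: int = 5) -> List[str]:
    """Return the draft steps that exceed the word budget."""
    return [s for s in steps if len(s.split()) > max_words]
```

Checking the drafts with overlong_steps during development is one way to see whether the model actually follows the word limit. If it often does not, adding more few-shot examples may help.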
When to use
- High-volume requests: Lower token costs
- Latency-sensitive apps: Faster generation
- Multi-step reasoning: Shorter intermediate steps per problem
- Resource-constrained deployments: Reduced compute requirements
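To judge whether the switch is worthwhile for a given workload, a back-of-the-envelope estimate can help. The sketch below uses the 7.6% figure from the summary as a best-case ratio. The function name, the per-request token count, and the price are hypothetical inputs you would replace with your own measurements. It counts only the generated reasoning tokens, not the prompt tokens.

```python
from dataclasses import dataclass

# Best-case CoD/CoT token ratio quoted in the summary; real ratios vary by task.
COD_TOKEN_RATIO = 0.076


@dataclass
class SavingsEstimate:
    cot_tokens: int
    cod_tokens: int
    tokens_saved: int
    cost_saved: float


def estimate_savings(
    requests: int,
    cot_tokens_per_request: int,
    price_per_1k_output_tokens: float,  # hypothetical price, in your currency
    ratio: float = COD_TOKEN_RATIO,
) -> SavingsEstimate:
    """Estimate output-token and cost savings from switching CoT to CoD."""
    cot_total = requests * cot_tokens_per_request
    cod_total = round(cot_total * ratio)
    saved = cot_total - cod_total
    cost_saved = saved / 1000 * price_per_1k_output_tokens
    return SavingsEstimate(cot_total, cod_total, saved, cost_saved)


if __name__ == "__main__":
    # Example: one million requests, 200 reasoning tokens each under CoT.
    print(estimate_savings(1_000_000, 200, 0.01))
```

Because 7.6% is the most favorable figure, it is safer to measure the ratio on a sample of your own traffic and pass it in as ratio.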
Best for
- Cost-sensitive applications
- Real-time response scenarios
- High-throughput batch processing