Summary
Context engineering is the practice of curating the smallest high-signal context set for each turn instead of stuffing everything into the prompt. It treats context as a finite resource to be budgeted, scored, and refreshed rather than a dumping ground for every potentially useful fact.
How it works
- Inventory context sources -- list all possible context inputs: conversation history, retrieved documents, tool outputs, system prompts, user profile data.
- Score by relevance -- for each turn, score each candidate item from those sources on its relevance to the current query or task.
- Budget allocation -- allocate the limited context window (a token budget) in score order, so the most critical information fits first.
- Summarize if needed -- when relevant context exceeds the budget, summarize the lowest-scoring items so their signal is kept at a lower token cost.
- Deliver -- assemble the final context set and deliver it to the model in a consistent structure.
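
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `ContextItem`, `estimate_tokens`, `relevance`, `truncate`, and `build_context` are hypothetical names, the word-overlap score stands in for whatever relevance model a real system would use, word counts stand in for a real tokenizer, and truncation stands in for a real summarizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    source: str  # e.g. "history", "retrieved_doc", "tool_output"
    text: str


def estimate_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word (crude tokenizer stand-in).
    return len(text.split())


def relevance(item: ContextItem, query: str) -> float:
    # Toy score: fraction of distinct query words that also appear in the item.
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    item_words = set(item.text.lower().split())
    return len(query_words & item_words) / len(query_words)


def truncate(text: str, max_tokens: int) -> str:
    # Placeholder "summarizer": keep only the first max_tokens words.
    return " ".join(text.split()[:max_tokens])


def build_context(items: list[ContextItem], query: str, budget: int) -> str:
    # Score every item, then walk them from most to least relevant.
    scored = [(relevance(item, query), item) for item in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected: list[ContextItem] = []
    remaining = budget
    for score, item in scored:
        if score == 0 or remaining <= 0:
            continue  # skip irrelevant items and anything past the budget
        cost = estimate_tokens(item.text)
        if cost <= remaining:
            selected.append(item)
            remaining -= cost
        else:
            # Item does not fit: shrink it to the budget that is left.
            selected.append(ContextItem(item.source, truncate(item.text, remaining)))
            remaining = 0

    # Deliver in one consistent structure: "[source] text", one item per line.
    return "\n".join(f"[{item.source}] {item.text}" for item in selected)


items = [
    ContextItem("history", "user asked about the weather yesterday"),
    ContextItem("retrieved_doc", "refund policy allows refund within 30 days of purchase"),
    ContextItem("tool_output", "order 123 refund status pending"),
]
print(build_context(items, "refund status for order", budget=8))
```

With a budget of 8 words, the tool output fits whole, the policy document is cut down to the remaining 3 words, and the unrelated history item is left out entirely.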
Key principles
- Minimal viable context: Only include context the model demonstrably needs for the current turn.
- Stale context detection: Flag and discard context that has been superseded by a newer item or is older than a set age threshold.
- Budget enforcement: Treat the context window as a hard constraint and prioritize aggressively.
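
Stale context detection can be illustrated with a small filter. This is a sketch that assumes each item carries a creation timestamp and a `key` naming what it describes; `TimedItem`, `drop_stale`, and the key format are hypothetical, and the current time is passed in explicitly so the behavior is easy to test.

```python
from dataclasses import dataclass


@dataclass
class TimedItem:
    key: str           # what the item describes, e.g. "order_status:123" (assumed format)
    text: str
    created_at: float  # seconds since some epoch


def drop_stale(items: list[TimedItem], now: float, max_age_seconds: float) -> list[TimedItem]:
    latest: dict[str, TimedItem] = {}
    for item in items:
        if now - item.created_at > max_age_seconds:
            continue  # older than the age threshold
        current = latest.get(item.key)
        if current is None or item.created_at > current.created_at:
            latest[item.key] = item  # a newer item supersedes an older one with the same key
    # Return survivors oldest-first so the assembled context keeps chronological order.
    return sorted(latest.values(), key=lambda item: item.created_at)


items = [
    TimedItem("order_status:123", "pending", created_at=100),
    TimedItem("order_status:123", "shipped", created_at=900),
    TimedItem("profile", "prefers email", created_at=10),
]
fresh = drop_stale(items, now=1000, max_age_seconds=600)
print([item.text for item in fresh])
```

Here the "pending" status is both superseded and too old, and the profile entry exceeds the age threshold, so only "shipped" survives. In practice the threshold would likely differ per source, since a user profile usually ages more slowly than a tool result.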
Common pitfalls
- Overstuffing: Including too much context dilutes signal and increases cost and latency.
- Stale tool output: Past tool results that no longer reflect the current state mislead the model.
- Redundant context: Repeating information across different context sources wastes budget.
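
Redundant context is the easiest of these pitfalls to check mechanically. The sketch below removes snippets that are identical after normalizing case and whitespace; `normalize` and `dedupe` are hypothetical helpers, and exact matching is an assumption chosen for brevity. Paraphrased duplicates would need a similarity measure that this example does not attempt.

```python
import re


def normalize(text: str) -> str:
    # Lowercase and collapse runs of whitespace so trivially different copies compare equal.
    return re.sub(r"\s+", " ", text.strip().lower())


def dedupe(snippets: list[str]) -> list[str]:
    seen: set[str] = set()
    unique: list[str] = []
    for snippet in snippets:
        fingerprint = normalize(snippet)
        if fingerprint in seen:
            continue  # already present from another source
        seen.add(fingerprint)
        unique.append(snippet)  # keep the first occurrence, in original order
    return unique


snippets = [
    "Order 123 is pending.",
    "order 123  is pending.",
    "Refunds take 5 days.",
]
print(dedupe(snippets))
```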