Summary

Context engineering is the practice of curating the smallest high-signal context set for each turn instead of stuffing everything into the prompt. It treats context as a finite resource to be budgeted, scored, and refreshed rather than a dumping ground for every potentially useful fact.

How it works

  1. Inventory context sources -- list all possible context inputs: conversation history, retrieved documents, tool outputs, system prompts, user profile data.
  2. Score by relevance -- for each turn, score each context source on its relevance to the current query or task.
  3. Budget allocation -- divide the limited context window among sources according to their scores, so the most critical information is placed first and always fits.
  4. Summarize if needed -- when relevant context exceeds the budget, compress the lowest-scoring items into short summaries so their signal is preserved at a lower cost.
  5. Deliver -- assemble the final context set and deliver it to the model in a consistent structure.
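
The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `ContextItem`, `estimate_tokens`, `score`, `summarize`, and `build_context` are hypothetical names, token counts are approximated by word counts, relevance is plain word overlap, and "summarization" is simple truncation standing in for a model call.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    source: str  # e.g. "history", "retrieved_doc", "tool_output"
    text: str


def estimate_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # A real system would use the model's tokenizer.
    return len(text.split())


def score(item: ContextItem, query: str) -> float:
    # Toy relevance score: fraction of query words found in the item.
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    item_words = set(item.text.lower().split())
    return len(query_words & item_words) / len(query_words)


def summarize(text: str, max_tokens: int) -> str:
    # Placeholder summarizer: keeps the first max_tokens words.
    return " ".join(text.split()[:max_tokens])


def build_context(items: list[ContextItem], query: str, budget: int) -> str:
    # Step 2: rank every item by relevance to the current query.
    ranked = sorted(items, key=lambda it: score(it, query), reverse=True)
    selected = []
    remaining = budget
    for item in ranked:
        if remaining <= 0:
            break  # Step 3: the budget is a hard limit.
        cost = estimate_tokens(item.text)
        if cost <= remaining:
            selected.append(item)
            remaining -= cost
        else:
            # Step 4: the item does not fit whole, so shrink it
            # to whatever budget is left.
            short = summarize(item.text, remaining)
            selected.append(ContextItem(item.source, short))
            remaining -= estimate_tokens(short)
    # Step 5: deliver in one consistent, labeled structure.
    return "\n".join(f"[{it.source}] {it.text}" for it in selected)
```

Because items are visited from highest to lowest score, the ones that get shortened or dropped are always the lowest-ranked ones. An empty item list simply yields an empty string.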

Key principles

  • Minimal viable context: Only include context the model demonstrably needs for the current turn.
  • Stale context detection: Flag and discard context that has been superseded by a newer value or is older than a configured age threshold.
  • Budget enforcement: Treat the context window as a hard constraint and prioritize aggressively.
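
Stale context detection in particular is easy to make concrete. The sketch below assumes each item carries a `key` naming what it describes and a `created_at` timestamp; `TimedItem` and `drop_stale` are hypothetical names, and both fields are illustrative rather than part of any specific framework.

```python
from dataclasses import dataclass


@dataclass
class TimedItem:
    key: str           # what the item describes, e.g. "order_status"
    text: str
    created_at: float  # seconds since epoch


def drop_stale(items: list[TimedItem], now: float, max_age_seconds: float) -> list[TimedItem]:
    # Superseded: keep only the newest item for each key.
    newest: dict[str, TimedItem] = {}
    for item in items:
        current = newest.get(item.key)
        if current is None or item.created_at > current.created_at:
            newest[item.key] = item
    # Too old: discard anything past the age threshold.
    fresh = [it for it in newest.values() if now - it.created_at <= max_age_seconds]
    # Return in chronological order for a stable layout.
    return sorted(fresh, key=lambda it: it.created_at)
```

Running a filter like this before scoring means superseded tool results never compete for budget in the first place.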

Common pitfalls

  • Overstuffing: Including too much context dilutes signal and increases cost and latency.
  • Stale tool output: Past tool results that no longer reflect the current state mislead the model.
  • Redundant context: Repeating information across different context sources wastes budget.
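
Redundant context is the simplest of these pitfalls to guard against mechanically. As a rough sketch, the hypothetical `dedupe` helper below removes snippets that are identical after case and whitespace normalization; catching paraphrased duplicates would need a similarity measure, which is outside this example.

```python
def dedupe(snippets: list[str]) -> list[str]:
    seen = set()
    unique = []
    for snippet in snippets:
        # Normalize case and collapse whitespace before comparing.
        normalized = " ".join(snippet.lower().split())
        if normalized in seen:
            continue  # already present from another source
        seen.add(normalized)
        unique.append(snippet)  # keep the first occurrence as written
    return unique
```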

Build This Pattern

Copy this prompt and paste it into Claude Code, OpenCode, Codex, or Cursor to implement this pattern.

Build me a context engineering workflow. Architecture: implement a context manager that filters, truncates and prioritizes context before each LLM call. Use a scoring system to rank context items by relevance. Include budget limits for total context size. Error handling: when context budget is exceeded, implement a summarization fallback. Edge cases: handle empty context gracefully, detect stale context, and prevent context poisoning from tool outputs. Best practices: log context composition per turn, measure signal-to-noise ratio. Testing: verify context budgets with max-size inputs.