Memory Tiering

Summary

Memory tiering organizes agent state into three tiers with distinct persistence and access rules. Working memory holds current-turn context, session memory spans the active conversation, and long-term memory persists across sessions.

How it works

Three tiers -- each tier has its own storage backend, retention policy, and access semantics.
Read/write rules -- agents can only access tiers appropriate to the current operation.
Promotion and eviction -- data moves between tiers based on usage patterns and importance scores.

Tiers

Working memory: Current turn state, scratchpad, intermediate results. Volatile, cleared after each response.
Session memory: Conversation history, user preferences for the session, task queue. Retained for the session duration.
Long-term memory: User identity, learned facts, persistent preferences. Stored in durable storage across sessions.
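The three lifetimes above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the names Tier, TieredMemory, end_turn, and end_session are hypothetical, and plain in-process dictionaries stand in for the storage backends, so the long-term tier here is not actually durable.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict


class Tier(Enum):
    WORKING = "working"      # current turn only
    SESSION = "session"      # active conversation
    LONG_TERM = "long_term"  # persists across sessions


@dataclass
class TieredMemory:
    # Assumption: plain dicts stand in for real backends. In practice
    # long-term memory would live in durable storage, not process memory.
    stores: Dict[Tier, Dict[str, Any]] = field(
        default_factory=lambda: {tier: {} for tier in Tier}
    )

    def write(self, tier: Tier, key: str, value: Any) -> None:
        self.stores[tier][key] = value

    def read(self, tier: Tier, key: str, default: Any = None) -> Any:
        return self.stores[tier].get(key, default)

    def end_turn(self) -> None:
        """Working memory is volatile: clear it after each response."""
        self.stores[Tier.WORKING].clear()

    def end_session(self) -> None:
        """Session memory lasts only as long as the conversation."""
        self.end_turn()
        self.stores[Tier.SESSION].clear()


memory = TieredMemory()
memory.write(Tier.WORKING, "scratchpad", "intermediate result")
memory.write(Tier.SESSION, "task_queue", ["summarize report"])
memory.write(Tier.LONG_TERM, "preferred_language", "en")

memory.end_turn()     # scratchpad is cleared, task queue remains
memory.end_session()  # task queue is cleared, preference remains
```

After both calls only the long-term entry is still readable, which matches the retention described for each tier.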

Operations

Read/write rules per tier: Working memory (tier 1) is read-write for the current turn only. Session memory (tier 2) is read-write for the duration of the session. Long-term memory (tier 3) is append-mostly with limited mutation.
Promotion/demotion: Frequently accessed session facts may be promoted to long-term memory. Stale long-term facts may be demoted or archived.
Eviction: Least recently used data is evicted first when tier capacity is reached. Eviction triggers a compaction or archival step.
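The eviction and promotion rules above could look roughly like the following sketch. Everything named here is an assumption made for illustration: LRUTier, promote_frequent, the on_evict hook, and the min_hits threshold of 3 are hypothetical, a read counter stands in for "frequently accessed", and "append-mostly" is interpreted as never overwriting an existing long-term key.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, List, Optional


class LRUTier:
    """A fixed-capacity tier that evicts the least recently used entry."""

    def __init__(self, capacity: int,
                 on_evict: Optional[Callable[[str, Any], None]] = None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.on_evict = on_evict        # archival or compaction hook
        self.items: "OrderedDict[str, Any]" = OrderedDict()
        self.hits: Dict[str, int] = {}  # read count per key

    def put(self, key: str, value: Any) -> None:
        self.items[key] = value
        self.items.move_to_end(key)     # mark as most recently used
        self.hits.setdefault(key, 0)
        while len(self.items) > self.capacity:
            # The oldest entry sits at the front of the OrderedDict.
            old_key, old_value = self.items.popitem(last=False)
            self.hits.pop(old_key, None)
            if self.on_evict is not None:
                self.on_evict(old_key, old_value)

    def get(self, key: str) -> Any:
        if key not in self.items:
            return None
        self.items.move_to_end(key)
        self.hits[key] += 1
        return self.items[key]


def promote_frequent(session: LRUTier, long_term: Dict[str, Any],
                     min_hits: int = 3) -> List[str]:
    """Copy session facts read at least min_hits times into long-term memory."""
    promoted = []
    for key, count in session.hits.items():
        if count >= min_hits and key not in long_term:
            long_term[key] = session.items[key]  # append only, never overwrite
            promoted.append(key)
    return promoted


archive: Dict[str, Any] = {}
long_term: Dict[str, Any] = {}
session = LRUTier(capacity=2, on_evict=archive.__setitem__)

session.put("timezone", "UTC")
session.put("draft_title", "Q3 report")
for _ in range(3):
    session.get("timezone")          # frequently accessed fact
session.put("tone", "formal")        # over capacity: evicts "draft_title"

promoted = promote_frequent(session, long_term)
```

In this run "draft_title" is the least recently used entry when the third fact arrives, so it is handed to the archive, while "timezone" crosses the read threshold and is copied into long-term memory. A real system would likely combine the counter with the importance scores mentioned earlier, which this sketch leaves out.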

Build This Pattern

Copy this prompt and paste it into Claude Code, OpenCode, Codex, or Cursor to implement this pattern.

Build me a memory tiering system. Architecture: three tiers - WORKING (current turn context), SESSION (conversation/task history for active session), LONG-TERM (persistent knowledge across sessions). Each tier has size limits and eviction policies. Error handling: handle memory corruption across tiers, conflicting memories. Edge cases: handle memory that spans tiers, promotion/demotion between tiers. Best practices: log memory operations with tier, size, access frequency. Testing: verify correct memory retrieval from the appropriate tier.