A local, append-only conversational memory system for AI coding agents.
Agent Memory enables AI agents to answer questions like "what were we talking about last week?" without scanning through entire conversation histories. It provides:
- TOC-based Navigation: Time-hierarchical Table of Contents (Year → Month → Week → Day → Segment) for efficient drill-down
- Grips for Provenance: Excerpts linked to source events for verifiable citations
- Append-only Storage: Immutable event log with RocksDB for durability
- Hook-based Ingestion: Passive capture from Claude Code, OpenCode, Gemini CLI hooks
- gRPC API: High-performance interface for agent integration
The fundamental insight behind Agent Memory is that agents should search memory the same way they search codebases - through intelligent, hierarchical exploration rather than brute-force scanning.
This approach mirrors Progressive Disclosure Architecture (PDA), the same pattern used in well-designed Agentic Skills. Just as a skill progressively reveals complexity only when needed, Agent Memory progressively reveals conversation detail only when relevant:
- Agentic Skills: Start with a simple interface, reveal advanced options as the agent needs them
- Agent Memory: Start with high-level summaries, reveal raw events as the agent drills down
The key principle: Agentic search beats brute-force scanning. Instead of loading thousands of conversation events into context, an agent navigates a structured hierarchy, reading summaries at each level until it finds the area of interest, then drilling down for details.
This is how humans naturally search through information - you don't read every email to find a conversation from last week; you filter by date, scan subjects, then open the relevant thread.
Agent Memory implements a 5-step progressive disclosure pattern for memory navigation:
| Step | Level | What the Agent Sees | Decision |
|---|---|---|---|
| 1 | Year | "2024: 847 conversations about auth, databases, Rust" | Too broad → drill down |
| 2 | Month | "January: 156 conversations, heavy focus on authentication" | Promising → drill down |
| 3 | Week | "Week 3: JWT implementation, OAuth2 integration" | This is it → drill down |
| 4 | Day | "Thursday: Debugged JWT token expiration issue" | Found it → drill down |
| 5 | Segment/Grip | Actual conversation excerpt with event links | Verify → expand if needed |
At each level, the agent reads a summary (title, bullets, keywords) and decides whether to:
- Drill down: This area looks relevant, explore children
- Move laterally: Check sibling nodes for better matches
- Expand grip: Found the answer, get the raw events for verification
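A minimal sketch of that drill-down loop is shown below. The client trait, node shape, and relevance check are simplified placeholders rather than the actual memory-client API; in particular, grips hang off individual bullets in the real model, while this sketch attaches them to the node.

```rust
// Simplified sketch of the drill-down loop. `MemoryClient` and `NodeSummary`
// are placeholders, not the actual memory-client types.
struct NodeSummary {
    title: String,
    bullets: Vec<String>,
    keywords: Vec<String>,
    children: Vec<String>, // child node IDs for drill-down
    grip_ids: Vec<String>, // provenance anchors reachable from this node
}

trait MemoryClient {
    fn get_toc_root(&self) -> Vec<NodeSummary>;
    fn get_node(&self, node_id: &str) -> NodeSummary;
    fn expand_grip(&self, grip_id: &str) -> String; // excerpt plus raw events
}

fn find_relevant_excerpt(client: &impl MemoryClient, query_terms: &[&str]) -> Option<String> {
    // Start from the top-level periods and work down one summary at a time.
    let mut frontier = client.get_toc_root();
    while let Some(node) = frontier.pop() {
        let relevant = node
            .keywords
            .iter()
            .any(|kw| query_terms.iter().any(|term| kw.contains(term)));
        if !relevant {
            continue; // move laterally: the next sibling is already on the frontier
        }
        if let Some(grip_id) = node.grip_ids.first() {
            // Found a promising summary: expand its grip to verify against raw events.
            return Some(client.expand_grip(grip_id));
        }
        // Otherwise drill down: queue the children so their summaries get read next.
        for child_id in &node.children {
            frontier.push(client.get_node(child_id));
        }
    }
    None
}
```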
Think about how you find an important email from last month:
- You don't: Read every email from the beginning of time
- You do: Filter to "last month" (time-based narrowing)
- You do: Scan subject lines for keywords (summary-based search)
- You do: Open the thread that looks right (drill-down)
- You do: Read the specific message (raw content access)
Agent Memory gives AI agents the same efficient search pattern, but structured for programmatic access via gRPC.
Here's how an agent navigates the TOC to answer this query:
Agent Query: "What did we discuss about authentication last week?"
Step 1: GetTocRoot()
─────────────────────
Returns: [2024, 2023]
Agent thinks: "User said 'last week', so 2024 is current. Drill down."
Step 2: GetNode("2024")
───────────────────────
Returns:
title: "2024"
bullets: ["Authentication system redesign", "Database migration", "Rust learning"]
keywords: ["auth", "jwt", "postgres", "rust"]
children: [Jan, Feb, Mar...]
Agent thinks: "Auth mentioned. Current month is January. Drill down to January."
Step 3: GetNode("2024-01")
──────────────────────────
Returns:
title: "January 2024"
bullets: ["JWT implementation completed", "OAuth2 provider integration", "Session management refactor"]
keywords: ["jwt", "oauth", "session", "token"]
children: [Week1, Week2, Week3, Week4]
Agent thinks: "JWT and OAuth - definitely auth related. 'Last week' = Week 3. Drill down."
Step 4: GetNode("2024-W03")
───────────────────────────
Returns:
title: "Week of January 15-21"
bullets:
- "Implemented JWT token refresh logic" [grip: grip-001]
- "Fixed OAuth2 callback URL handling" [grip: grip-002]
- "Added session timeout configuration" [grip: grip-003]
keywords: ["jwt", "refresh", "oauth", "callback", "session", "timeout"]
children: [Mon, Tue, Wed, Thu, Fri]
Agent thinks: "Found it! These bullets answer the question. Let me verify with grips."
Step 5: ExpandGrip("grip-001")
──────────────────────────────
Returns:
excerpt: "User asked how to handle JWT refresh tokens without logging out"
events: [
{role: "user", text: "How do we refresh JWT tokens without forcing logout?"},
{role: "assistant", text: "We can implement a sliding window refresh..."},
{role: "user", text: "What about the race condition when multiple tabs..."}
]
Agent thinks: "Verified. I can now answer with confidence and cite the source."
Final Answer:
"Last week we discussed JWT token refresh implementation. Specifically, you asked
about refreshing tokens without forcing logout, and we solved the multi-tab race
condition using a sliding window approach. [Source: conversation from January 18]"
| Approach | Tokens Used | Accuracy | Verifiability |
|---|---|---|---|
| Brute-force scan | 50,000+ | Medium | High (has source) |
| Vector similarity alone | 2,000 | Medium | Low (no context) |
| PDA navigation | 500 | High | High (grips link to source) |
| PDA + Vector teleport | 300 | High | High (best of both) |
Vector search alone might return "JWT refresh logic" as a match, but without the surrounding context, the agent can't verify if it's the right conversation or understand the full discussion. PDA gives both the answer AND the provenance.
In Phase 2, we add vector and BM25 indexes as teleport accelerators - they help the agent jump directly to promising TOC nodes, but the agent still navigates the hierarchy to get context. This combines the speed of similarity search with the verifiability of structured navigation.
┌─────────────────────────────────────────────────────────────────┐
│ AI Agent (Claude Code, etc.) │
└─────────────────────────────────────────────────────────────────┘
│
│ gRPC
▼
┌─────────────────────────────────────────────────────────────────┐
│ Memory Daemon │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ Ingestion │ │ Query │ │ TOC Builder │ │
│ │ Service │ │ Service │ │ (Background) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
│ │ │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Storage Layer (RocksDB) │ │
│ │ ┌────────┐ ┌──────────┐ ┌───────┐ ┌────────┐ ┌─────────┐ │ │
│ │ │ Events │ │ TOC Nodes│ │ Grips │ │ Outbox │ │Checkpts │ │ │
│ │ └────────┘ └──────────┘ └───────┘ └────────┘ └─────────┘ │ │
│ └───────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
│ Hooks
▼
┌─────────────────────────────────────────────────────────────────┐
│ Hook Handlers │
│ (code_agent_context_hooks - external repository) │
└─────────────────────────────────────────────────────────────────┘
The TOC is a time-based hierarchy that summarizes conversations:
2024 (Year)
├── January (Month)
│ ├── Week 1 (Week)
│ │ ├── Monday (Day)
│ │ │ ├── Segment 1: "Discussed auth implementation"
│ │ │ └── Segment 2: "Debugged database connection"
│ │ └── Tuesday (Day)
│ │ └── ...
│ └── Week 2
│ └── ...
└── February
└── ...
Each node contains:
- Title: Human-readable period name
- Bullets: Summary points with linked grips
- Keywords: For search/filtering
- Children: Links to child nodes for drill-down
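Sketched as a type, a node might look like the following; field names are inferred from the list above and the CLI examples, and the actual definitions in memory-types may differ:

```rust
// Sketch of a TOC node shape inferred from the fields above; illustrative only.
struct Bullet {
    text: String,            // e.g. "Implemented JWT token refresh logic"
    grip_id: Option<String>, // e.g. "grip-001", expandable for provenance
}

struct TocNode {
    node_id: String,       // e.g. "toc:year:2026" (format from the CLI examples)
    title: String,         // human-readable period name, e.g. "Week of January 15-21"
    bullets: Vec<Bullet>,  // summary points, each optionally anchored to a grip
    keywords: Vec<String>, // e.g. ["jwt", "refresh", "oauth"]
    children: Vec<String>, // child node IDs for drill-down
}
```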
Grips anchor summary bullets to source evidence:
Grip {
excerpt: "User asked about Rust memory safety",
event_id_start: "evt:1706540400000:01HN4QXKN6...",
event_id_end: "evt:1706540500000:01HN4QXYZ...",
timestamp: 2024-01-29T10:00:00Z,
source: "segment_summarizer"
}
When an agent reads a summary bullet, it can expand the grip to see the original conversation context.
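As a rough illustration of how a grip's event range can be resolved against the append-only log, here is a sketch using an in-memory ordered map in place of RocksDB. The key format comes from the example above; the storage API shown is an assumption, not the memory-storage crate.

```rust
use std::collections::BTreeMap;

// Grip fields as shown above; only the anchor range matters for expansion.
struct Grip {
    excerpt: String,
    event_id_start: String, // "evt:<timestamp_ms>:<ulid>"
    event_id_end: String,
}

// Because event keys sort by timestamp, expanding a grip is an inclusive
// range scan between its start and end anchors (RocksDB offers the same
// ordered iteration over the events column family).
fn expand_grip(events: &BTreeMap<String, String>, grip: &Grip) -> Vec<String> {
    events
        .range(grip.event_id_start.clone()..=grip.event_id_end.clone())
        .map(|(_id, text)| text.clone())
        .collect()
}
```

A real expansion would also pull a few events before and after the range for context, as the CLI's `--before` and `--after` flags suggest.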
Events are the immutable records of agent interactions:
Event {
event_id: "01HN4QXKN6YWXVKZ3JMHP4BCDE",
session_id: "session-123",
timestamp: 2024-01-29T10:00:00Z,
event_type: "user_message",
role: "user",
text: "How does Rust prevent memory leaks?",
metadata: {"project": "agent-memory"}
}
- Rust 1.82+
- protoc (Protocol Buffers compiler)
```bash
cargo build --release
```

```bash
# Start with defaults (port 50051, db at ~/.memory-store)
./target/release/memory-daemon start

# Start with custom settings
./target/release/memory-daemon start --port 50052 --db-path /path/to/db

# Stop the daemon
./target/release/memory-daemon stop

# Check if running
./target/release/memory-daemon status
```

Settings can be provided via (highest priority first):
- Command-line flags
- Environment variables (MEMORY_* prefix)
- Config file (~/.config/memory-daemon/config.toml)
- Defaults
Environment variables:
- `MEMORY_PORT` - gRPC port (default: 50051)
- `MEMORY_DB_PATH` - RocksDB path (default: ~/.memory-store)
- `MEMORY_LOG_LEVEL` - Log verbosity (default: info)
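The precedence rule can be pictured as a small resolver. This is an illustration only, mirroring the flag, environment variable, and default listed above; it is not the daemon's configuration code.

```rust
// Illustrative resolution of the gRPC port: CLI flag, then MEMORY_PORT,
// then the config file value, then the built-in default of 50051.
fn resolve_port(cli_flag: Option<u16>, config_file_value: Option<u16>) -> u16 {
    cli_flag
        .or_else(|| {
            std::env::var("MEMORY_PORT")
                .ok()
                .and_then(|v| v.parse().ok())
        })
        .or(config_file_value)
        .unwrap_or(50051)
}
```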
```bash
# Query commands (connect to running daemon)
memory-daemon query --endpoint http://[::1]:50051 root
memory-daemon query --endpoint http://[::1]:50051 node --node-id "toc:year:2026"
memory-daemon query --endpoint http://[::1]:50051 browse --parent-id "toc:year:2026" --limit 10
memory-daemon query --endpoint http://[::1]:50051 events --from 1706745600000 --to 1706832000000 --limit 100
memory-daemon query --endpoint http://[::1]:50051 expand --grip-id "grip:123:abc" --before 3 --after 3

# Admin commands (direct storage access)
memory-daemon admin --db-path ~/.memory-store stats
memory-daemon admin --db-path ~/.memory-store compact
memory-daemon admin --db-path ~/.memory-store compact --cf events
```

```bash
./scripts/demo.sh
```

This script starts the daemon, ingests sample events, and demonstrates querying.
agent-memory/
├── crates/ # Rust crates (server, client, shared)
│ ├── memory-daemon/ # Server binary
│ ├── memory-service/ # gRPC service implementation
│ ├── memory-client/ # Client library for hook handlers
│ ├── memory-storage/ # RocksDB storage layer
│ ├── memory-toc/ # TOC building logic
│ └── memory-types/ # Shared types (Event, TocNode, Grip)
├── plugins/ # Claude Code marketplace plugins
│ └── memory-query-plugin/ # Memory query plugin
│ ├── .claude-plugin/ # Plugin manifest
│ ├── skills/ # Core skill
│ ├── commands/ # Slash commands (/memory-search, etc.)
│ └── agents/ # Autonomous agents
├── proto/
│ └── memory.proto # gRPC service definitions
├── docs/
│ └── README.md # This file
├── scripts/ # Helper scripts
└── .planning/ # Development planning documents
| Crate | Type | Description |
|---|---|---|
| memory-daemon | Binary | gRPC server with start/stop/status commands |
| memory-service | Library | gRPC service implementation |
| memory-client | Library | Client for hook handlers to ingest events |
| memory-storage | Library | RocksDB storage with column families |
| memory-toc | Library | TOC building, summarization, rollups |
| memory-types | Library | Shared domain types |
| Plugin | Description |
|---|---|
| memory-query-plugin | Query past conversations with /memory-search, /memory-recent, /memory-context |
| Phase | Description | Status |
|---|---|---|
| 1. Foundation | Storage, types, gRPC scaffolding, daemon | Complete |
| 2. TOC Building | Segmentation, summarization, hierarchy | Complete |
| 3. Grips & Provenance | Excerpt storage, linking, expansion | Complete |
| 4. Query Layer | Navigation RPCs, event retrieval | Complete |
| 5. Integration | Hook handlers, CLI, admin commands | Complete |
| 6. End-to-End Demo | Full workflow validation | Complete |
| 7. Agentic Plugin | Claude Code plugin with commands, agents | Complete |
| 8. CCH Integration | Automatic event capture via hooks | Complete |
While TOC navigation is the primary search mechanism, Phase 2 adds teleport indexes as accelerators for direct jumps into the hierarchy:
- BM25 Keyword Search (Tantivy) - Full-text search over event content and TOC summaries. Query "JWT refresh" returns matching TOC node IDs and grip IDs, letting the agent teleport directly to relevant time periods.
- Vector Similarity Search (HNSW) - Semantic search using embeddings. Query "how did we handle token expiration" finds conceptually similar conversations even if the exact words weren't used.
Key principle: Teleports return pointers (node IDs, grip IDs), not content. The agent still navigates the TOC to get context and verify relevance. Indexes are disposable accelerators - if they fail or drift, TOC navigation still works.
┌─────────────────────────────────────────────────────────────┐
│ Teleport Indexes │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ BM25 (Tantivy) │ │ Vector (HNSW) │ │
│ │ Keyword search │ │ Semantic search │ │
│ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │
│ └──────────┬───────────────┘ │
│ ▼ │
│ Return node_ids / grip_ids │
│ │ │
│ ▼ │
│ Agent navigates TOC from entry point │
└─────────────────────────────────────────────────────────────┘
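The flow in the diagram can be sketched as follows; the trait and method names are placeholders for whichever client surface ends up exposing teleport queries, not the actual API:

```rust
// Sketch of "teleport returns pointers, agent still navigates for context".
// `TeleportApi` and its methods are illustrative placeholders.
trait TeleportApi {
    fn teleport_query(&self, query: &str) -> Vec<String>;        // TOC node IDs / grip IDs only
    fn get_node_summary(&self, node_id: &str) -> Option<String>; // title + bullets + keywords
}

fn teleport_then_verify(api: &impl TeleportApi, query: &str) -> Vec<String> {
    api.teleport_query(query)
        .iter()
        // The index only supplies entry points; read each node's summary so the
        // agent can verify relevance before drilling further or expanding grips.
        .filter_map(|node_id| api.get_node_summary(node_id))
        .collect()
}
```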
For v2, we're evaluating whether to add a graph database layer to capture relationships that don't fit the time hierarchy:
- Entity relationships: "Project X" mentioned across multiple conversations
- Topic threads: Authentication discussions spanning weeks
- Cross-references: "As we discussed on Tuesday" links
This would complement (not replace) the TOC. The graph would provide alternative navigation paths while TOC remains the primary structure. Technologies under consideration include embedded graph stores or extending RocksDB with graph-like indexes.
| Component | Technology | Rationale |
|---|---|---|
| Language | Rust | Single binary, fast scans, predictable memory |
| Storage | RocksDB | Embedded, fast range scans, column families |
| API | gRPC (tonic) | Clean contract, efficient serialization |
| Summarizer | Pluggable | API (Claude/GPT) or local inference |
Agents interact with memory through these gRPC operations:
| Operation | Description |
|---|---|
| get_toc_root | Top-level time periods |
| get_node(node_id) | Drill into specific period |
| get_events(time_range) | Raw events (last resort) |
| expand_grip(grip_id) | Context around excerpt |
| teleport_query(query) | Index-based jump (v2) |
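From Rust, driving these operations through tonic-generated stubs might look like the sketch below. The module path, client name, and request message shapes are assumptions about what proto/memory.proto generates; only the RPC names and the ID formats come from this document.

```rust
// Sketch of calling the query API through hypothetical tonic-generated stubs.
use tonic::transport::Channel;

// Hypothetical module produced by tonic-build from proto/memory.proto.
use memory_proto::{
    memory_client::MemoryClient, ExpandGripRequest, GetNodeRequest, GetTocRootRequest,
};

async fn recall(endpoint: String) -> Result<(), Box<dyn std::error::Error>> {
    let channel = Channel::from_shared(endpoint)?.connect().await?;
    let mut client = MemoryClient::new(channel);

    // 1. List the top-level time periods.
    let root = client.get_toc_root(GetTocRootRequest {}).await?.into_inner();

    // 2. Drill into a specific period (node ID format from the CLI examples).
    let node = client
        .get_node(GetNodeRequest { node_id: "toc:year:2026".into() })
        .await?
        .into_inner();

    // 3. Expand a grip from one of the node's bullets for provenance,
    //    pulling three events of context on each side.
    let grip = client
        .expand_grip(ExpandGripRequest { grip_id: "grip:123:abc".into(), before: 3, after: 3 })
        .await?
        .into_inner();

    println!("{root:?}\n{node:?}\n{grip:?}");
    Ok(())
}
```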
Events are captured via agent hooks with zero token overhead:
| Hook Event | Memory Event |
|---|---|
| SessionStart | session_start |
| UserPromptSubmit | user_message |
| PostToolUse | tool_result |
| Stop | assistant_stop |
| SubagentStart | subagent_start |
| SubagentStop | subagent_stop |
| SessionEnd | session_end |
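The mapping in this table amounts to a simple lookup. The sketch below is just the table expressed as code, not the memory-ingest source:

```rust
// Hook-to-memory event mapping from the table above; illustrative only.
fn map_hook_event(hook_event_name: &str) -> Option<&'static str> {
    Some(match hook_event_name {
        "SessionStart" => "session_start",
        "UserPromptSubmit" => "user_message",
        "PostToolUse" => "tool_result",
        "Stop" => "assistant_stop",
        "SubagentStart" => "subagent_start",
        "SubagentStop" => "subagent_stop",
        "SessionEnd" => "session_end",
        _ => return None, // unrecognized hooks are simply ignored
    })
}
```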
- TOC as Primary Navigation: Agentic search via hierarchical drill-down beats brute-force scanning
- Append-Only Storage: Immutable truth, no deletion complexity
- gRPC Only: No HTTP server overhead
- Per-Project Stores: Simpler mental model, configurable for unified mode
- Hook-Based Ingestion: Zero token overhead, passive capture
The following are excluded from v1:
- Graph database → Under discussion for v2 (see Phase 3 above)
- Multi-tenant support (single agent, local deployment)
- Delete/update events (append-only truth)
- HTTP API (gRPC only)
- MCP integration (hooks are passive, no token overhead)
Agent Memory integrates with code_agent_context_hooks (CCH) to automatically capture conversation events from Claude Code and other AI coding agents.
```bash
# 1. Build the hook handler
cargo build --release -p memory-ingest

# 2. Install to local bin
mkdir -p ~/.local/bin
cp target/release/memory-ingest ~/.local/bin/

# 3. Start the memory daemon
./target/release/memory-daemon start

# 4. Copy the hooks configuration
cp examples/hooks.yaml ~/.claude/hooks.yaml
```

The memory-ingest binary is a lightweight CCH hook handler that:
- Reads CCH JSON events from stdin
- Maps them to memory events using memory-client
- Sends them to the daemon via gRPC
- Returns {"continue":true} to stdout
The binary is designed to be fast (<100ms) and fail-open - if the daemon is down, it still returns success to avoid blocking Claude.
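A stripped-down sketch of that fail-open contract follows. The real binary parses the hook event and sends it with memory-client over gRPC; here the forwarding step is a stub so the sketch stays self-contained.

```rust
// Sketch of the fail-open contract: whatever happens, print {"continue":true}
// so the calling agent is never blocked.
use std::io::Read;

fn main() {
    let mut raw_event = String::new();
    let _ = std::io::stdin().read_to_string(&mut raw_event);

    // Best-effort forward to the daemon; any error is logged and swallowed.
    if let Err(err) = forward_to_daemon(&raw_event) {
        eprintln!("memory-ingest sketch: dropping event: {err}");
    }

    // Always report success so the agent keeps going.
    println!("{{\"continue\":true}}");
}

fn forward_to_daemon(_raw_event: &str) -> Result<(), String> {
    // Placeholder for: parse the hook JSON, map it to a memory event,
    // and send it to the daemon over gRPC.
    Ok(())
}
```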
CCH sends these events to agent-memory:
| CCH Event | Memory Event | Description |
|---|---|---|
| SessionStart | session_start | New conversation started |
| UserPromptSubmit | user_message | User submitted a prompt |
| PostToolUse | tool_result | Tool execution completed |
| SessionEnd/Stop | session_end | Conversation ended |
| SubagentStart | subagent_start | Subagent spawned |
| SubagentStop | subagent_stop | Subagent completed |
The example hooks.yaml includes:
```yaml
rules:
  - name: capture-to-memory
    matchers:
      operations:
        - SessionStart
        - UserPromptSubmit
        - PostToolUse
        - SessionEnd
        - SubagentStart
        - SubagentStop
    actions:
      run: "~/.local/bin/memory-ingest"
```

```bash
# Test with a sample event (daemon not required for this test)
echo '{"hook_event_name":"UserPromptSubmit","session_id":"test-123","message":"Hello world"}' | ./target/release/memory-ingest
# Expected output: {"continue":true}

# Test with daemon running
./target/release/memory-daemon start
echo '{"hook_event_name":"SessionStart","session_id":"test-123"}' | ./target/release/memory-ingest

# Verify events were captured
./target/release/memory-daemon query events --from 0 --to $(date +%s)000 --limit 10
```

Events not being captured:
- Verify daemon is running: memory-daemon status
- Check binary is installed: which memory-ingest
- Test manually with the echo command above
Daemon connection errors:
- The binary fails open - events are lost but Claude continues
- Check daemon port: default is 50051
- Set custom endpoint: export MEMORY_ENDPOINT="http://localhost:50052"
Hook not triggering:
- Verify hooks.yaml is in the correct location (~/.claude/hooks.yaml for Claude Code)
- Check hooks.yaml syntax with a YAML validator
- Ensure CCH is properly installed and configured
- code_agent_context_hooks - Hook handlers for Claude Code that feed events into this memory system
MIT
See .planning/PROJECT.md for detailed project context and roadmap.