Claude Code is Anthropic's command-line agent for working in codebases. It reads files, writes code, runs commands, and reasons through complex multi-step tasks. It is, in many respects, the best coding agent available today. It has one significant gap: every time you start a new session, it forgets everything from the last one.

Your agent investigated a tricky deployment bug on Monday. On Tuesday, you ask about the same system and it has no idea what you're talking about. You explained your database schema, your naming conventions, your preferred testing patterns — all gone. The context window resets, and you start from scratch.

This post walks through how to give Claude Code persistent memory using Engram and the Model Context Protocol (MCP). By the end, your agent will be able to recall decisions, investigations, and context from past sessions — across projects, across days, across weeks.

The problem: Claude Code forgets everything

Claude Code stores session transcripts as JSONL files on disk at ~/.claude/projects/<project-hash>/<session-id>.jsonl. These files contain the full record of everything that happened in a session — every message, every tool call, every response. But Claude Code doesn't read old session files when starting a new one. Each session begins with a blank context window.
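To see what is already sitting on disk, a short script can walk that directory and count the records in each transcript. This is a minimal sketch that relies only on the layout described above; the function name `summarize_sessions` is ours, and it assumes nothing about the fields inside each JSON record.

```python
import json
from pathlib import Path


def summarize_sessions(projects_dir: Path) -> list[dict]:
    """Return one summary per <project-hash>/<session-id>.jsonl transcript."""
    if not projects_dir.is_dir():
        return []
    summaries = []
    for path in sorted(projects_dir.glob("*/*.jsonl")):
        records = 0
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                try:
                    json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip lines that are not complete JSON
                records += 1
        summaries.append(
            {"project": path.parent.name, "session": path.stem, "records": records}
        )
    return summaries
```

Calling `summarize_sessions(Path.home() / ".claude" / "projects")` returns one dictionary per session file, which makes it easy to see how much history accumulates without ever being read again.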

Claude Code does have a built-in memory system: the CLAUDE.md file. You can write notes there, and the agent reads them at the start of every session. This is useful for static facts such as project conventions, file paths, and API patterns. But it doesn't scale to the rich, evolving context that accumulates over weeks of development: investigation logs, debugging sessions, architectural decisions with their full reasoning, conversations about tradeoffs.

What you actually want is for the agent to be able to search its own past. Not a flat file it reads top to bottom, but a semantic index over every conversation it has ever had, retrievable by meaning at query time.

What is MCP memory?

The Model Context Protocol (MCP) is an open standard for connecting AI agents to external tools and data sources. Instead of building custom integrations, an agent that speaks MCP can discover and use any MCP server — databases, APIs, file systems, or in this case, a memory service.

An MCP memory server gives your agent a small set of operations: create a conversation, append messages to it, search across all stored conversations by meaning, and retrieve specific conversations by ID. The agent decides when to store context and when to search for it. The memory server handles persistence, embedding, indexing, and retrieval.
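Those four operations can be pictured as a small interface. The sketch below is an in-memory toy, not Engram's API: the class name, method names, and fields are all hypothetical, and `search` ranks by shared words purely as a placeholder for the embedding-based search a real memory server performs.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Conversation:
    id: str
    title: str
    messages: list[str] = field(default_factory=list)


class InMemoryMemoryStore:
    """Toy stand-in for a memory server (hypothetical names throughout)."""

    def __init__(self) -> None:
        self._conversations: dict[str, Conversation] = {}

    def create_conversation(self, title: str) -> str:
        conversation_id = uuid.uuid4().hex
        self._conversations[conversation_id] = Conversation(conversation_id, title)
        return conversation_id

    def append_message(self, conversation_id: str, text: str) -> None:
        self._conversations[conversation_id].messages.append(text)

    def get_conversation(self, conversation_id: str) -> Conversation:
        return self._conversations[conversation_id]

    def search(self, query: str, limit: int = 5) -> list[Conversation]:
        # Placeholder ranking: count lowercase words shared with the query.
        query_words = set(query.lower().split())
        scored = []
        for conversation in self._conversations.values():
            text = " ".join([conversation.title, *conversation.messages]).lower()
            overlap = len(query_words & set(text.split()))
            if overlap:
                scored.append((overlap, conversation))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [conversation for _, conversation in scored[:limit]]
```

The point of the shape is the division of labor: the agent only decides when to call `append_message` or `search`, and everything behind those calls belongs to the server.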

Engram is an MCP memory server built for exactly this use case. It stores verbatim conversation transcripts — not extracted summaries or knowledge-graph nodes — and makes them searchable via semantic similarity. When Claude Code asks “what did we decide about the database schema?”, Engram finds the actual conversation where that decision was made and returns the full text.

Setting up Engram with Claude Code

There are two parts to the setup: a background daemon that automatically captures every Claude Code session, and an MCP server connection that lets Claude Code search its memory during a session. You can use either or both.

Step 1: Get an API key

Sign up at getengram.app. The free tier is enough to get started. Copy your API key — it looks like engram_sk_live_....

Step 2: Install the daemon

The Engram daemon watches your Claude Code session files as they're written and syncs them to your account in real time. Install it via Homebrew or npm:

# Homebrew (recommended)
brew tap get-engram/engram
brew install engram
engram auth login engram_sk_live_YOUR_KEY
brew services start engram

# Or via npm
npm install -g @getengram/cli
engram auth login engram_sk_live_YOUR_KEY
engram start --install

The daemon starts on login, restarts if it crashes, and queues messages offline if the network is unavailable. After your next Claude Code session, verify it's working:

engram log

You should see your session listed with the project name, branch, and message count.

Step 3: Connect the MCP server

The daemon handles capture. To let Claude Code search its memory during a session, add Engram as an MCP server. You can do this globally (all projects) or per-project.

Global setup — add to ~/.claude/settings.json:

{
  "mcpServers": {
    "engram": {
      "type": "url",
      "url": "https://mcp.getengram.app/mcp",
      "headers": {
        "Authorization": "Bearer engram_sk_live_YOUR_KEY"
      }
    }
  }
}

Per-project setup — add a .mcp.json file to the project root:

{
  "mcpServers": {
    "engram": {
      "type": "url",
      "url": "https://mcp.getengram.app/mcp",
      "headers": {
        "Authorization": "Bearer engram_sk_live_YOUR_KEY"
      }
    }
  }
}

Start a Claude Code session and ask it to search Engram for anything — if you get results (or an empty results array), the connection is working.
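If the connection does not come up, it can help to rule out a malformed config file before looking elsewhere. The script below is a sketch that checks only the fields shown in the JSON above; `check_engram_entry` is a hypothetical helper of ours, not part of Engram or Claude Code, and it does not contact the server.

```python
import json
from pathlib import Path


def check_engram_entry(config_path: Path) -> list[str]:
    """Return a list of problems found in the config; empty means none found."""
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return [f"{config_path} does not exist"]
    except json.JSONDecodeError as error:
        return [f"{config_path} is not valid JSON: {error}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    entry = servers.get("engram") if isinstance(servers, dict) else None
    if not isinstance(entry, dict):
        return ["no 'engram' object under 'mcpServers'"]

    problems = []
    if not str(entry.get("url", "")).startswith("https://"):
        problems.append("'url' is missing or not an https:// URL")
    headers = entry.get("headers")
    authorization = headers.get("Authorization", "") if isinstance(headers, dict) else ""
    if not str(authorization).startswith("Bearer "):
        problems.append("'Authorization' header should start with 'Bearer '")
    elif str(authorization).endswith("YOUR_KEY"):
        problems.append("'Authorization' still contains the placeholder key")
    return problems
```

Running `check_engram_entry(Path(".mcp.json"))` from the project root returns an empty list when the entry matches the shape above, and a readable list of problems otherwise.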

For the full setup walkthrough, including troubleshooting, see the Claude Code integration guide in the docs.

What it looks like in practice

Once Engram is connected, the change is immediate. Here's a real-world example of cross-session memory in action:

Session 1 (Monday):

You: Let's use Postgres instead of MySQL for the new service.
     The JSONB support is better for the catalog schema.

Claude Code: [searches Engram — no prior context found]
Claude Code: I'll set up Postgres. Creating the schema now...
Claude Code: [stores to Engram:
  title: "Database choice for new service"
  tags: ["architecture", "database"]
  content: decision to use Postgres, reasoning about JSONB support]

Session 2 (Thursday):

You: Set up the database for the new service.

Claude Code: [searches Engram → finds Monday's conversation]
Claude Code: Setting up Postgres — we decided on Monday to use it
             instead of MySQL because of the JSONB support for the
             catalog schema. Creating the migrations now...

No re-explaining. No lost context. The agent searched its memory, found the relevant prior conversation, and picked up where it left off. This works across sessions, across days, and across projects if you use the same API key.

Claude Code understands how Engram complements its built-in memory without being told. When it discovers the MCP tools, it recognizes that Engram is for rich, searchable context — investigation logs, decision reasoning, debugging sessions — while CLAUDE.md is for quick-reference facts. It uses both.

How it works under the hood

Three design decisions define how Engram stores and retrieves memory:

Verbatim storage. Engram stores the full text of every conversation, not extracted facts or summaries. Other memory tools distill “User prefers Postgres” from a conversation. Engram keeps the original: “I tried Postgres 16 but the JSONB GIN indexes were 30% slower than MongoDB for our specific nested-document query shape, we might revisit when 17 ships.” The full context is what you actually want to retrieve six months later when someone asks “should we revisit?”

Semantic search. Every stored conversation is embedded using an embedding model (bge-base-en-v1.5) and indexed in a vector database. When Claude Code searches for “database decision,” it doesn't do keyword matching. It finds conversations that are semantically similar to the query, even if they don't share exact words. A search for “why did we pick Postgres” will find a conversation about “JSONB support for the catalog schema” because the meanings are close.
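To make "semantically similar" concrete, here is a sketch of ranking by vector similarity. It assumes cosine similarity as the distance measure, which is a common choice but not something stated above, and it uses hand-written three-dimensional vectors in place of real bge-base-en-v1.5 embeddings. The function names are ours.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vector: list[float],
    indexed: dict[str, list[float]],
    top_k: int = 3,
) -> list[tuple[float, str]]:
    """Return (score, label) pairs for the top_k most similar vectors."""
    scored = [
        (cosine_similarity(query_vector, vector), label)
        for label, vector in indexed.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (assumed values).
indexed = {
    "JSONB support for the catalog schema": [0.9, 0.1, 0.0],
    "CSS grid layout for the dashboard": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "why did we pick Postgres"
best_score, best_label = rank_by_similarity(query, indexed)[0]
```

With these toy values the Postgres-related entry ranks first even though the query and the stored text share no words, which is the behavior keyword matching cannot give you.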

Chunk-based retrieval. Conversations are split into overlapping chunks (5-message windows with 1-message overlap, so each new window starts four messages after the previous one). Each chunk gets its own embedding. At query time, Engram retrieves the most relevant chunks and hydrates the full message text from the database. This makes retrieval precise: you get the specific part of a long conversation that's relevant, not the entire thing, while still carrying enough surrounding context to be useful.
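The windowing itself is simple to sketch. The version below uses the window and overlap sizes from the text; how a short final window is handled is our assumption (here it is kept as a smaller chunk), and `chunk_messages` is an illustrative name rather than Engram's code.

```python
def chunk_messages(
    messages: list[str], window: int = 5, overlap: int = 1
) -> list[list[str]]:
    """Split messages into windows that share `overlap` messages with the previous one."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap  # 5 - 1 = 4 with the sizes from the text
    chunks = []
    for start in range(0, len(messages), step):
        chunks.append(messages[start:start + window])
        if start + window >= len(messages):
            break  # this window already reached the last message
    return chunks
```

For a nine-message conversation this yields two chunks, messages 0 to 4 and messages 4 to 8, with message 4 appearing in both. That shared message is what keeps a decision and the question that prompted it from being split across a chunk boundary.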

The entire pipeline runs on Cloudflare Workers. Ingestion, embedding, indexing, and retrieval all happen inside a single data center using internal RPC bindings instead of HTTP hops. The result is that a search query — embed, vector lookup, hydrate, return — typically resolves in under 100 ms. For more on the infrastructure choices, see our post on building Engram on Cloudflare Workers.
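The hydrate step in that pipeline can be pictured as a lookup from chunk IDs back to message text. The data layout here (a chunk-to-message-ID map and a message table, both plain dictionaries) is an assumption for illustration, since this post doesn't describe Engram's actual schema. Because adjacent chunks overlap, the sketch skips any message it has already emitted.

```python
def hydrate(
    chunk_hits: list[str],
    chunk_to_message_ids: dict[str, list[str]],
    message_store: dict[str, str],
) -> list[str]:
    """Turn ranked chunk IDs (best first) into message texts without duplicates."""
    seen: set[str] = set()
    texts = []
    for chunk_id in chunk_hits:
        for message_id in chunk_to_message_ids[chunk_id]:
            if message_id in seen:
                continue  # shared by an overlapping chunk already emitted
            seen.add(message_id)
            texts.append(message_store[message_id])
    return texts
```

Keeping only IDs in the vector index and fetching text afterward means the index stays small, and the text returned is always the stored original rather than a copy that could drift.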

Going further with CLAUDE.md

Out of the box, Claude Code will use Engram when it seems useful. But you can make the behavior more deliberate by adding memory instructions to your project's CLAUDE.md file. Since Claude Code reads this file at the start of every session, you can tell it exactly when to search and what to store:

## Engram Memory

You have access to Engram as an MCP server.

### On session start
Search Engram for context relevant to the current task.
Include any relevant results in your working context.

### During the session
When significant decisions are made or complex bugs are investigated,
store the context in Engram with descriptive tags.

### What to store
- Decisions and their reasoning
- Bug investigations and resolutions
- Architecture discussions
- User preferences and workflow patterns

### What NOT to store
- Routine file reads (these are in git)
- Trivial exchanges
- Information already in the codebase

With this in place, the agent searches Engram at the start of every session and stores important context as it goes. No manual intervention needed. The full CLAUDE.md pattern is documented in the integration guide.

Getting started

Engram is free to start with and requires no self-hosting. The setup takes about two minutes:

1. Sign up at getengram.app and grab your API key
2. Install the daemon (brew install engram) to auto-capture sessions
3. Add the MCP config to ~/.claude/settings.json or .mcp.json so Claude Code can search its memory

The getting started guide covers the full setup, and the Claude Code guide goes deeper on the CLAUDE.md patterns, team sharing, and cross-tool memory. The source is on GitHub if you want to see how it all fits together.

Written by the Engram team. Published June 17, 2026. Questions or feedback: hello@getengram.app.