What is mcp-memory?
mcp-memory is a drop-in replacement for Anthropic’s MCP Memory server. It provides a persistent knowledge graph where AI agents store entities, observations, and relationships — and retrieve them across sessions.
It keeps full API compatibility with Anthropic’s 8 tools while adding semantic search, hybrid retrieval, and a dynamic scoring engine. All data is stored in SQLite with WAL mode for safe concurrent access. See the Architecture page for a deep dive into how it works.
Why it exists
The official Anthropic server stores the entire knowledge graph in a single JSONL file. This works for demos, but breaks under real usage:
| Dimension | JSONL (Anthropic) | mcp-memory |
|---|---|---|
| Indexing | None — full file scan on every query | SQLite indexes on name, type, and content |
| Semantic search | Not available | KNN with ONNX embeddings (94+ languages) |
| Hybrid search | Not available | KNN + FTS5 via RRF |
| Query routing | Not available | Dynamic 3-strategy routing (COSINE_HEAVY/LIMBIC_HEAVY/HYBRID_BALANCED) |
| Limbic scoring | Not available | Salience + temporal decay + co-occurrence |
| Entity splitting | Not available | Automatic splitting via semantic clustering, with approval workflow |
| A/B testing | Not available | Shadow mode with NDCG@K metrics |
| Auto-tuning | Not available | Grid search for GAMMA/BETA_SAL optimization |
| Concurrency | Race conditions confirmed | SQLite WAL with 5-second busy timeout |
| Scale | Degrades linearly with file size | O(log n) indexed queries |
| Data corruption | Documented in issues #1819, #2579 (May 2025, still open) | ACID transactions with auto-rollback |
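For context on the hybrid search row: Reciprocal Rank Fusion (RRF) merges the vector (KNN) ranking and the FTS5 keyword ranking by summing reciprocal ranks. In its standard form (the constant k, conventionally 60, is the textbook default, not a value confirmed from mcp-memory's source):

$$
\mathrm{RRF}(d) = \sum_{r \,\in\, \{\text{knn},\,\text{fts}\}} \frac{1}{k + \operatorname{rank}_r(d)}
$$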
The official server rewrites the entire file on every operation. Without locking or atomic writes, concurrent operations can interleave, leaving merged JSON fragments and duplicated lines. mcp-memory solves these problems at the root with a storage engine designed for persistent data.
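To make the concurrency row concrete, here is how any SQLite consumer can opt into WAL mode with a 5-second busy timeout. This is a minimal sketch of the pattern (the schema is illustrative, not mcp-memory's actual schema or initialization code):

```python
import sqlite3

conn = sqlite3.connect("memory.db")
conn.execute("PRAGMA journal_mode=WAL")   # readers and writers stop blocking each other
conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s on a locked database
conn.execute(
    "CREATE TABLE IF NOT EXISTS entities (name TEXT PRIMARY KEY, entity_type TEXT)"
)

# ACID write: if anything inside the block raises, the transaction rolls back.
with conn:
    conn.execute(
        "INSERT OR IGNORE INTO entities (name, entity_type) VALUES (?, ?)",
        ("My Project", "Project"),
    )
```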
Requirements
- Python >= 3.12
- uv (recommended) or pip for dependency management
- Git for cloning the repository
- ~465 MB disk space if you download the embedding model (optional)
- ~50 MB for the test suite (402 passing tests)
Installation
1. Clone the repository
git clone https://github.com/Yarlan1503/mcp-memory.git
cd mcp-memory
2. Install dependencies
uv sync
uv sync creates a virtual environment, resolves all dependencies from pyproject.toml, and installs the mcp-memory entry point.
3. Download the embedding model (optional)
The model is downloaded automatically on first use; the script below is provided for manual or offline setups.
uv run python scripts/download_model.py
This downloads the sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 model (~465 MB) to ~/.cache/mcp-memory-v2/models/:
| File | Purpose |
|---|---|
| model.onnx | ONNX-exported model for CPU inference |
| tokenizer.json | HuggingFace fast tokenizer (Rust) |
| tokenizer_config.json | Tokenizer configuration |
| special_tokens_map.json | Special token mappings |
:::tip
The model download is optional. The server starts and exposes all 19 tools without it; only search_semantic, find_duplicate_observations, and search_reflections actually require the model. See Without the model below.
:::
4. Verify the installation
uv run mcp-memory
The server starts as a stdio process. It registers as "memory" in the MCP protocol, listens for JSON-RPC on stdin, and writes logs to stderr (no interference with MCP communication).
:::note
You won’t see output on stdout — that’s correct. The server communicates via the MCP protocol (JSON-RPC over stdio). Logs go to stderr.
:::
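If you want a scriptable check that the stdio transport answers, a small Python harness can speak the first step of the protocol. This is a hedged sketch: the initialize payload follows the public MCP specification, and nothing in it is specific to mcp-memory.

```python
import json
import subprocess

# Launch the server exactly as an MCP client would.
proc = subprocess.Popen(
    ["uv", "run", "mcp-memory"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

# Standard MCP initialize request (newline-delimited JSON-RPC over stdio).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "smoke-test", "version": "0.0.0"},
    },
}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()

print(proc.stdout.readline())  # the server's initialize result, one JSON line
proc.terminate()
```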
Configuration
OpenCode
Add to the mcp section of your opencode.json:
{
"mcp": {
"memory": {
"command": "uv",
"args": ["--directory", "/path/to/mcp-memory", "run", "mcp-memory"]
}
}
}
Replace /path/to/mcp-memory with the absolute path to the cloned repository.
Claude Desktop
Add to your Claude Desktop config file:
{
"mcpServers": {
"memory": {
"command": "uv",
"args": ["run", "mcp-memory"],
"cwd": "/path/to/mcp-memory"
}
}
}
Replace /path/to/mcp-memory with the absolute path to the cloned repository.
uvx (no clone required)
If you prefer not to clone the repo, run directly from GitHub:
{
"mcpServers": {
"memory": {
"command": "uvx",
"args": ["--from", "git+https://github.com/Yarlan1503/mcp-memory", "mcp-memory"]
}
}
}
:::caution
The uvx method does not support downloading the embedding model. If you need semantic search, clone the repository instead and follow the installation steps above.
:::
First steps
Create entities
Store knowledge as entities with a name, type, and observations:
{
"entities": [
{
"name": "My Project",
"entityType": "Project",
"observations": [
"Built with Astro and Starlight",
"Deployed on Vercel",
"Uses Pagefind for search"
]
}
]
}
If an entity already exists, create_entities merges observations instead of overwriting. Duplicates are discarded.
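The merge is effectively an order-preserving set union over observation strings. A short sketch of the semantics described above (not the server's actual code):

```python
def merge_observations(existing: list[str], incoming: list[str]) -> list[str]:
    """Append only observations not already present, preserving order."""
    seen = set(existing)
    merged = list(existing)
    for obs in incoming:
        if obs not in seen:
            merged.append(obs)
            seen.add(obs)
    return merged

# Re-creating "My Project" with one duplicate and one new observation:
print(merge_observations(
    ["Built with Astro and Starlight", "Deployed on Vercel"],
    ["Deployed on Vercel", "Uses Pagefind for search"],
))
# ['Built with Astro and Starlight', 'Deployed on Vercel', 'Uses Pagefind for search']
```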
Link entities with relations
Connect entities with typed relationships:
{
"relations": [
{
"from": "My Project",
"to": "Astro",
"relationType": "uses"
},
{
"from": "My Project",
"to": "Vercel",
"relationType": "deployed_on"
}
]
}
Both entities must exist before creating a relation between them.
Search by substring
Find entities by keyword across names, types, and observation content:
{
"query": "project"
}
search_nodes uses LIKE pattern matching. It requires no embedding model and returns all entities whose name, type, or observations contain the query string.
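Under the hood this is plain SQL pattern matching. A sketch of the query shape (the table and column names are assumptions for illustration, not mcp-memory's actual schema):

```python
import sqlite3

def search_nodes_like(conn: sqlite3.Connection, query: str) -> list[tuple]:
    # %query% matches the substring anywhere; LIKE is case-insensitive for ASCII.
    pattern = f"%{query}%"
    return conn.execute(
        """
        SELECT name, entity_type
        FROM entities
        WHERE name LIKE :p
           OR entity_type LIKE :p
           OR name IN (SELECT entity_name FROM observations WHERE content LIKE :p)
        """,
        {"p": pattern},
    ).fetchall()
```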
Search by meaning
Find entities that are semantically related to your query, even without matching keywords:
{
"query": "web framework deployment",
"limit": 5
}
search_semantic encodes the query into a 384-dimensional vector and finds the nearest neighbors by cosine similarity. Results are re-ranked by the Limbic Scoring engine, which considers access frequency, recency, and co-occurrence patterns.
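Conceptually, the vector step reduces to a dot product over normalized embeddings. A numpy sketch with the Limbic re-rank collapsed into a single placeholder weight (the real engine combines salience, recency, and co-occurrence):

```python
import numpy as np

def rank_by_cosine(
    query_vec: np.ndarray,      # shape (384,), L2-normalized
    entity_vecs: np.ndarray,    # shape (n_entities, 384), L2-normalized
    limbic_weight: np.ndarray,  # shape (n_entities,), placeholder for the scoring engine
    limit: int = 5,
) -> np.ndarray:
    # For unit-length vectors, cosine similarity is a plain dot product.
    cosine = entity_vecs @ query_vec
    score = cosine * limbic_weight
    return np.argsort(score)[::-1][:limit]  # indices of the top `limit` entities
```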
Split large entities automatically
Entities with many observations can be automatically split into focused sub-entities:
{
"entity_name": "My Project"
}
analyze_entity_split evaluates whether an entity exceeds its per-type threshold (Sesion=15, Proyecto=25, others=20) and uses semantic clustering (Agglomerative + c-TF-IDF fallback) to group observations into topics. If splitting is recommended, propose_entity_split returns suggested new entity names and the relations to create.
{
"entity_name": "My Project",
"approved_splits": [
{
"name": "My Project - Architecture",
"entity_type": "Project",
"observations": ["Stack: FastMCP + SQLite", "MCP Memory v2"]
}
]
}
execute_entity_split creates the new entities, moves observations, and establishes contiene/parte_de relations — all within an atomic transaction.
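The threshold check itself is the simple part of the pipeline. A sketch of just that decision, using the per-type limits documented above (the clustering that follows is covered in the Tools Reference):

```python
# Per-type observation limits, as documented for analyze_entity_split.
SPLIT_THRESHOLDS = {"Sesion": 15, "Proyecto": 25}
DEFAULT_THRESHOLD = 20

def needs_split(entity_type: str, observation_count: int) -> bool:
    return observation_count > SPLIT_THRESHOLDS.get(entity_type, DEFAULT_THRESHOLD)
```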
:::tip
For the full entity splitting workflow and semantic clustering topic extraction details, see the Tools Reference page.
:::
Without the model
The server works without the embedding model downloaded. Here’s what changes:
| Feature | Without model | With model |
|---|---|---|
| create_entities | ✅ Works | ✅ Works + generates embedding |
| create_relations | ✅ Works | ✅ Works |
| add_observations | ✅ Works | ✅ Works + regenerates embedding |
| delete_entities | ✅ Works | ✅ Works + removes embedding |
| delete_observations | ✅ Works | ✅ Works + regenerates embedding |
| delete_relations | ✅ Works | ✅ Works |
| search_nodes | ✅ Works | ✅ Works |
| open_nodes | ✅ Works | ✅ Works |
| migrate | ✅ Works | ✅ Works + generates embeddings |
| search_semantic | ❌ Error | ✅ Works |
| find_duplicate_observations | ❌ Error | ✅ Works |
| consolidation_report | ✅ Works | ✅ Works |
| end_relation | ✅ Works | ✅ Works |
| add_reflection | ✅ Works | ✅ Works + generates embedding |
| search_reflections | ❌ Error | ✅ Works |
When the model is not available, the three model-dependent tools return a clear error message instructing you to run the download script. All other tools function normally.
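A common way to implement that kind of degradation is lazy model resolution with an explicit error. A sketch of the pattern, not the actual implementation:

```python
from pathlib import Path

MODEL_DIR = Path.home() / ".cache" / "mcp-memory-v2" / "models"

def require_model() -> Path:
    """Return the ONNX model path, or fail with an actionable message."""
    model_path = MODEL_DIR / "model.onnx"
    if not model_path.exists():
        raise RuntimeError(
            "Embedding model not found. "
            "Run: uv run python scripts/download_model.py"
        )
    return model_path
```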
Next steps
- Architecture — understand the storage engine, embedding pipeline, and data flow
- Tools Reference — parameters, responses, and edge cases for all 19 tools
- Semantic Search — how vector search, hybrid retrieval, and Limbic Scoring work together
- Maintenance & Operations — deduplication, entity splitting, consolidation reports, and best practices
- Auto-tuning — optimize GAMMA and BETA_SAL via grid search