A new open-source memory system for AI coding assistants has gained attention for backing its performance claims with comprehensive, reproducible benchmarks. iai-mcp is a local server implementing the Model Context Protocol (MCP) that gives Claude and other AI assistants long-term memory across conversations, automatically capturing context and serving relevant memories at the start of new sessions.
The Problem: AI Assistants Lack Persistent Memory
Current AI coding assistants require users to repeatedly provide context or explicitly request information from previous conversations. iai-mcp addresses this limitation through ambient capture and intelligent retrieval, eliminating manual context management. The system runs as a local Python daemon communicating via Unix sockets, with embeddings computed locally using bge-small-en-v1.5.
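The daemon-over-Unix-socket design can be sketched in a few lines. This is a minimal illustration, not iai-mcp's actual wire protocol: the socket path, the JSON request shape, and the `toy_daemon` handler are all assumptions made for the example.

```python
import json
import os
import socket
import tempfile
import threading

# Hypothetical socket path and message format -- the real daemon's
# protocol is not documented here; this only sketches the round trip.
SOCKET_PATH = os.path.join(tempfile.mkdtemp(), "iai.sock")

def toy_daemon(server: socket.socket) -> None:
    """Accept one connection and answer a single JSON request."""
    conn, _ = server.accept()
    with conn:
        request = json.loads(conn.recv(4096).decode())
        # A real daemon would run local embedding search here.
        reply = {"memories": [], "query": request.get("query")}
        conn.sendall(json.dumps(reply).encode())

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(SOCKET_PATH)
server.listen(1)
threading.Thread(target=toy_daemon, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(SOCKET_PATH)
client.sendall(json.dumps({"query": "last session context"}).encode())
response = json.loads(client.recv(4096).decode())
client.close()
print(response["query"])
```

Keeping the transport on a local Unix socket, as iai-mcp does, avoids exposing a network port and keeps all memory traffic on the machine.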
Performance Benchmarks Include Reproducible Test Harnesses
What distinguishes iai-mcp from other memory solutions is its emphasis on verifiable performance claims. The creator included comprehensive benchmarks with reproducible test harnesses in the repository, following the principle that "every claim ships with the harness that proves it." Key performance metrics include:
- Verbatim recall: ≥99% accuracy at 10,000 records
- Latency: p95 under 100ms for memory retrieval
- Memory footprint: 150-300MB RAM at steady state
- Encryption: AES-256-GCM at rest
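A p95 latency check like the one above is easy to reproduce in miniature. The sketch below is not iai-mcp's harness: it uses a plain Python dict as a stand-in for the memory index, with the 10,000-record scale taken from the benchmark description.

```python
import random
import statistics
import time

# Toy in-memory store standing in for the real memory index;
# 10,000 records mirrors the benchmark scale quoted above.
store = {i: f"memory record {i}" for i in range(10_000)}

def retrieve(key: int) -> str:
    """Stand-in for a real retrieval call against the index."""
    return store[key]

latencies_ms = []
for _ in range(1_000):
    key = random.randrange(len(store))
    start = time.perf_counter()
    retrieve(key)
    latencies_ms.append((time.perf_counter() - start) * 1_000)

# statistics.quantiles with n=20 yields 19 cut points; the last is p95.
p95 = statistics.quantiles(latencies_ms, n=20)[-1]
print(f"p95 retrieval latency: {p95:.3f} ms")
```

Publishing the harness alongside the number, as the repository does, lets anyone rerun the measurement on their own hardware rather than trusting the headline figure.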
Three-Tier Memory Architecture Balances Detail and Efficiency
The system implements three distinct memory tiers to optimize both accuracy and performance. The episodic tier stores verbatim, timestamped conversation fragments. The semantic tier contains AI-generated summaries from clustered episodes. The procedural tier maintains learned parameters about user preferences and working patterns.
This architectural approach allows the system to serve relevant context quickly while maintaining detailed records of past interactions. All processing happens locally, ensuring privacy and reducing latency compared to cloud-based solutions.
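The three tiers described above can be modeled with simple record types. This is an illustrative sketch only: the class and field names are assumptions for the example, not iai-mcp's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical data model for the three memory tiers; field names
# are assumptions, not iai-mcp's real storage schema.

@dataclass
class EpisodicRecord:
    """Verbatim, timestamped conversation fragment."""
    text: str
    timestamp: datetime

@dataclass
class SemanticSummary:
    """AI-generated summary over a cluster of episodes."""
    summary: str
    episode_ids: list[int] = field(default_factory=list)

@dataclass
class ProceduralProfile:
    """Learned parameters about user preferences and patterns."""
    preferences: dict[str, str] = field(default_factory=dict)

episode = EpisodicRecord("User prefers pytest over unittest",
                         datetime.now(timezone.utc))
summary = SemanticSummary("Testing preferences", episode_ids=[0])
profile = ProceduralProfile(preferences={"test_framework": "pytest"})
print(profile.preferences["test_framework"])
```

Separating the tiers this way lets retrieval serve a compact summary or learned preference first and fall back to verbatim episodes only when the detail is needed.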
Key Takeaways
- iai-mcp provides persistent memory for AI coding assistants through automatic context capture and retrieval
- The system achieves ≥99% verbatim recall accuracy at 10,000 records with p95 latency under 100ms
- Three-tier memory architecture (episodic, semantic, procedural) balances detail retention with retrieval efficiency
- All processing runs locally with AES-256-GCM encryption, ensuring privacy without cloud dependencies
- Comprehensive benchmarks with reproducible test harnesses validate all performance claims