A new open-source memory system for AI coding assistants has gained attention for backing its performance claims with comprehensive, reproducible benchmarks. iai-mcp is a local server implementing the Model Context Protocol (MCP) that gives Claude and other AI assistants long-term memory across conversations, automatically capturing context and serving relevant memories at the start of new sessions.
The Problem: AI Assistants Lack Persistent Memory
Current AI coding assistants require users to repeatedly provide context or explicitly request information from previous conversations. iai-mcp addresses this limitation through ambient capture and intelligent retrieval, eliminating manual context management. The system runs as a local Python daemon communicating via Unix sockets, with embeddings computed locally using bge-small-en-v1.5.
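The daemon-over-Unix-socket design can be sketched in a few lines. This is a minimal illustration, not iai-mcp's actual wire protocol: the socket path, the JSON request shape, and the `toy_daemon` handler are all assumptions made for the example.

```python
import json
import os
import socket
import tempfile
import threading

# Hypothetical socket path and message format -- the real daemon's
# protocol is not documented here; this only sketches the round trip.
SOCKET_PATH = os.path.join(tempfile.mkdtemp(), "iai.sock")

def toy_daemon(server: socket.socket) -> None:
    """Accept one connection and answer a single JSON request."""
    conn, _ = server.accept()
    with conn:
        request = json.loads(conn.recv(4096).decode())
        # A real daemon would run local embedding search here.
        reply = {"memories": [], "query": request.get("query")}
        conn.sendall(json.dumps(reply).encode())

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(SOCKET_PATH)
server.listen(1)
threading.Thread(target=toy_daemon, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(SOCKET_PATH)
client.sendall(json.dumps({"query": "last session context"}).encode())
response = json.loads(client.recv(4096).decode())
client.close()
print(response["query"])
```

Keeping the transport on a local Unix socket, as iai-mcp does, avoids exposing a network port and keeps all memory traffic on the machine.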
Performance Benchmarks Include Reproducible Test Harnesses
What distinguishes iai-mcp from other memory solutions is its emphasis on verifiable performance claims. The creator included comprehensive benchmarks with reproducible test harnesses in the repository, following the principle that "every claim ships with the harness that proves it." Key performance metrics include:
- Verbatim recall: ≥99% accuracy at 10,000 records
- Latency: p95 under 100ms for memory retrieval
- Memory footprint: 150-300MB RAM at steady state
- Encryption: AES-256-GCM at rest
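A p95 latency check like the one above is easy to reproduce in miniature. The sketch below is not iai-mcp's harness: it uses a plain Python dict as a stand-in for the memory index, with the 10,000-record scale taken from the benchmark description.

```python
import random
import statistics
import time

# Toy in-memory store standing in for the real memory index;
# 10,000 records mirrors the benchmark scale quoted above.
store = {i: f"memory record {i}" for i in range(10_000)}

def retrieve(key: int) -> str:
    """Stand-in for a real retrieval call against the index."""
    return store[key]

latencies_ms = []
for _ in range(1_000):
    key = random.randrange(len(store))
    start = time.perf_counter()
    retrieve(key)
    latencies_ms.append((time.perf_counter() - start) * 1_000)

# statistics.quantiles with n=20 yields 19 cut points; the last is p95.
p95 = statistics.quantiles(latencies_ms, n=20)[-1]
print(f"p95 retrieval latency: {p95:.3f} ms")
```

Publishing the harness alongside the number, as the repository does, lets anyone rerun the measurement on their own hardware rather than trusting the headline figure.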
Three-Tier Memory Architecture Balances Detail and Efficiency
The system implements three distinct memory tiers to optimize both accuracy and performance. The episodic tier stores verbatim, timestamped conversation fragments. The semantic tier contains AI-generated summaries from clustered episodes. The procedural tier maintains learned parameters about user preferences and working patterns.
This architectural approach allows the system to serve relevant context quickly while maintaining detailed records of past interactions. All processing happens locally, ensuring privacy and reducing latency compared to cloud-based solutions.
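The three tiers described above can be modeled with simple record types. This is an illustrative sketch only: the class and field names are assumptions for the example, not iai-mcp's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical data model for the three memory tiers; field names
# are assumptions, not iai-mcp's real storage schema.

@dataclass
class EpisodicRecord:
    """Verbatim, timestamped conversation fragment."""
    text: str
    timestamp: datetime

@dataclass
class SemanticSummary:
    """AI-generated summary over a cluster of episodes."""
    summary: str
    episode_ids: list[int] = field(default_factory=list)

@dataclass
class ProceduralProfile:
    """Learned parameters about user preferences and patterns."""
    preferences: dict[str, str] = field(default_factory=dict)

episode = EpisodicRecord("User prefers pytest over unittest",
                         datetime.now(timezone.utc))
summary = SemanticSummary("Testing preferences", episode_ids=[0])
profile = ProceduralProfile(preferences={"test_framework": "pytest"})
print(profile.preferences["test_framework"])
```

Separating the tiers this way lets retrieval serve a compact summary or learned preference first and fall back to verbatim episodes only when the detail is needed.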
Key Takeaways
- iai-mcp provides persistent memory for AI coding assistants through automatic context capture and retrieval
- The system achieves ≥99% verbatim recall accuracy at 10,000 records with p95 latency under 100ms
- Three-tier memory architecture (episodic, semantic, procedural) balances detail retention with retrieval efficiency
- All processing runs locally with AES-256-GCM encryption, ensuring privacy without cloud dependencies
- Comprehensive benchmarks with reproducible test harnesses validate all performance claims