Andrej Karpathy released autoresearch on March 7, 2026, a framework that allows AI agents to autonomously improve machine learning training code. The system strips down LLM training to a single-file, 630-line Python codebase where humans write high-level strategy prompts while AI agents iteratively modify the training code based on performance metrics.
How Autoresearch Works
The framework operates on a simple division of labor: humans maintain a program.md file with research strategy, while an AI agent modifies train.py based on validation loss results. The system runs on a single GPU and can perform hundreds of experiments overnight, proposing changes and keeping improvements that reduce validation loss.
Karpathy's implementation includes:
- Single-file architecture requiring only one GPU
- Autonomous experiment execution with automatic metric tracking
- Systematic testing of architectural changes, optimization tweaks, and hyperparameter adjustments
- Ability to run continuously for days without human intervention
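The propose, evaluate, and keep-if-better cycle described above can be sketched as a simple greedy loop. This is a minimal illustration, not Karpathy's code: `run_search`, `toy_evaluate`, and `make_toy_proposer` are hypothetical names, the "agent" is a random perturbation of a learning rate, and the "training run" is a toy formula standing in for a real validation-loss measurement.

```python
import math
import random
from typing import Callable


def run_search(
    propose: Callable[[dict], dict],
    evaluate: Callable[[dict], float],
    baseline: dict,
    n_experiments: int,
) -> tuple[dict, float, list]:
    """Greedy loop: keep a proposed change only if validation loss drops."""
    best_config = dict(baseline)
    best_loss = evaluate(best_config)
    history = []  # (candidate, loss, kept) for every experiment
    for _ in range(n_experiments):
        candidate = propose(best_config)   # in autoresearch, the agent edits train.py
        loss = evaluate(candidate)         # in autoresearch, a real training run
        kept = loss < best_loss
        if kept:
            best_config, best_loss = candidate, loss
        history.append((candidate, loss, kept))
    return best_config, best_loss, history


def toy_evaluate(config: dict) -> float:
    # Hypothetical stand-in for training: loss is lowest near lr = 3e-4.
    return 2.0 + (math.log10(config["lr"]) - math.log10(3e-4)) ** 2


def make_toy_proposer(seed: int) -> Callable[[dict], dict]:
    # Hypothetical stand-in for the agent: randomly rescale the learning rate.
    rng = random.Random(seed)

    def propose(config: dict) -> dict:
        candidate = dict(config)
        candidate["lr"] = config["lr"] * rng.choice([0.5, 0.8, 1.25, 2.0])
        return candidate

    return propose


best_config, best_loss, history = run_search(
    make_toy_proposer(0), toy_evaluate, {"lr": 1e-2}, n_experiments=50
)
print(f"kept {sum(kept for _, _, kept in history)} of {len(history)} changes")
print(f"best lr={best_config['lr']:.2e}, loss={best_loss:.4f}")
```

Because a change is kept only when it strictly lowers the loss, the best loss never gets worse as more experiments run, which is what makes an unattended overnight run safe to leave alone.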
Real-World Results Demonstrate Scalability
In his own testing, Karpathy left autoresearch running for approximately two days on a depth=12 model. The system discovered around 20 changes that improved validation loss. When tested on March 9, 2026, all of these improvements proved additive and transferred to larger depth=24 models, suggesting that discoveries made on smaller models can carry over to larger ones.
Shopify CEO Tobi Lütke confirmed running autoresearch internally with positive results, an early sign of enterprise interest.
Community Adoption and Extensions
The GitHub repository gained 26,237 stars within six days of its March 6, 2026 creation. The community quickly developed extensions, including autoresearch-mlx with 537 stars, which enables autonomous research on Apple Silicon Macs.
One user reported running autoresearch for over 11 hours with an AI agent positioned as "chief scientist of an AI lab with 8 GPUs," completing 568 parallel experiments with the agent autonomously deciding next steps.
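A multi-GPU setup like this one can be thought of as running experiments in rounds: dispatch one candidate per worker, collect the losses, then decide what to try next. The sketch below illustrates that pattern only. It is not the reported user's setup: `run_batch`, `search_in_rounds`, and the toy functions are hypothetical names, threads stand in for GPUs, and a toy formula stands in for a training run.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_batch(evaluate: Callable[[dict], float], candidates: list, n_workers: int = 8) -> list:
    """Evaluate candidates concurrently; each worker thread stands in for one GPU."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        losses = list(pool.map(evaluate, candidates))  # map preserves input order
    return list(zip(candidates, losses))


def search_in_rounds(evaluate, propose_batch, baseline: dict, n_rounds: int, n_workers: int = 8):
    """Each round: propose one candidate per worker, keep the best if it improves."""
    best_config = dict(baseline)
    best_loss = evaluate(best_config)
    total_experiments = 0
    for _ in range(n_rounds):
        candidates = propose_batch(best_config, n_workers)
        results = run_batch(evaluate, candidates, n_workers)
        total_experiments += len(results)
        config, loss = min(results, key=lambda r: r[1])
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss, total_experiments


def toy_evaluate(config: dict) -> float:
    # Hypothetical stand-in for training: loss is lowest at width 768.
    return 2.0 + ((config["width"] - 768) / 768) ** 2


def toy_propose_batch(config: dict, n: int) -> list:
    # Hypothetical stand-in for the agent: spread n widths around the current best.
    steps = [64 * (i - n // 2) for i in range(n)]
    return [{"width": max(64, config["width"] + step)} for step in steps]


best_config, best_loss, total = search_in_rounds(
    toy_evaluate, toy_propose_batch, {"width": 256}, n_rounds=3
)
print(f"{total} experiments, best width={best_config['width']}, loss={best_loss:.4f}")
```

In a real deployment each worker would launch a separate training process pinned to its own GPU, and the proposal step would be the agent's decision rather than a fixed grid.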
Vision for Collaborative AI Research
Karpathy outlined his next-step vision on March 8, 2026: making autoresearch "asynchronously massively collaborative for agents" in a SETI@home-style distributed system. The goal shifts from emulating a single PhD student to emulating an entire research community, suggesting a future where AI agents collectively advance machine learning research.
Key Takeaways
- Autoresearch is a 630-line Python framework that lets AI agents autonomously improve ML training code on a single GPU
- Karpathy's testing found around 20 improvements over two days, all of which transferred to larger models
- The GitHub repository gained 26,237 stars in six days, with community ports already enabling Mac-based research
- Shopify CEO Tobi Lütke confirmed running autoresearch internally with positive results, an early sign of enterprise interest
- Karpathy envisions scaling to a distributed network of collaborative AI research agents