Y Combinator Winter 2026 startup Compresr has released Context Gateway, an open-source agentic proxy that compresses tool outputs before they enter an AI model's context window. The tool launched on Hacker News on March 13, 2026, and has since accumulated 307 GitHub stars. It addresses a critical problem in AI agent workflows: context bloat, which degrades model accuracy and increases costs.
The Context Problem Agents Face
AI agents struggle with context management: a single operation such as a file read or a grep command can inject thousands of tokens into the context window, most of them irrelevant noise. Long-context benchmarks reveal the problem's severity: OpenAI's GPT-5.4 evaluation shows accuracy dropping from 97.2% at 32,000 tokens to just 36.6% at 1 million tokens.
How Context Gateway Works
Context Gateway uses small language models (SLMs) to intelligently compress tool outputs based on the agent's intent. The system examines model internals and trains classifiers to identify which context portions carry the most signal. When an agent calls grep searching for error handling patterns, the SLM retains relevant matches while stripping unnecessary content.
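The idea can be sketched in a few lines. In this toy version, a keyword-overlap scorer stands in for the trained SLM classifier; `score_relevance` and `compress_tool_output` are illustrative names, not Context Gateway's actual API.

```python
# Hypothetical sketch of intent-conditioned compression: a scorer rates
# each line of a tool output against the agent's stated intent, and only
# high-signal lines survive. A real system would use a trained SLM here.

def score_relevance(line: str, intent: str) -> float:
    """Toy stand-in for the SLM: keyword overlap between line and intent."""
    intent_terms = set(intent.lower().split())
    line_terms = set(line.lower().split())
    return len(intent_terms & line_terms) / max(len(intent_terms), 1)

def compress_tool_output(output: str, intent: str, threshold: float = 0.3) -> str:
    """Keep only lines whose relevance to the intent clears the threshold."""
    kept = [ln for ln in output.splitlines()
            if score_relevance(ln, intent) >= threshold]
    return "\n".join(kept)

grep_output = """src/db.py:12: def connect():
src/db.py:45: except ConnectionError as e:  # error handling
src/ui.py:3: import tkinter
src/api.py:88: raise RuntimeError("error handling failed")"""

compressed = compress_tool_output(grep_output, "error handling patterns")
```

In this sketch the two lines mentioning error handling survive, while the unrelated import and function definition are stripped before reaching the main model.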
The proxy includes several key features:
- Intent-conditioned compression that preserves contextually relevant information
- An expand() function that retrieves original outputs if the model needs removed content
- Background compaction triggered at 85% window capacity
- Lazy-loading of tool descriptions, showing only relevant tools for current tasks
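Two of these features, the expand() escape hatch and the 85% compaction trigger, can be illustrated with a minimal sketch. All names here are assumptions for illustration, not the project's real internals.

```python
# Hypothetical sketch: the proxy stores each full tool output under a
# reference ID before compressing, so expand() can recover removed
# content later; compaction fires once window usage crosses 85%.

class OutputCache:
    def __init__(self):
        self._store: dict[str, str] = {}
        self._next_id = 0

    def put(self, full_output: str) -> str:
        """Store the uncompressed output and return a reference ID."""
        ref = f"ctx-{self._next_id}"
        self._next_id += 1
        self._store[ref] = full_output
        return ref

    def expand(self, ref: str) -> str:
        """Retrieve the original, uncompressed output by reference."""
        return self._store[ref]

def should_compact(used_tokens: int, window_tokens: int,
                   threshold: float = 0.85) -> bool:
    """Background compaction triggers at 85% of window capacity."""
    return used_tokens / window_tokens >= threshold

cache = OutputCache()
ref = cache.put("full grep output with thousands of tokens of matches...")
```

The compressed message handed to the model would carry the reference ID, so a follow-up expand() call can restore the stripped content on demand.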
Technical Approach and Performance
The system employs strided block-parallel sampling to generate multiple rollouts from nested prefixes concurrently. It batches feature extraction over these rollouts and uses the resulting embeddings for on-policy policy-gradient updates. The team connects the method theoretically to KL-regularized feature matching and energy-based modeling.
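The sampling scheme can be illustrated with a toy sketch: take prefixes of a sequence at a fixed stride and extend every prefix in one batch. The toy continue_fn stands in for language-model sampling, and all names are assumptions rather than Context Gateway internals.

```python
# Illustrative sketch of strided block-parallel sampling: rollouts are
# launched from nested prefixes (length stride, 2*stride, ...) and
# extended concurrently as one batch.

def nested_prefixes(tokens, stride):
    """Prefixes of length stride, 2*stride, ..., up to len(tokens)."""
    return [tokens[:k] for k in range(stride, len(tokens) + 1, stride)]

def rollout_batch(tokens, stride, continue_fn):
    """Extend every nested prefix in a single batched step."""
    return [p + continue_fn(p) for p in nested_prefixes(tokens, stride)]

def toy_continue(prefix):
    # Stand-in for LM sampling: append last token + 1.
    return [prefix[-1] + 1]

batch = rollout_batch([1, 2, 3, 4, 5, 6], 2, toy_continue)
```

A real implementation would run the extensions as one GPU batch and feed the rollouts' embeddings into the policy-gradient update; this sketch only shows the nested-prefix structure.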
Across question answering, structured and unstructured coding, and translation tasks, Context Gateway's energy-based fine-tuning approach matches RLVR performance and outperforms standard supervised fine-tuning on downstream accuracy, while also achieving lower validation cross-entropy.
Integration and Adoption
Context Gateway supports integration with Claude Code, Cursor IDE, OpenClaw, and custom configurations. Installation requires a single command: curl -fsSL https://compresr.ai/api/install | sh. The tool includes spending caps, a dashboard for tracking sessions, and Slack notifications when agents await user input.
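The install command from the article can be run as-is; how an agent is then pointed at the proxy is not documented in this article, so the environment variable and port below are assumptions for illustration only.

```shell
# Install Context Gateway (command from the project's docs):
curl -fsSL https://compresr.ai/api/install | sh

# Hypothetical: route an agent's API traffic through the local proxy by
# overriding its base URL. Variable name and port are assumed values.
export ANTHROPIC_BASE_URL="http://localhost:8080"
```

Routing through a local base URL is a common pattern for LLM proxies, since it requires no changes to the agent itself; consult the project's own docs for the supported configuration.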
The project has attracted an active Discord community and shows 52 commits across 12 releases from 7 contributors. Its 29 forks indicate developer interest in adapting the technology for specific use cases.
Key Takeaways
- Context Gateway uses small language models to compress agent tool outputs before they reach the main LLM, reducing context bloat by up to 90%
- OpenAI's GPT-5.4 accuracy drops from 97.2% to 36.6% as context grows from 32k to 1M tokens, demonstrating the severity of the context management problem
- The system performs intent-conditioned compression, preserving only context relevant to the agent's specific tool call purpose
- Context Gateway integrates with Claude Code, Cursor IDE, and OpenClaw, with 307 GitHub stars since its March 2026 launch
- The tool includes background compaction at 85% window capacity and an expand() function to retrieve original outputs when needed