Researchers published SearchSwarm on arXiv on June 8, 2026, presenting a scalable alternative to expanding context windows for long-horizon tasks. The approach trains a 30B-parameter model with "delegation intelligence": the ability to decompose complex research tasks, decide what to delegate to subagents, and integrate their summarized results. The model posts the best reported BrowseComp and BrowseComp-ZH scores among models of comparable scale.
Context Windows Cannot Accommodate Unbounded Research Tasks
The paper addresses a fundamental constraint: a finite context window cannot hold the unbounded task context that deep research generates. Rather than relying on ever-larger context windows, SearchSwarm shows that a model can learn to manage its context by delegating work to subagents and keeping only their summarized results.
The challenge lies in training data scarcity. Delegation intelligence (knowing what to decompose, when to delegate, and how to integrate results) rarely appears in naturally occurring text, so standard training corpora offer little signal for learning it.
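To make the context-saving argument concrete, here is a minimal sketch of the delegation pattern. It is not the paper's implementation: `Orchestrator`, `run_subagent`, and `fake_subagent` are hypothetical names invented for illustration, and the subagent is a stub that only pretends to read pages.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Orchestrator:
    """Hypothetical lead agent whose context holds only subagent summaries."""

    run_subagent: Callable[[str], str]  # maps a subtask to a short summary
    context: List[str] = field(default_factory=list)

    def research(self, task: str, subtasks: List[str]) -> List[str]:
        self.context.append(f"TASK: {task}")
        for subtask in subtasks:
            # The subagent's full working history stays in its own context;
            # only the returned summary is appended here.
            summary = self.run_subagent(subtask)
            self.context.append(f"RESULT[{subtask}]: {summary}")
        return self.context


def fake_subagent(subtask: str) -> str:
    """Stand-in for a search agent: 'reads' 50 pages, returns one line."""
    pages = [f"page {i} about {subtask}" for i in range(50)]
    return f"summary of {len(pages)} pages on {subtask}"


if __name__ == "__main__":
    lead = Orchestrator(run_subagent=fake_subagent)
    for entry in lead.research("compare two products", ["product A", "product B"]):
        print(entry)
```

In this sketch the lead agent's context grows by one line per subtask, regardless of how many pages each subagent processes. The hard part the paper targets, choosing the subtasks in the first place, is passed in by hand here.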
Specialized Training Harness Generates Delegation Trajectories
The researchers designed a training harness that guides effective delegation and captures these decisions as supervised fine-tuning data:
- Guides task decomposition: Directs models toward effective task breakdown strategies
- Structures delegation: Constrains subagents to return properly formatted results
- Encodes decisions: Captures correct delegation choices in trajectory data
- Internalizes capability: Uses trajectories to train delegation intelligence into model weights
Because the harness steers each rollout, the resulting trajectories already contain correct delegation decisions. Fine-tuning on them moves the capability out of the harness and into the model itself.
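The steps above can be sketched as a small recording loop. This is an illustrative sketch under stated assumptions, not the authors' harness: the `REQUIRED_KEYS` schema, the `DelegationStep` record, and the JSON layout produced by `to_sft_record` are all hypothetical, and the `decompose` and `subagent` callables stand in for model calls.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List

# Hypothetical schema that every subagent reply must satisfy.
REQUIRED_KEYS = {"answer", "evidence"}


@dataclass
class DelegationStep:
    subtask: str            # what the orchestrator chose to delegate
    result: Dict[str, str]  # the subagent's structured reply


def run_harness(
    task: str,
    decompose: Callable[[str], List[str]],
    subagent: Callable[[str], Dict[str, str]],
) -> List[DelegationStep]:
    """Run one guided rollout and record each delegation decision."""
    steps: List[DelegationStep] = []
    for subtask in decompose(task):
        result = subagent(subtask)
        missing = REQUIRED_KEYS - set(result)
        if missing:
            # Reject malformed replies so they never enter the training data.
            raise ValueError(f"subagent reply missing keys: {sorted(missing)}")
        steps.append(DelegationStep(subtask=subtask, result=result))
    return steps


def to_sft_record(task: str, steps: List[DelegationStep]) -> str:
    """Serialize one trajectory as a JSON line for supervised fine-tuning."""
    return json.dumps({"prompt": task, "trajectory": [asdict(s) for s in steps]})
```

Each call to `to_sft_record` yields one line of a JSONL file. Validating the reply format before recording mirrors the "structures delegation" bullet: a reply that does not match the schema is rejected rather than stored as a training example.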
Best Results Among Comparable-Scale Models Without Robot-Data Pretraining
SearchSwarm achieved 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, the best results among models of comparable scale. Notably, these results required no robot-data pretraining: the delegation intelligence came entirely from the harness-guided fine-tuning approach.
The researchers commit to releasing the training data, model weights, and harness to support future research. The work targets deep research as a representative long-horizon agent task, and the authors present the delegation approach as applicable to other complex real-world scenarios that require extended reasoning.
Key Takeaways
- SearchSwarm achieves 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, best among comparable-scale models
- The approach trains delegation intelligence through specialized harness-guided trajectories rather than expanding context windows
- A custom training harness guides task decomposition, structures delegation, and captures correct decisions as fine-tuning data
- The 30B-parameter model required no robot-data pretraining to achieve best-in-class results
- Training data, model weights, and the harness will be released to support future research on long-horizon agent tasks