SearchSwarm Trains 30B Model to Delegate Research Tasks Instead of Expanding Context

Researchers published SearchSwarm on arXiv on June 8, 2026, demonstrating a scalable alternative to expanding context windows for long-horizon tasks. The approach trains a 30B parameter model with "delegation intelligence"—the ability to decompose complex research tasks, determine what to delegate to subagents, and integrate summarized results—achieving best-in-class performance among comparable-scale models.

Context Windows Cannot Accommodate Unbounded Research Tasks

The paper addresses a fundamental constraint: a finite context window cannot hold the unbounded context that deep research generates. Rather than relying on ever-larger context windows, SearchSwarm shows that a model can learn to manage context through delegation, handing context-heavy subtasks to subagents and keeping only their summarized results.
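The effect on the orchestrator's context can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the word-count token proxy, the `Context` class, the budget of 200, the placeholder summarizer, and the synthetic corpus. It only shows why returning summaries instead of raw pages keeps the main context bounded.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    """Crude token proxy (assumption): whitespace-separated words."""
    return len(text.split())


@dataclass
class Context:
    """A context with a hard token budget."""
    budget: int
    entries: list[str] = field(default_factory=list)

    @property
    def used(self) -> int:
        return sum(count_tokens(e) for e in self.entries)

    def append(self, text: str) -> None:
        if self.used + count_tokens(text) > self.budget:
            raise OverflowError("context budget exceeded")
        self.entries.append(text)


def run_subagent(subtask: str, pages: list[str], budget: int) -> str:
    """Hypothetical subagent: reads pages in its OWN context, returns a summary."""
    ctx = Context(budget)
    ctx.append(subtask)
    for page in pages:
        ctx.append(page)
    # Placeholder summarizer (assumption): keep the first five words of each page.
    return f"{subtask}: " + "; ".join(" ".join(p.split()[:5]) for p in pages)


def research_directly(question: str, corpus: dict[str, list[str]], budget: int) -> Context:
    """Baseline: the main model reads every page itself."""
    ctx = Context(budget)
    ctx.append(question)
    for pages in corpus.values():
        for page in pages:
            ctx.append(page)
    return ctx


def research_with_delegation(question: str, corpus: dict[str, list[str]], budget: int) -> Context:
    """Delegation: the main context stores only one summary per subtask."""
    ctx = Context(budget)
    ctx.append(question)
    for subtask, pages in corpus.items():
        ctx.append(run_subagent(subtask, pages, budget))
    return ctx


# Synthetic data: three subtasks, two 60-word pages each (360 words in total).
BUDGET = 200
QUESTION = "When and where was the example organization founded, and by whom?"
PAGE = " ".join(["fact"] * 60)
CORPUS = {
    "founding year": [PAGE, PAGE],
    "founding location": [PAGE, PAGE],
    "founder name": [PAGE, PAGE],
}
```

With these numbers, `research_directly(QUESTION, CORPUS, BUDGET)` raises `OverflowError`, while `research_with_delegation(QUESTION, CORPUS, BUDGET)` completes because each subagent's reading stays inside its own budget.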

The challenge lies in training data scarcity. Delegation intelligence—knowing what to decompose, when to delegate, and how to integrate results—rarely appears in naturally occurring text, making it difficult to train through standard methods.

Specialized Training Harness Generates Delegation Trajectories

The researchers designed a training harness that steers the model toward effective delegation and captures the resulting decisions as supervised fine-tuning data:

Guides task decomposition: Directs models toward effective task breakdown strategies
Structures delegation: Constrains subagents to return properly formatted results
Encodes decisions: Captures correct delegation choices in trajectory data
Internalizes capability: Uses trajectories to train delegation intelligence into model weights

The harness-guided trajectories naturally encode correct delegation decisions, which serve as supervised fine-tuning data to internalize this capability into the model itself.
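As a rough illustration of how such trajectories might become fine-tuning data, here is a minimal sketch. The result schema (`answer`, `evidence`), the `Trajectory` class, the action names, and the prompt/completion record layout are all hypothetical and are not taken from the paper or its harness. The sample task and result are made-up placeholders.

```python
import json
from dataclasses import dataclass, field

# Assumed result schema for subagents; the paper's actual format is not given here.
REQUIRED_KEYS = {"answer", "evidence"}


def validate_result(raw: str) -> dict:
    """Reject subagent output that is not a JSON object with the assumed keys."""
    result = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(result, dict):
        raise ValueError("subagent result must be a JSON object")
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"subagent result missing keys: {sorted(missing)}")
    return result


@dataclass
class Trajectory:
    """Ordered log of orchestrator decisions and subagent results for one task."""
    task: str
    steps: list[dict] = field(default_factory=list)

    def record(self, role: str, action: str, content) -> None:
        # role is "model" for orchestrator decisions, "tool" for subagent results.
        self.steps.append({"role": role, "action": action, "content": content})

    def to_sft_examples(self) -> list[dict]:
        """One example per model decision: history so far -> next decision."""
        examples = []
        for i, step in enumerate(self.steps):
            if step["role"] != "model":
                continue  # subagent results are inputs, never training targets
            examples.append({
                "prompt": json.dumps({"task": self.task, "history": self.steps[:i]}),
                "completion": json.dumps(step),
            })
        return examples


# Placeholder trajectory: decompose, delegate, receive a result, integrate.
traj = Trajectory(task="Who founded the example organization?")
traj.record("model", "decompose", ["founder name", "founding year"])
traj.record("model", "delegate", "founder name")
traj.record("tool", "result", validate_result('{"answer": "A. Person", "evidence": "page 1"}'))
traj.record("model", "integrate", "The example organization was founded by A. Person.")
examples = traj.to_sft_examples()
```

Only the orchestrator's own decisions become completions in this sketch. Subagent results appear in the prompt history, which mirrors the idea that the model learns what to delegate and how to integrate what comes back.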

Best Results Among Comparable-Scale Models Without Pretraining

SearchSwarm scored 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, the best results among models of comparable scale. Notably, these results required no robot-data pretraining; the delegation intelligence came entirely from the harness-guided fine-tuning.

The researchers commit to releasing training data, model weights, and their harness to support future research. The work targets deep research as a representative long-horizon agent task, but the delegation approach applies broadly to complex real-world scenarios requiring extended reasoning.

Key Takeaways

SearchSwarm achieves 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, best among comparable-scale models
The approach trains delegation intelligence through specialized harness-guided trajectories rather than expanding context windows
A custom training harness guides task decomposition, structures delegation, and captures correct decisions as fine-tuning data
The 30B parameter model required no robot-data pretraining to achieve best-in-class results
Training data, model weights, and the harness will be released to support future research on long-horizon agent tasks
