Researchers at Shanghai Jiao Tong University released OpenSeeker-v2 on May 5, 2026, achieving state-of-the-art performance across four benchmarks using only supervised fine-tuning. The 30B-parameter agent reached 46.0% on BrowseComp, 58.1% on BrowseComp-ZH, 34.6% on Humanity's Last Exam, and 78.0% on xbench, surpassing Tongyi DeepResearch despite training on just 10,600 data points.
SFT-Only Approach Challenges Industry Resource-Intensive Methods
The research demonstrates that supervised fine-tuning alone can match or exceed systems trained with extensive continual pre-training and reinforcement learning pipelines. OpenSeeker-v2 outperformed Tongyi DeepResearch, which scored 43.4% on BrowseComp, 46.7% on BrowseComp-ZH, 32.9% on Humanity's Last Exam, and 75.0% on xbench despite its heavier CPT+SFT+RL training pipeline. The team's approach suggests expensive reinforcement learning may not be necessary for competitive search agent performance.
Data Synthesis Modifications Drive Performance Gains
The researchers introduced three key modifications to their training data synthesis: scaling up knowledge graph size for richer exploration paths, expanding the tool set for broader functionality, and applying strict low-step filtering, which discards trajectories that finish in only a few steps so the dataset stays focused on harder problems. These modifications enabled the model to learn effective search strategies from a relatively small dataset of 10,600 informative, high-difficulty trajectories.
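To make the filtering idea concrete, here is a minimal Python sketch of strict low-step filtering. It assumes each synthesized trajectory records its step count and whether the final answer was correct; the `Trajectory` fields, the `filter_low_step` helper, and the `min_steps` threshold are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    question: str
    steps: int            # number of ReAct tool-call steps taken (assumed field)
    answer_correct: bool  # whether the trajectory reached the right answer

def filter_low_step(trajectories: list[Trajectory], min_steps: int = 5) -> list[Trajectory]:
    """Keep only correct trajectories that needed at least `min_steps`
    tool calls, discarding easy, short solutions. The threshold of 5
    is a hypothetical choice, not a value from the paper."""
    return [t for t in trajectories if t.answer_correct and t.steps >= min_steps]

# Usage: retain only hard, multi-step trajectories for SFT.
data = [
    Trajectory("Single-lookup fact", steps=2, answer_correct=True),       # too easy: dropped
    Trajectory("Multi-hop chain question", steps=9, answer_correct=True), # kept
    Trajectory("Hard but unsolved", steps=12, answer_correct=False),      # wrong answer: dropped
]
hard_sft_data = filter_low_step(data)
assert len(hard_sft_data) == 1
```

A step-count threshold like this is one simple way to bias a small corpus toward informative, high-difficulty trajectories, consistent with the 10,600-example dataset described above.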
Academic Achievement Makes Frontier Research More Accessible
OpenSeeker-v2 represents the first state-of-the-art search agent within its model scale and ReAct paradigm to be developed by a purely academic team. The researchers open-sourced both model weights and their findings, making frontier search agent research accessible beyond organizations with massive computational resources. The paper was authored by Yuwen Du, Rui Ye, Shuo Tang, Keduan Huang, Xinyu Zhu, Yuzhu Cai, and Siheng Chen.
Key Takeaways
- Shanghai Jiao Tong University achieved state-of-the-art search agent performance using only supervised fine-tuning on 10,600 data points
- OpenSeeker-v2 outperformed Tongyi DeepResearch across all four benchmarks, reaching 46.0% on BrowseComp and 78.0% on xbench
- The research challenges the industry assumption that expensive continual pre-training and reinforcement learning are necessary for competitive performance
- OpenSeeker-v2 is the first SOTA search agent in its class developed by a purely academic team
- Model weights and research findings were open-sourced to democratize frontier search agent research