Between late March and early April 2026, four Chinese AI labs released open-weight coding models within a 12-day window, all reaching similar performance on agentic engineering benchmarks at significantly lower inference costs than Western frontier models. The near-simultaneous releases from Zhipu AI, Moonshot, DeepSeek, and MiniMax mark a major milestone in AI sovereignty: each of the major models was built entirely without Nvidia hardware.
GLM-5.1 Achieves Top SWE-bench Pro Performance on Huawei Chips
Zhipu AI's GLM-5.1, a 744-billion-parameter Mixture-of-Experts model with 40 billion active parameters, topped the SWE-bench Pro benchmark. The model was trained entirely on 100,000 Huawei Ascend 910B chips, with no dependency on Nvidia hardware at any stage. GLM-5.1 can sustain autonomous task execution for up to 8 continuous hours without performance degradation, a capability no Western closed-source model has publicly matched. It also achieved Tier A status (87/100) on real-world coding benchmarks, the only Chinese model to reach that tier.
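The 744B-total, 40B-active split follows the standard sparse Mixture-of-Experts pattern, in which a learned router activates only a few expert sub-networks per token. The sketch below is a minimal, generic illustration of top-k expert routing in PyTorch; the layer sizes, expert count, and top_k value are illustrative assumptions, not GLM-5.1's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k Mixture-of-Experts layer (illustrative only).

    With num_experts=16 and top_k=2, each token touches 2/16 of the
    expert parameters -- the same principle that lets a 744B-parameter
    model run with only 40B parameters active per token.
    """
    def __init__(self, d_model=512, d_ff=2048, num_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # learned router
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                               # x: (tokens, d_model)
        weights, idx = self.gate(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)            # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                  # dispatch token groups
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out
```

At the reported scale, the active fraction is 40/744, roughly 5.4%, which is where most of the claimed inference-cost advantage comes from.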
Kimi K2.6 Beats GPT-5.4 With Trillion-Parameter Architecture
Moonshot's Kimi K2.6 emerged as the surprise of April 2026, outperforming GPT-5.4 on SWE-bench Pro despite being priced at just $0.60 per million output tokens. The 1-trillion-parameter vision-language model performs competitively with Qwen3.6 Max Preview and DeepSeek V4. Kimi K2.6 is designed to generate code in a plan-write-test-debug loop that can extend for days, and can instantiate hundreds of collaborative agents working on a single task simultaneously.
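The plan-write-test-debug loop described here is a common agentic coding pattern: draft a plan, generate a patch, run the test suite, and feed failures back into the next attempt. The sketch below shows the bare control flow in Python; call_model, the prompts, and the single-file patch are hypothetical placeholders, not Moonshot's API or agent stack.

```python
import subprocess
from pathlib import Path

def call_model(prompt: str) -> str:
    """Hypothetical LLM call; Kimi K2.6's real agent interface is not public."""
    raise NotImplementedError

def run_tests(repo: Path) -> tuple[bool, str]:
    """Run the project's test suite and capture its output."""
    proc = subprocess.run(["pytest", "-q"], cwd=repo,
                          capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def plan_write_test_debug(task: str, repo: Path, max_iters: int = 10) -> bool:
    plan = call_model(f"Write a step-by-step plan for: {task}")
    feedback = ""
    for _ in range(max_iters):
        patch = call_model(
            f"Plan:\n{plan}\n\nTask: {task}\n"
            f"Test output from the last attempt:\n{feedback}"
        )
        (repo / "solution.py").write_text(patch)  # toy: real agents edit many files
        passed, feedback = run_tests(repo)
        if passed:
            return True    # tests green: task done
    return False           # budget exhausted; a long-horizon agent would keep going
```

Claims like multi-day runs and hundreds of parallel agents amount to scaling this loop out: longer iteration budgets, persistent state, and many such loops sharing a single task queue.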
DeepSeek V4 Introduces Hybrid Attention Architecture at $0.14 Per Million Tokens
DeepSeek V4 competes with Claude Opus 4.6 and GPT-5.4 on coding benchmarks while its cheapest variant costs just $0.14 per million input tokens, a 67% discount relative to Western frontier pricing. The model comes in two variants: V4-Pro (1.6T parameters, 49B active, $0.55/M input) and V4-Flash (284B parameters, 13B active, $0.14/M input). DeepSeek's architectural advance combines Compressed Sparse Attention and Heavily Compressed Attention into a Hybrid Attention system that handles 1-million-token contexts efficiently.
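The article does not explain how Compressed Sparse Attention and Heavily Compressed Attention actually work, so the sketch below illustrates only the general hybrid idea: exact attention over a recent local window plus attention over a pooled compression of the distant prefix. Every name, shape, and pooling choice here is an assumption for illustration, not DeepSeek's algorithm.

```python
import torch
import torch.nn.functional as F

def hybrid_attention(q, k, v, window=512, pool=64):
    """Toy hybrid attention for one decoding step (illustrative only).

    q: (heads, 1, d); k, v: (heads, seq, d). Recent tokens get exact
    attention; the distant prefix is mean-pooled in chunks of `pool`
    tokens, shrinking the key/value count from O(seq) to
    O(window + seq / pool).
    """
    heads, seq, d = k.shape
    k_cat, v_cat = k[:, -window:], v[:, -window:]
    prefix = seq - window
    if prefix >= pool:
        n = (prefix // pool) * pool  # leftover tokens are dropped in this toy
        k_far = k[:, :n].reshape(heads, -1, pool, d).mean(dim=2)
        v_far = v[:, :n].reshape(heads, -1, pool, d).mean(dim=2)
        k_cat = torch.cat([k_far, k_cat], dim=1)
        v_cat = torch.cat([v_far, v_cat], dim=1)
    scores = q @ k_cat.transpose(-2, -1) / d ** 0.5
    return F.softmax(scores, dim=-1) @ v_cat
```

With window=512 and pool=64, a 1M-token context needs attention over roughly 512 + 10^6/64 ≈ 16k keys per step instead of a million, the kind of reduction that makes long contexts affordable.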
Strategic Coordination Signals New Competitive Tier
The tight 12-day release window and similar capability ceilings suggest strategic coordination among Chinese AI labs to establish a new competitive tier in coding AI. MiniMax's M2.7 rounded out the release wave, though details remain limited. All three models with published details (GLM-5.1, Kimi K2.6, DeepSeek V4) were developed without Nvidia hardware, demonstrating China's progress toward complete AI infrastructure independence. The models cost less than one-third as much as Claude Opus 4.7 while maintaining competitive performance on industry-standard benchmarks.
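The pricing claims can be sanity-checked with simple arithmetic. The snippet below works backwards from the figures quoted in this article; the Claude Opus 4.7 rate itself is not given here, so only the implied baselines are computed, and note that input- and output-token prices are not directly comparable.

```python
# Quoted prices from this article, in $ per million tokens.
quoted = {
    "Kimi K2.6 (output)": 0.60,
    "DeepSeek V4-Pro (input)": 0.55,
    "DeepSeek V4-Flash (input)": 0.14,
}

# "Less than one-third the cost of Claude Opus 4.7" implies a baseline
# above three times each quoted rate.
for name, price in quoted.items():
    print(f"{name}: ${price:.2f}/M -> implies Opus 4.7 > ${3 * price:.2f}/M")

# A "67% discount" means paying 33% of the baseline, so the implied
# Western frontier rate is price / 0.33.
flash = quoted["DeepSeek V4-Flash (input)"]
print(f"67% discount on ${flash:.2f}/M implies a ~${flash / 0.33:.2f}/M baseline")
```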
Key Takeaways
- Four Chinese AI labs released open-weight coding models within a 12-day window in late March to early April 2026, all achieving competitive performance at 67%+ lower cost than Western models
- GLM-5.1 topped SWE-bench Pro and can maintain autonomous execution for 8 continuous hours, trained on 100,000 Huawei Ascend 910B chips without any Nvidia hardware
- Kimi K2.6 beat GPT-5.4 on SWE-bench Pro as a 1-trillion-parameter model priced at $0.60 per million output tokens, capable of instantiating hundreds of collaborative agents
- DeepSeek V4-Flash costs just $0.14 per million input tokens while competing with Claude Opus 4.6 and GPT-5.4, using a Hybrid Attention architecture for efficient 1M-token context handling
- All major releases were built without Nvidia hardware, representing a significant milestone in Chinese AI sovereignty and infrastructure independence