SimpleNews.ai

OpenAI Launches GPT-5.3-Codex-Spark on Cerebras Chips, Breaking Nvidia Monopoly

Friday, February 13, 2026

OpenAI announced GPT-5.3-Codex-Spark on February 12, 2026, a specialized variant optimized for ultra-fast code generation that runs on Cerebras' Wafer Scale Engine 3 (WSE-3) chips rather than Nvidia GPUs. The model generates over 1,000 tokens per second with sub-millisecond time between tokens, marking OpenAI's first major production deployment on non-Nvidia hardware and signaling a strategic move to diversify chip suppliers.

GPT-5.3-Codex-Spark Delivers 15x Speed Increase Over Standard Version

The new variant is dramatically faster than existing coding models: GPT-5.3-Codex-Spark runs 15x faster than the standard GPT-5.3-Codex and 7x faster than Claude Opus 4.6's Fast Mode, with generation speeds exceeding 1,000 tokens per second. The system retains a 128K context window but is text-only, with no multimodal capabilities, and is currently available as a research preview for ChatGPT Pro users only.
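As a sanity check on the reported figures, a throughput of 1,000 tokens per second corresponds to an average gap of 1 ms between tokens, and the quoted speedups imply rough throughputs for the slower models. The sketch below is back-of-the-envelope arithmetic on the numbers above, not official benchmark data:

```python
# Relate tokens-per-second throughput to average inter-token latency,
# using the figures reported in the article.

def inter_token_latency_ms(tokens_per_second: float) -> float:
    """Average time between tokens, in milliseconds, at a given throughput."""
    return 1000.0 / tokens_per_second

spark_tps = 1000.0  # reported floor: "over 1,000 tokens per second"

# At 1,000+ tok/s the average inter-token gap is at most 1 ms, consistent
# with the "sub-millisecond time between tokens" claim.
print(inter_token_latency_ms(spark_tps))  # 1.0 (ms) at exactly 1,000 tok/s

# Implied throughput of the slower models, given the reported speedups:
standard_codex_tps = spark_tps / 15  # ~67 tok/s if Spark is 15x faster
opus_fast_tps = spark_tps / 7        # ~143 tok/s if Spark is 7x faster
print(round(standard_codex_tps, 1), round(opus_fast_tps, 1))  # 66.7 142.9
```

Since "over 1,000 tokens per second" is a lower bound, the implied figures for the other models are upper bounds on what the stated multipliers would suggest.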

Performance Trade-offs Balance Speed Against Accuracy

While delivering exceptional speed, the model deliberately trades away some accuracy on harder tasks. On SWE-Bench Pro extra-high-difficulty problems, Spark scores 51.5% versus 56.8% for standard Codex. The system is optimized for quick, simple coding tasks rather than complex problem-solving, a design choice aimed at developers who prioritize rapid iteration over maximum capability. According to OpenAI, the Codex platform has attracted over 1 million weekly active users.

First Major OpenAI Deployment Outside Nvidia Infrastructure

This release represents OpenAI's first significant production deployment on non-Nvidia hardware, addressing growing concerns about Nvidia's pricing power and chip availability. By partnering with Cerebras, OpenAI demonstrates that alternative AI chip architectures can deliver competitive, and on some metrics superior, performance. The move follows industry-wide efforts to reduce dependence on Nvidia's GPU ecosystem and diversify AI infrastructure suppliers.

Key Takeaways

  • GPT-5.3-Codex-Spark runs on Cerebras WSE-3 chips and achieves over 1,000 tokens per second with sub-millisecond time between tokens
  • The model is 15x faster than standard GPT-5.3-Codex and 7x faster than Claude Opus 4.6's Fast Mode
  • SWE-Bench Pro scores show 51.5% for Spark versus 56.8% for standard Codex on extra-high difficulty tasks
  • This marks OpenAI's first major production deployment on non-Nvidia hardware
  • The Codex platform has over 1 million weekly active users