Google's unreleased Gemini Omni video model appeared in early Gemini app tests on May 12, 2026, just days before Google I/O 2026, scheduled for May 19-20. The leak points to a unified multimodal model that handles video generation, remixing, and in-chat editing. That would set it apart from specialized video generators by integrating text, image, and video capabilities in a single system.
Discovery Timeline: From Hidden Code to Live UI
On May 2, 2026, a user discovered a UI string in Gemini's video generation tab reading "Start with an idea or try a template. Powered by Omni." By May 12, 2026, references to "Omni" appeared in the live Gemini app UI beyond hidden code, indicating progression past internal testing. Early testers identified the model ID as bard_eac_video_generation_omni and reported a 10-second generation limit for initial outputs.
Unified Model Across Text, Images, and Video
Unlike specialized video generators from competitors, Gemini Omni appears designed as a unified model handling multiple modalities. With a single model behind every output, quality and style could stay consistent across text, images, and video. The unified approach would also simplify creative workflows by removing the need to switch between separate image and video models: one prompt could produce a cohesive image and video sequence.
Early testing revealed capabilities including video remixing, direct in-chat editing, and generation from simple text prompts. The 10-second generation limit suggests the initial release will be more constrained than competitors that offer longer video outputs, though the limit may expand after launch.
Competitive Context: ByteDance Leads Benchmarks
ByteDance's Seedance 2.0 currently ranks at the top of most public video generation benchmarks, with Fast and Turbo variants making cinematic AI video financially viable for high-volume production. However, Seedance 2.0 and other leading models, including offerings from Runway, Pika, and Stability AI, function as specialized video generators without integrated text reasoning or image creation capabilities. If Gemini Omni successfully unifies these capabilities, it would occupy a distinct category as the first production-grade multimodal model spanning text, image, and video generation.
Expected Announcement at Google I/O 2026
Google I/O 2026 runs May 19-20, with Gemini and AI updates confirmed on the agenda. The Omni leak surfaced in the live UI one week before the conference, which aligns with typical pre-launch testing patterns for major product announcements. Google has not officially confirmed Gemini Omni, but the progression from hidden code strings to visible UI elements suggests imminent public release.
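The gaps in this timeline can be checked with a few lines of date arithmetic. The sketch below uses only the dates reported in this article; the variable and function names are illustrative and do not come from any Google source.

```python
from datetime import date

# Dates as reported in this article (illustrative variable names).
first_string_seen = date(2026, 5, 2)   # "Powered by Omni" UI string discovered
live_ui_seen = date(2026, 5, 12)       # "Omni" references visible in the live app
io_start = date(2026, 5, 19)           # first day of Google I/O 2026


def days_between(earlier: date, later: date) -> int:
    """Return the number of whole days from `earlier` to `later`."""
    return (later - earlier).days


hidden_to_live = days_between(first_string_seen, live_ui_seen)
live_to_io = days_between(live_ui_seen, io_start)

print(f"Hidden string to live UI: {hidden_to_live} days")
print(f"Live UI to I/O start: {live_to_io} days")
```

The second figure is the one-week gap between the live UI sighting and the start of the conference.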
Key Takeaways
- Google's Gemini Omni video model appeared in live Gemini app UI on May 12, 2026, days before Google I/O 2026 (May 19-20)
- Early testing reveals unified capabilities across text, images, and video in a single model, unlike specialized competitors
- Testers identified the model ID as bard_eac_video_generation_omni and reported a 10-second generation limit in initial testing
- ByteDance's Seedance 2.0 leads current video generation benchmarks, but lacks the integrated text and image capabilities that Omni appears to offer
- Discovery timeline shows progression from hidden UI strings on May 2 to live interface elements by May 12, indicating imminent launch