Google Announces Veo 3, Imagen 4, and Next-Gen AI Media Tools

May 20, 2025

Google today unveiled a suite of upgraded generative AI models that push the boundaries of AI-created media, with the ability to generate audio and video simultaneously for the first time.

Key Points

Veo 3 can now create videos with synchronized audio, including ambient sounds and dialogue
Flow combines Google's AI models into a filmmaking tool that maintains character and scene consistency
All generated content includes invisible SynthID watermarks to identify AI-created media

At its annual I/O developer conference, Google showed off major upgrades to its creative AI arsenal with Veo 3, Imagen 4, Flow, and Lyria 2 — a comprehensive suite of tools aimed at transforming how creators produce visual and audio content.

The most significant update comes from Veo 3, Google's latest video generation model. Unlike previous iterations, Veo 3 can generate not just video but also accompanying audio. The model can add background sounds like city traffic or birds chirping, and even generate dialogue with synchronized lip movements.

There's also Flow, a new AI filmmaking tool that combines the capabilities of Veo, Imagen, and Gemini. Flow moves beyond basic text-to-video generation by focusing on storytelling continuity, allowing creators to maintain consistent characters and scenes across multiple clips. The platform includes camera controls for precise movement, a scene builder for extending shots, and asset management tools for organizing creative elements.

Imagen 4, Google's newest image generation model, delivers improved detail rendering, handling everything from "intricate fabrics" to "water droplets" with remarkable clarity. The model also significantly improves typography, making it practical for creating text-heavy content like greeting cards and posters.

Music creators haven't been left out. Lyria 2, which powers Music AI Sandbox, gives musicians and producers tools to explore unique musical ideas. Google has also launched Lyria RealTime, an interactive music generation model that lets users create and control music in real time.

Addressing concerns about AI-generated content, Google announced all outputs from these models will include SynthID watermarks — invisible markers embedded at the pixel, audio frame, or text level. The company will also launch SynthID Detector, allowing anyone to verify if content was created using Google's AI tools.

Flow and Veo 3 are available immediately to Google AI Pro and Ultra plan subscribers in the U.S., with Imagen 4 accessible through the Gemini app, Whisk, Vertex AI, and various Google Workspace applications. The Ultra plan, priced at a substantial $249.99 per month, gives subscribers the highest usage limits and early access to Veo 3's audio generation capabilities.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
