
Google Unveils Imagen 2, Its Most Advanced Text-to-Image AI Yet

December 13, 2023 • 3 min read

Google has taken the wraps off Imagen 2, the latest iteration of its text-to-image AI model. This new model promises even more realistic and detailed image generation capabilities through advanced neural network techniques.

Unlike OpenAI's DALL-E 3 or tools from Adobe and Midjourney, Imagen 2 is offered as an API for developers rather than as a standalone consumer app.

Building on the existing Imagen API launched last year, Imagen 2 boasts significantly improved image quality and understanding of text prompts. Through changes to its training data and methodology, Imagen 2 generates higher-resolution, more aesthetically pleasing images that closely match the descriptions provided.

A visualization of how Imagen 2 makes it easier to control the output style by using reference images alongside a text prompt.

Specifically, Google enhanced the image captions used to train Imagen 2, helping the model better grasp elements like context and nuance. Additional training focused on improving Imagen 2’s rendering of challenging areas like hands and faces, and on minimizing visual artifacts. The company also employed an image quality scoring system to further refine outputs.

Imagen 2 also introduces new features that give users more control over image properties. Users can now provide style reference images, and Imagen 2 adopts style attributes from them, such as lighting, textures, and color palettes.

Imagen 2 can generate new content directly into the original image with inpainting.

The API also gains advanced inpainting and outpainting capabilities for inserting generated content into existing images or expanding images beyond their borders.

Imagen 2 can extend the original image beyond its borders with outpainting.

Multilingual support covers prompts and outputs in seven languages so far, with more to come. Imagen 2 can even render text within images in the appropriate language.

This opens rich branding and localization possibilities. Logo generation allows users to conjure custom logos which can then be neatly integrated into other media.

Google says responsibility remains central to Imagen 2’s development. Before launch, the company conducted safety testing around sensitive categories to avoid issues. Imagen 2 also connects with Google’s SynthID tool to imperceptibly watermark AI-generated images at the pixel level, enabling authentication and tracing.

For allowlisted paying customers, Imagen 2 is now available via Google’s Vertex AI platform. The launch already counts major creative brands like Snap, Shutterstock, and Canva among early adopters.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
