Stability AI Unveils SDXL Turbo for Real-Time Text-to-Image Generation

The model uses a new technique called Adversarial Diffusion Distillation, which gives it advantages associated with Generative Adversarial Networks, such as single-step image generation, while avoiding the artifacts and blurriness often seen with other distillation methods.

Image Credit: Stability AI

Stability AI has released SDXL Turbo, its latest text-to-image generation model. Using a novel technique called Adversarial Diffusion Distillation (ADD), SDXL Turbo can create detailed images in real time from short text prompts while maintaining high fidelity.

As detailed in Stability AI’s research paper, ADD enables SDXL Turbo to condense the text-to-image process down to a single step, whereas previous models like SDXL 1.0 required 50 steps to output an image. This reduction in step count significantly cuts the computational power and time needed to generate images without sacrificing visual quality.

Examples of images generated with SDXL Turbo

In comparative testing against other state-of-the-art diffusion models, human evaluators consistently ranked SDXL Turbo’s outputs as higher quality, even though the model used far fewer inference steps. With the ADD technique, SDXL Turbo combines strengths of both diffusion models and Generative Adversarial Networks, avoiding common issues like blurring or overly smooth images.

In practical terms, the speed of SDXL Turbo is remarkable. On an A100 GPU, the model can generate a 512x512 image in just over 200 milliseconds, a figure that includes prompt encoding, denoising, and decoding. ADD unlocks order-of-magnitude faster image generation compared to multi-step approaches, opening up new possibilities for real-time image creation applications driven by natural language prompts.
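
To put that latency in perspective, the short Python sketch below converts a per-image time into a rough throughput. The 200 ms value is a rounded assumption taken from the "just over 200 milliseconds" figure above, not a measurement, and the function names are illustrative only.

```python
# Back-of-the-envelope throughput from a per-image latency figure.
# LATENCY_MS is an assumed round number based on the article's
# "just over 200 milliseconds"; it is not a measured value.
LATENCY_MS = 200.0


def images_per_second(latency_ms: float) -> float:
    """Upper bound on throughput when images are generated one at a time."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def images_in_window(latency_ms: float, window_s: float) -> int:
    """Whole images that fit in a time window, generated back to back."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return int(window_s * 1000.0 // latency_ms)


print(images_per_second(LATENCY_MS))      # images per second
print(images_in_window(LATENCY_MS, 60))   # images per minute
```

Because the quoted figure is slightly above 200 ms, these numbers are best read as an upper bound for a single A100 generating one image at a time.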

However, the current release of SDXL Turbo does have some limitations worth noting. The images are fixed at 512x512 pixel resolution and the model cannot render legible text. Faces and human figures may not always generate properly. So while SDXL Turbo’s ADD technique marks a rapid advance in text-to-image generation, expectations should be tempered.

If you are interested in trying the SDXL Turbo model, Stability AI has made it available for free on Clipdrop. The beta demo showcases the real-time text-to-image generation capabilities of the model and is accessible to most internet users.

SDXL Turbo is currently being released under a non-commercial research license that permits personal, non-commercial use. The model weights and code are available on Hugging Face.
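
For readers who want to run the weights locally, here is a minimal sketch of single-step generation. It assumes the Hugging Face diffusers library with PyTorch on a CUDA GPU, and it assumes the repository id "stabilityai/sdxl-turbo"; neither detail appears in the announcement above, and the helper names are hypothetical.

```python
# Assumed Hugging Face repository id; not stated in the article.
MODEL_ID = "stabilityai/sdxl-turbo"


def turbo_call_kwargs(prompt: str) -> dict:
    """Build arguments for a single-step call: one step, guidance disabled."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "num_inference_steps": 1,  # the single step described in the article
        "guidance_scale": 0.0,     # classifier-free guidance switched off
    }


def load_pipeline(device: str = "cuda"):
    """Load the pipeline in half precision (assumes diffusers and torch)."""
    # Imported here so the helpers above work without the GPU stack installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, variant="fp16"
    )
    return pipe.to(device)


def generate(pipe, prompt: str):
    """Run the pipeline once and return the first generated image."""
    return pipe(**turbo_call_kwargs(prompt)).images[0]
```

A typical use would be pipe = load_pipeline() followed by generate(pipe, "a red fox in the snow").save("fox.png"). The first call downloads the weights, and any use remains subject to the non-commercial license described above.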

