Stable Diffusion XL (SDXL), the latest breakthrough in AI image generation from leading startup Stability AI, represents a major leap forward in creating astonishingly high-quality and creative artwork using natural language prompts.
The unveiling of this much-anticipated model follows months of community engagement and testing, with the AI community's feedback playing a pivotal role in shaping the final product. It is now available on GitHub, DreamStudio, Clipdrop, Amazon SageMaker JumpStart, and Amazon Bedrock, as well as through the API.
The newly released SDXL 1.0 model sets a new high bar for AI art across metrics including overall image quality, freedom of expression, concept rendering, simplicity of prompting, and accuracy.
SDXL achieves superior photorealism, artistic style replication across genres, and handling of difficult concepts like hands and text. This enables prompting with just a few words to yield intricate, aesthetically pleasing images that accurately match the desired vision.
SDXL's enhancements stem from improvements to the CLIP model used alongside the Diffusion model. This strengthened ability to comprehend text means SDXL grasps nuanced distinctions, allowing more control directly from prompts without needing advanced techniques. Users can easily produce very specific creative visions without an imposed style.
The release fulfills Stability AI's goal of driving AI art technology forward through transparency and democratization. With models available through GitHub, APIs, apps, and cloud platforms, professionals and hobbyists alike can access SDXL's step-change capabilities. Stability also plans additional features, such as controlnets, to further unlock SDXL's potential.
Stability AI's alliance with Amazon Web Services (AWS) has been instrumental in this endeavor. In 2022, Stability AI chose AWS as its preferred cloud provider, building its foundation models on Amazon SageMaker. When Amazon Bedrock launched in April, Stability AI's Stable Diffusion was among the first foundation models made available on it.
"Model choice is paramount to maximize the value customers get from generative AI," said Swami Sivasubramanian, Vice President of Database, Analytics, and Machine Learning at AWS. "By expanding Amazon Bedrock's selection with the addition of Stability AI's SDXL 1.0 model, we're giving customers access to a state-of-the-art text-to-image model to build and scale exciting, new generative AI applications."
Some of the most exciting features of SDXL include:
- The highest-quality text-to-image model: Blind testers rated SDXL's images best in overall quality and aesthetics across a variety of styles, concepts, and categories. Compared to other leading models, SDXL shows a notable bump in overall quality.
- Freedom of expression: Best-in-class photorealism, as well as an ability to generate high-quality art in virtually any art style. Images come out distinct, without any particular ‘feel’ imparted by the model, ensuring absolute freedom of style.
- Enhanced intelligence: Best-in-class ability to generate concepts that are notoriously difficult for image models to render, such as hands and text, or spatially arranged objects and persons (e.g., a red box on top of a blue box).
- Simpler prompting: Unlike other generative image models, SDXL requires only a few words to create complex, detailed, and aesthetically pleasing images. No more need for paragraphs of qualifiers.
- More accurate: Prompting in SDXL is not only simpler but also truer to the intention of the prompt. SDXL’s improved CLIP model understands text so effectively that concepts like “The Red Square” are understood to be different from “a red square”. This accuracy allows much more to be done to get the perfect image directly from text, even before using the more advanced features or fine-tuning that Stable Diffusion is famous for.
- All of the flexibility of Stable Diffusion: SDXL is primed for complex image design workflows that include generation from text or a base image, inpainting (with masks), outpainting, and more. SDXL can also be fine-tuned for concepts and used with controlnets. Some of these features will arrive in forthcoming releases from Stability.
Today's announcement follows a series of corporate milestones at Stability AI, including the launch of its new developer platform site and the release of Stable Doodle, a sketch-to-image tool that generated more than 3 million images in the week following its release.
While ethical concerns remain about AI art, Stability hopes community-oriented development will lead to responsible progress. With SDXL, Stability AI has achieved an impressive balancing act: an accessible yet extraordinarily capable model, and an open yet guided approach. Its community engagement and engineering execution have yielded another AI milestone. The path ahead looks bright for continued enhancement of what humans and AIs can create together.