Last month, Meta AI unveiled Audiobox, a powerful new foundation model for custom audio generation. Audiobox allows users to create customized voices, sound effects, and more using simple text prompts and voice inputs.
Today, the company launched a new interactive website to enable hands-on experimentation with Audiobox, and announced a research grant that invites external collaborations to ensure the technology’s responsible development.
To showcase Audiobox's versatility, the website provides demos of six Audiobox features that users can try:
- Your Voice: Generate speech in the style of any audio sample
- Described Voices: Generate speech in novel voice styles from a text description
- Restyled Voices: Change the style of any audio sample using a text description (combines Your Voice and Described Voices)
- Sound Effects: Generate sound effects from a text description
- Magic Eraser: Erase noise from speech recordings
- Sound Infilling: Replace a portion of your audio with new sounds
The website also offers "Audiobox Maker," a creative space for building custom audio stories that combine the model's capabilities. The tool lets you create a multi-track audio file with one or more voice narrators as well as sound effects.
To add speech to your story, you must first register a new voice, either by describing its characteristics (as well as the environment and any accompanying sounds) or by recording your own.
If you choose to record your own voice, the system generates a prompt that you must read within 50 seconds. You can add sound effects to your story by describing the sound you would like to create. You can also browse and reuse saved voices to keep your story's characters consistent.
Along with the demo website, Meta announced the Audiobox Responsible Generation Grant. The program will provide research teams with funding and access to Audiobox for research into safety, fairness, ethics, and other priority issues surrounding generative audio technology.
The Fundamental AI Research group at Meta is accepting applications for up to 10 grants worth up to $50,000 each. The deadline to apply is February 2nd, 2024. Teams focused specifically on responsible and ethical issues are strongly encouraged to apply.
In the request for proposals, Meta stated that "the known issues with AI cannot be addressed by any individual or single organization alone. That's why collaboration with the research community on state-of-the-art models is more important now than ever."
The release of Audiobox and its accompanying interactive demos is a significant step toward democratized audio content creation. As Meta notes, producing high-quality, customized audio currently requires extensive expertise and resources. By lowering these barriers, Audiobox opens new creative possibilities in fields such as film, gaming, and podcasting.
With safeguards such as watermarking and authentication in place, Audiobox aims to promote innovation through academic partnerships while limiting potential harms. Generative audio technology remains in its infancy, but it holds enormous creative potential if nurtured responsibly. Meta's latest announcements bring these possibilities to professional creators and casual users alike.