OpenAI

OpenAI Begins Rollout of its Advanced Voice Mode

July 30, 2024 • 2 min read

Remember that mind-blowing demo of the new ChatGPT voice capabilities that OpenAI showed off back in May? Well, it's no longer just a demo. The company has started rolling out its "Advanced Voice Mode" to a select group of ChatGPT Plus users. If you're not one of the lucky few that the OpenAI gods have smiled upon, don't worry - all Plus users should get access by fall.

The new voice mode brings more natural, real-time conversations with the AI. Users can interrupt at any point, and the system is designed to sense and respond to emotions. It's a far cry from the rigid, turn-based interactions we've grown accustomed to with virtual assistants like Siri and Alexa.

In announcing the rollout today, OpenAI's CTO, Mira Murati, says they've found this technology to be more collaborative and helpful. But the path to today's release hasn't been smooth.

The company's original demo featured a voice called "Sky" which drew attention for what some considered to be an uncanny vocal resemblance to actress Scarlett Johansson. The actress denied any involvement with the project, shared that she had previously refused permission for her likeness to be used, and threatened legal action.

While OpenAI quickly denied the accusations of wrongdoing, and took the time to detail how their voices were developed, they nevertheless removed the controversial Sky voice as an option. For now, the system will use the four other preset voices that were also created with paid voice actors - Breeze, Cove, Ember, and Juniper.

Following this saga, OpenAI announced a delay in the launch of Advanced Voice Mode to conduct further safety testing and improve the model's ability to detect and refuse certain content. This includes adding built-in filters to block requests that could generate copyrighted audio and it's iterative deployment strategy.

This cautious approach reflects the broader challenges facing AI companies as they push the boundaries of what's possible. As AI gains more advanced multimodal capabilties and becomes more human-like in its interactions, issues of consent, copyright, and ethical use will become more critical.

While it's frustrating to have to wait (I know 😭), this rollout underscores a crucial point: when it comes to deploying advanced AI technology, an abundance of caution isn't just advisable—it's essential. As one of the frontier model providers, OpenAI has a moral imperative to balance their drive for innovation with responsible development and deployment.

And while the community has been critical of OpenAI's various delayed releases (ahem Sora, GPT-4o with video, GPT-5), we must also appreciate that this is a delicate tightrope to walk. Moreover, as a society, we need time. Time to increase AI literacy—for individuals, businesses and communities. Time to update our outdated laws. Time to develop governance, oversight and safeguards. And time to consider the role that we want AI to play in our lives.

So, while today's announcement means a more engaging AI experience is on the horizon, it also serves as a reminder of the complex landscape we're navigating as AI becomes more capable and plays an increasingly integral role in our work and personal lives.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.

An Exclusive Leadership Retreat

Leading in the Intelligence Age

OpenAI Begins Rollout of its Advanced Voice Mode