Hume Unveils EVI 2, Its New Voice-to-Voice Foundation Model

By Chris McKay, September 11, 2024

Hume has introduced EVI 2, a new voice-language foundation model that handles speech and text processing in a single system. The model promises more natural conversation, better emotional responsiveness, and richer customization options than its predecessor.

EVI 2 builds on its predecessor with improved voice quality and faster response times. The model holds human-like conversations with sub-second latency, averaging around 500 ms, a 40% reduction compared to EVI 1. The lower latency makes interactions between users and the AI more fluid and natural.
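To put the latency figure in context, the two numbers quoted above can be combined into a rough estimate of EVI 1's average latency. The sketch below is only back-of-the-envelope arithmetic: it assumes the 40% reduction applies to the same average-latency measurement as the roughly 500 ms figure, and the function name `implied_baseline_ms` is hypothetical. The result is an implied value, not a number reported by Hume.

```python
def implied_baseline_ms(current_ms: float, reduction: float) -> float:
    """Return the earlier latency implied by a fractional reduction.

    If new = old * (1 - reduction), then old = new / (1 - reduction).
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return current_ms / (1.0 - reduction)


evi2_ms = 500.0   # approximate average latency quoted in the article
reduction = 0.40  # 40% reduction relative to EVI 1, as quoted

# Assumption: both figures describe the same average-latency metric.
evi1_ms = implied_baseline_ms(evi2_ms, reduction)
print(f"Implied EVI 1 average latency: ~{evi1_ms:.0f} ms")
```

Under that assumption, the implied EVI 1 average is a little over 800 ms, so the quoted reduction corresponds to roughly a third of a second saved per response.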

A standout feature of EVI 2 is its advanced emotional intelligence. By processing voice and language simultaneously, the model can better understand the emotional context of user inputs and generate empathic responses in both content and tone. This capability allows EVI 2 to adapt its personality and speaking style to suit different applications and user preferences.

Developers can now adjust EVI 2's voice characteristics along parameters such as gender, nasality, and pitch. Because this customization does not rely on potentially risky voice cloning technology, it lets developers create unique voices tailored to specific apps or users.

EVI 2 is also priced 30% lower than its predecessor, which makes it a more attractive option for developers looking to integrate advanced voice AI into their applications.

While EVI 2 is currently available in beta, Hume plans to continue improving the model's reliability, language support, and instruction-following capabilities in the coming weeks. The company is also working on a larger version of the model, EVI-2-large, which will be announced soon.

As voice AI technology continues to evolve, EVI 2 represents a step toward more natural, emotionally intelligent, and personalized AI interactions. Its potential applications range from customer service to entertainment.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.