Embracing Emotion: How Hume AI’s New Empathic Voice Interface Could Revolutionize Human-AI Interaction
A new technology debuting today lets artificial intelligence both express and perceive emotion. Built by Hume AI, a New York-based startup, the "empathic voice interface" can add emotionally expressive voices, and the ability to pick up on a user's emotional cues, to large language models from Anthropic, Google, Meta, Mistral, and OpenAI, a sign that AI assistants may soon get a lot more emotive.
"At Hume AI, our expertise lies in crafting AI characters with empathetic traits that communicate like humans, steering clear of the typical AI assistant clichés," explains Alan Cowen, a psychologist and cofounder of Hume AI. Cowen, who has an extensive background in AI and emotion research and has contributed to numerous studies, formerly contributed his expertise to projects at Google and Facebook focusing on emotional technologies.
WIRED tested Hume's latest voice technology, called EVI 2, and found that its performance closely resembles that of OpenAI's ChatGPT. (When OpenAI gave ChatGPT a flirtatious voice in May, CEO Sam Altman described the experience as feeling like "AI from films." The actress Scarlett Johansson later accused OpenAI of copying her voice.)
Like ChatGPT, Hume is far more emotionally expressive than conventional voice interfaces. Tell it that your pet has died, for example, and it responds in a suitably somber and sympathetic tone. And, as with ChatGPT, you can interrupt Hume mid-reply; it stops and adapts its response.
OpenAI has not said how much its voice interface tries to gauge users' emotions, but Hume's is explicitly designed to do so. During a conversation, Hume's developer interface shows values for measures like "determination," "anxiety," and "happiness" detected in the user's voice. If you speak to Hume in a sad tone, it picks up on that, something ChatGPT does not appear to do.
Hume also makes it easy to set up a voice with a particular emotional tone via a prompt field in its interface. Here's how it sounded when I asked it to be "sexy and flirtatious":
[Audio: Hume AI speaking in a "sexy and flirtatious" tone]
And when told to be "gloomy and sullen":
Hume AI's "gloomy and downhearted" announcement
And here's the especially harsh result when it was asked to be "angry and rude":
Hume AI's "hostile and discourteous" communication
The technology did not always feel as polished and smooth as OpenAI's, and it occasionally behaved erratically; at one point the voice suddenly sped up and spouted gibberish. But if the voice can be made more reliable, it should help make humanlike voice interfaces more common and more varied.
The idea of detecting, measuring, and simulating human emotion in technology goes back decades and is studied in a field known as "affective computing," a term coined in the 1990s by Rosalind Picard, a professor at the MIT Media Lab.
Albert Salah, a professor at Utrecht University in the Netherlands who studies affective computing, is impressed with Hume AI's technology and has demonstrated it to his students. "EVI appears to assign emotional valence and arousal values [to the user], and then adjusts the agent's speech based on these parameters," he says. He considers this a very novel approach for LLMs.
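To make that description concrete, here is a toy sketch of the general idea Salah describes (my own illustration, not Hume's actual implementation): collapse per-emotion scores into a rough valence/arousal estimate, then use that estimate to steer how the agent replies. The emotion list, weights, and thresholds are invented for demonstration.

```python
# Toy illustration of the valence/arousal idea described above; the emotion
# list, weights, and thresholds are made up for demonstration purposes.

# Rough valence (pleasantness) and arousal (excitement) weights per emotion.
AFFECT_MAP = {
    "happiness":     (0.9, 0.6),
    "determination": (0.4, 0.7),
    "anxiety":       (-0.6, 0.8),
    "sadness":       (-0.8, 0.2),
}


def estimate_affect(scores: dict) -> tuple:
    """Score-weighted average of valence and arousal across detected emotions."""
    total = sum(scores.values()) or 1.0
    valence = sum(AFFECT_MAP[e][0] * s for e, s in scores.items() if e in AFFECT_MAP) / total
    arousal = sum(AFFECT_MAP[e][1] * s for e, s in scores.items() if e in AFFECT_MAP) / total
    return valence, arousal


def reply_style(valence: float, arousal: float) -> str:
    """Map the user's estimated affect to a speaking style for the agent."""
    if valence < -0.3:
        return "slow, gentle, and sympathetic"
    if arousal > 0.6:
        return "upbeat and energetic"
    return "neutral and conversational"


# Example: a user who sounds mostly sad should get a sympathetic reply.
print(reply_style(*estimate_affect({"sadness": 0.7, "anxiety": 0.2})))
```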
Salah believes Hume's technology could be useful in marketing and in mental health care. But he notes that people often hide their true feelings, or shift emotional state during an interaction, which makes it hard for AI systems to identify what someone really feels. He also wonders how well the technology works in languages other than English, and says subtle biases could cause some accents to be treated differently, a problem Hume says it has mitigated by training on diverse data.
Cowen envisions a future in which voice assistants are far more attuned to our emotions, responding in ways that feel genuinely empathetic when we're upset. As AI-powered voice assistants proliferate, he argues, each will need a consistent character and emotional register to earn users' trust. "We're going to interact with numerous AIs," he says. "Being able to identify them by their voice alone, in my opinion, will be incredibly significant for what's ahead."
Jesse Hoey, a professor at the University of Waterloo who works on affective computing, stresses that large language models only simulate emotion; they don't actually feel anything. "Artificial intelligence assistants will seem increasingly empathetic going forward, yet I believe their empathy won't be genuine," he says. "Moreover, I'm convinced that the majority of people will recognize this superficial facade."
Even if a bot's feelings aren't real, playing to users' emotions could carry risks. OpenAI has said it is taking a careful approach with ChatGPT's voice interface and is studying whether it could prove addictive or unduly persuasive. Hume, for its part, has set up the Hume Initiative, which draws on outside experts for ethical guidance and oversight of how its technology is developed and deployed.
Danielle Krettek-Cobb, who has consulted for Hume and worked with Cowen at Google, says tech companies have been slow to explore technology's emotional side, but will need to if they want to build devices that are genuinely smarter. "In my view, the crucial element of human intelligence lies in its social and emotional dimensions," she says. "This is our fundamental way of perceiving and connecting with our surroundings; it's our innate interface."