The number of people using digital assistants is growing by the day, and this rising popularity has led Gartner to predict that as many as 75% of US households will own smart speakers by 2020. Within this expansive growth, several brands of assistants, including Amazon Alexa, Google Home, and Microsoft Cortana, are taking the lead. Their offerings share many similarities, and of course differences too, but on the most obvious characteristic, what do these devices all have in common? The voice behind the technology, in each device, is female.
A recent study questioning the design of artificial intelligence revealed that, out of almost 12,000 people from 139 countries, 44% prefer their digital assistant to be gender neutral. When broken down by gender, however, 36% of men thought the assistant should be gender neutral, compared with 62% of women. While most assistants offer voices of either gender, the default is female, and a gender-neutral option is missing entirely. This opens up the question: why?
As digital assistants become more ubiquitous, tech companies are beginning to recognize the parallel between the voice, whether female or male, and the role of the assistant itself. Alexa, Siri, Cortana, and Google Assistant are all synthesized versions of a woman, required to answer questions and demands in a polite manner. On the other hand, IBM’s Watson is male and holds a higher role of leadership and knowledge compared to its female counterparts. These choices can be linked to norms tied to tradition and other cultural values, and they further gender bias.
Examining this concern over why the default voice is female, many AI companies have moved, or are considering moving, toward a more inclusive design. But if bias is present in voice AI, how are speech recognition, natural language processing, text-to-speech, and voice biometrics technologies affected by that same bias?
Each of these technologies requires large amounts of data for machine learning, and male voices dominate these datasets. Speaker recognition systems find women’s voices harder to recognize because the training data contains more male voices. So for Pindrop’s Deep Voice™ biometric engine, how does gender impact accuracy?
We take great care to balance gender in our dataset evaluation. Today we work with the largest banks globally. In the US, the top banks account for 60% of the population demographic, which is evenly split between men and women.
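To make the idea of a gender-balanced evaluation concrete, here is a minimal sketch of how per-gender accuracy could be measured for a speaker-verification system. This is purely illustrative and is not Pindrop’s actual pipeline: the function names, the trial data format, and the use of equal error rate (EER) as the metric are assumptions chosen for the example.

```python
# Illustrative sketch (not Pindrop's actual pipeline): checking whether a
# speaker-verification model performs equally well for female and male voices
# by computing a per-group equal error rate (EER) on a gender-balanced test set.
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate the EER: the operating point where false-accept and
    false-reject rates are closest to equal.

    scores -- similarity scores from the verification model (higher = more similar)
    labels -- 1 for genuine (same-speaker) trials, 0 for impostor trials
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, 1.0
    for t in np.sort(np.unique(scores)):
        accepted = scores >= t
        far = np.mean(accepted[labels == 0])    # impostors wrongly accepted
        frr = np.mean(~accepted[labels == 1])   # genuine speakers wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

def per_gender_eer(trials):
    """Group verification trials by speaker gender and report each group's EER.

    trials -- list of dicts with keys: 'score', 'genuine' (bool), 'gender' ('F'/'M')
    """
    report = {}
    for gender in ("F", "M"):
        subset = [t for t in trials if t["gender"] == gender]
        scores = [t["score"] for t in subset]
        labels = [int(t["genuine"]) for t in subset]
        report[gender] = equal_error_rate(scores, labels)
    return report

if __name__ == "__main__":
    # Toy synthetic scores, only to show the shape of the evaluation.
    rng = np.random.default_rng(0)
    trials = (
        [{"score": s, "genuine": True,  "gender": g}
         for g in "FM" for s in rng.normal(0.8, 0.1, 500)] +
        [{"score": s, "genuine": False, "gender": g}
         for g in "FM" for s in rng.normal(0.4, 0.1, 500)]
    )
    print(per_gender_eer(trials))
```

If one group’s error rate is noticeably higher than the other’s on a balanced test set, that gap is a direct signal that the training data or model is skewed, which is exactly the kind of imbalance a gender-balanced evaluation is designed to surface.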
Artificial intelligence is destined to power some of our most important services, but there is growing concern that, because of the way it is built, it could repeat much of the prejudice humans hold about race, gender, and more. The major players are doing significant work to evaluate these systems and remove that prejudice from AI.