Voice recognition systems are sexist: they handle female voices worse than male ones. The issue isn’t new, but it was raised again in a recent blog post by Delip Rao, CEO and co-founder of R7 Speech Sciences, a startup that uses AI to understand speech.
With the rise of voice-activated digital assistants such as Apple’s Siri, Amazon’s Alexa and Google Home, it’s an important problem to raise.
“In speech, we measure the mean fundamental frequency (which correlates with our perception of ‘pitch’). This is also called mean F0. The range of tones produced by our vocal tract is a function of the distribution around that,” according to Rao.
“You could write a simple, rule-based, gender classifier if you had the mean F0 from audio. From many sources, we know the mean F0 for men is around 120Hz and much higher for women (~200Hz).”
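As a rough illustration of the kind of rule Rao describes, the sketch below labels a recording from its mean F0 alone. The 120Hz and 200Hz figures come from his quote. Everything else is an assumption made for the example: the 160Hz midpoint threshold, the function names, and the convention that unvoiced frames are marked as 0 or None. It also assumes the per-frame F0 values have already been extracted by a separate pitch tracker.

```python
from statistics import mean

# The two means are the approximate figures quoted by Rao.
MALE_MEAN_F0_HZ = 120.0
FEMALE_MEAN_F0_HZ = 200.0

# Assumption: split at the midpoint. The article does not specify a threshold.
THRESHOLD_HZ = (MALE_MEAN_F0_HZ + FEMALE_MEAN_F0_HZ) / 2  # 160 Hz


def mean_f0(frame_f0_hz):
    """Average F0 over voiced frames; unvoiced frames (0 or None) are skipped."""
    voiced = [f for f in frame_f0_hz if f]
    if not voiced:
        raise ValueError("no voiced frames to average")
    return mean(voiced)


def classify_by_mean_f0(frame_f0_hz, threshold_hz=THRESHOLD_HZ):
    """Hypothetical one-rule classifier: compare mean F0 against a threshold."""
    return "female" if mean_f0(frame_f0_hz) >= threshold_hz else "male"


print(classify_by_mean_f0([118.0, 0, 125.0, 121.0]))    # male
print(classify_by_mean_f0([195.0, 210.0, None, 204.0]))  # female
```

A single threshold like this will mislabel anyone whose voice sits near the boundary, so it is best read as a demonstration of how strongly pitch separates the two groups on average, not as a usable classifier.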
Rachael Tatman, a data scientist at Kaggle who holds a PhD in linguistics from the University of Washington, explained to The Register that the problem doesn’t stem only from neural networks having too few female voices to learn from. It’s also an inherent technical problem, down to the fact that women generally have higher-pitched voices. Read more from theregister.co.uk…