Bias in Voice Recognition Systems

Picture of Siri

Have you ever experienced Siri misunderstanding your speech? Do you feel that it oddly misunderstands YOUR speech in particular? Well, that might not be your fault. Siri actually DOES show bias based on ethnicity and race. Throughout the semester in my User Centered Research and Evaluation class, I worked on bias in voice recognition systems.

David Greene, AI Summit 2021

Amazon Alexa showed an accuracy score (based on word error rate) of 83.3% in English, while it showed 68.8% in German.

Harvard Business Review

Indian English has a 78% accuracy rate and Scottish English has a 53% accuracy rate.

Claudia Lopez Lloreda

Training data is the issue and may have excluded dialects from African American Vernacular English (AAVE).
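
The accuracy figures quoted above are typically derived from word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference text, divided by the number of words in the reference. The sketch below is only an illustration of that standard calculation; the function name and example sentences are my own assumptions, and it is not necessarily how the cited sources computed their scores.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of five reference words.
wer = word_error_rate("order two boxes of tea", "order two box of tea")
print(f"WER: {wer:.2f}")
```

A lower WER means better recognition. Some reports present "accuracy" as 1 minus WER, which may be how percentages like the ones above are expressed.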

Based on my background research, there is clear evidence of racial and ethnic bias in voice recognition systems. It is not just the language that affects accuracy; dialect can affect it heavily as well. So how can we implement a reporting method in voice recognition systems to prevent racial/ethnic bias?

In my research, I used a mixed-methods approach, including usability testing interviews and surveys, to explore user experiences comprehensively, and I used speed dating and prototyping to investigate designs for a possible reporting method.

Mixed Methods

- Usability Interview

- Survey

Design Research

- Speed Dating

- Prototyping

1. Usability Interview

I interviewed two active voice recognition users with Korean dialects. Two tasks were performed using a think-aloud protocol: 1) ordering certain items through Alexa, and 2) writing a text to a friend.

Picture of usability interview

This is one piece of evidence from the usability interviews, showing the interaction with a participant who had a Korean accent. Even though the participant repeated the request continuously, the result was unsatisfactory, as shown. The participant commented that the problem happens frequently and is annoying.

2. Survey

I conducted a survey with eight questions, targeted at six people who have used a voice recognition system before. You can check the link here.

Result 1
Result 2

More than half of the respondents agreed that they feel there is a difference in accuracy based on language and dialect. Their satisfaction ratings also differed by language.

From the mixed-methods research, we learn that users with specific dialects often encounter misunderstandings in voice recognition systems.

3. Speed Dating

For speed dating, I used three storyboards based on risk level. I started with users’ need for a simple reporting method that can report any inconvenience. To see the whole storyboard, please check this slide. Interestingly, multiple users were willing to share private information for a better user experience, which was one of the progressively riskier scenarios. One participant commented that 'even though any recording or geographical information should not be collected, this may be able to provide the most straightforward.'

4. Prototyping

For prototyping, I recruited five participants who have used Siri before. I tested the riskiest assumption: that people are willing to sacrifice some of their private information for a better user experience. Among four different kinds of private information, 1) language setting, 2) voice recording, 3) ethnicity, and 4) geographic location, I examined what users are willing to provide. To see the whole prototype, please check this video.

Picture of prototype

From the prototype, we can see that multiple participants were positive about sharing their private information, regardless of the type of information. Some participants even pressed agree on all of them without reading the specific terms.

From design research, I learned that users are willing to share private information for an enhanced user experience, contingent on simplified reporting.

For future directions, we should be aware that people sometimes do not even realize they are sharing their private information. We should clearly state what information we are collecting, and should provide simple methods for reporting, including a direct button or a "report" keyword. Also, a more diverse survey and a hi-fi prototype will be necessary.
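
As a rough illustration of the reporting idea above, the sketch below shows one way a "report" keyword and explicit per-item consent could fit together. Everything in it is an assumption made for illustration: the names `CONSENT_FIELDS`, `MisrecognitionReport`, `is_report_request`, and `build_report` are hypothetical and do not come from the prototype or from any real assistant API.

```python
from dataclasses import dataclass, field

# The four kinds of private information tested in the prototype.
CONSENT_FIELDS = (
    "language_setting",
    "voice_recording",
    "ethnicity",
    "geographic_location",
)


@dataclass
class MisrecognitionReport:
    transcript: str                      # what the system thought it heard
    shared: dict = field(default_factory=dict)  # only consented items


def is_report_request(transcript: str) -> bool:
    """True when the utterance begins with the hypothetical "report" keyword."""
    words = transcript.strip().lower().split()
    return bool(words) and words[0] == "report"


def build_report(transcript: str, user_data: dict, consent: dict) -> MisrecognitionReport:
    """Attach only the items the user explicitly agreed to share."""
    shared = {
        name: user_data[name]
        for name in CONSENT_FIELDS
        if consent.get(name, False) and name in user_data
    }
    return MisrecognitionReport(transcript=transcript, shared=shared)


# Hypothetical usage: the user consents to sharing the language setting only.
user_data = {"language_setting": "en-IN", "geographic_location": "Pittsburgh"}
consent = {"language_setting": True, "geographic_location": False}
report = build_report("order two box of tea", user_data, consent)
print(report.shared)
```

Defaulting every item to "not shared" unless the user opts in is one way to address the finding that some participants agreed to everything without reading the terms.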

Through the session, I was able to learn the proper way to conduct user research. I hope I can use the methods I learned in industry, and when I start my own business.