Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech

Fabio Ballati, Fulvio Corno, Luigi De Russis · 2018 · Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '18) · doi:10.1145/3234695.3236354

Summary

This paper evaluates how well three major smartphone virtual assistants (Apple's Siri, Google Assistant, and Microsoft's Cortana) understand and respond to Italian dysarthric speech. Dysarthria is a motor speech disorder characterized by slurred, slow, or difficult-to-understand speech, commonly caused by neurological conditions. The study focused on dysarthria induced by amyotrophic lateral sclerosis (ALS), recruiting eight participants (aged 64 to 83; four male, four female) from the Otolaryngology department of Molinette Hospital in Turin, Italy. Participants presented three types of dysarthria (flaccid, spastic, and unilateral upper motor neuron), with speech intelligibility rated as either "detectable speech disturbance" (6 participants) or "intelligible with repeating" (2 participants) on the ALSFRS-R scale. Each participant recorded 34 Italian sentences designed to cover common virtual assistant commands (weather, alarms, navigation, smart home) and to include all Italian phonemes. The recordings were played to each assistant and evaluated on two dimensions: question comprehension (QC), measured by Word Error Rate, and consistency in answer (CiA), that is, whether the assistant gave a coherent response.
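Word Error Rate is conventionally defined as the word-level edit distance between the reference sentence and the transcription (substitutions, deletions, and insertions), divided by the number of reference words. The sketch below illustrates that standard definition only; it is not the authors' scoring script, and the function name, the lowercase-and-split normalization, and the Italian example sentences are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the assistant stops listening halfway through.
truncated = word_error_rate("che tempo fa oggi a torino", "che tempo fa")
```

In this made-up example three of the six reference words are missing, so `truncated` is 0.5. Because insertions count as errors, WER computed this way can exceed 100% when the transcription is much longer than the reference.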

Key findings

Performance differed sharply across assistants. Google Assistant achieved the best speech recognition, with an average Word Error Rate (WER) of 24.88%, compared with 39.39% for Cortana and 70.89% for Siri; the differences were statistically significant (F(2,14)=30.06, p<.01). WER nonetheless depended heavily on the individual speaker: with Google Assistant, one participant (M1) reached 0% WER while another (M2) reached 63.92%. On the qualitative measure of question comprehension, Google Assistant properly transcribed 62% of sentences, versus 40% for Cortana and only 15% for Siri, and Siri failed entirely to recognize 17.65% of all sentences. For consistency in answer, both Google Assistant and Siri gave coherent answers approximately 54-60% of the time for properly transcribed questions, while Cortana defaulted to a web search 75.93% of the time rather than answering directly. A notable finding was that assistants often stopped listening during pauses in dysarthric speech: the slower speech rate and frequent hesitations characteristic of dysarthria led them to end recognition prematurely, producing incomplete transcriptions. Question comprehension appeared unrelated to the specific type of dysarthria and depended instead on each user's individual vocal characteristics.
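The premature cut-off during pauses can be illustrated with a toy silence-timeout endpointer. This is a minimal sketch under stated assumptions, not a description of how Siri, Google Assistant, or Cortana actually decide when to stop listening: the function name, the per-frame voiced/unvoiced input, and the timeout values are all hypothetical.

```python
from typing import Sequence


def frames_captured(voiced: Sequence[bool], max_silent_frames: int) -> int:
    """Return how many frames are captured before listening stops.

    Listening stops once `max_silent_frames` consecutive unvoiced frames
    follow at least one voiced frame; otherwise all frames are captured.
    """
    silent_run = 0
    heard_speech = False
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= max_silent_frames:
                return i + 1  # stop here; later frames are never heard
    return len(voiced)


# Hypothetical utterance: speech, a 5-frame hesitation, then more speech.
utterance = [True] * 4 + [False] * 5 + [True] * 4

impatient = frames_captured(utterance, max_silent_frames=3)
patient = frames_captured(utterance, max_silent_frames=8)
```

With the short timeout the toy endpointer gives up after 7 of the 13 frames and never hears the second half of the utterance; with the longer timeout it captures all 13. A longer or user-adjustable timeout is one way to read the study's call for more patient assistants, at the cost of slower responses for fluent speakers.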

Relevance

This study fills an important gap in accessibility research by examining virtual assistant usability in a non-English language and for a specific clinical population. Virtual assistants are increasingly promoted as accessibility aids that enable hands-free device control for people with motor disabilities, yet this research shows that the conditions causing motor impairment (such as ALS) often also affect speech, creating a paradox in which the people who most need voice control may be least able to use it. Siri's average WER of 70.89% underscores how far mainstream speech recognition still has to go with atypical speech. For practitioners and developers, the study highlights several actionable issues: assistants should tolerate slower speech (not cutting off during pauses), should offer adaptation to individual speech patterns, and should handle partial recognition more gracefully. The decision to publish the dysarthric speech dataset is also valuable, as the scarcity of training data for atypical speech in non-English languages is a major barrier to improving recognition accuracy.

Tags: speech recognition · dysarthria · virtual assistant · voice interface · ALS · speech accessibility · multilingual accessibility

Standards referenced: ALS Functional Rating Scale-Revised (ALSFRS-R)