Publication Year
0
Summary

This study presents CDLI’s community-driven methodology for creating an impaired speech corpus in a low-resource language (LRL), specifically Akan, which is spoken by around 22 million people, or 80% of the population of Ghana. The project adapted an open-source data collection app, incorporating both image and text prompts appropriate for people living with impaired speech. Data were collected both in person and virtually, with speech and language therapists screening potential participants for speech severity and cognitive skills. Thirty hours of audio data were collected from people living with cerebral palsy, stammering, and cleft palate. The paper discusses the challenges encountered in collecting and transcribing Akan, whose writing system is still evolving. It also explores adapting the open-source Whisper model by fine-tuning a base Akan model (trained on approximately 100 hours of unimpaired Akan speech) using the collected impaired speech data. Initial results show a median relative word error rate (WER) reduction of 21.7% on the impaired speech test set, and highlight the significant performance gap of standard automatic speech recognition (ASR) on disordered speech (baseline median WER of 84.6%). The study identifies data quality and transcription inconsistencies as key areas for future improvement. The resulting dataset, cookbook, and open-source tools will be made publicly available.
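
For readers unfamiliar with the metric, the following is a minimal sketch of how a WER and a median relative WER reduction can be computed. It is not the authors' evaluation code: the function names, the word-level tokenisation by whitespace, and the idea of taking the median over (baseline, adapted) WER pairs are all assumptions made for illustration, since the summary does not say whether the median is taken per speaker or per utterance.

```python
from statistics import median


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: words are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction of the baseline WER removed by the adapted model."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


def median_relative_reduction(wer_pairs) -> float:
    """Median of relative reductions over (baseline, adapted) WER pairs.

    Assumption: each pair is one evaluation unit (e.g. one speaker).
    """
    return median(relative_wer_reduction(b, a) for b, a in wer_pairs)
```

Note that a median of per-unit relative reductions is generally not the same as the relative reduction between two median WERs, so the adapted model's median WER cannot be derived from the two figures quoted in the summary alone.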

Associated Organisation
Publication Status
In Review