An End-to-End Approach for Child Reading Assessment in the Xhosa Language
Project Overview
This project explores the use of generative AI in education through an end-to-end approach for assessing the reading ability of children who speak low-resource languages such as Xhosa. Literacy is critical to a child's future success, and AI can enhance and automate reading assessment. The research applies automatic speech recognition (ASR) to detect the pronunciation difficulties of young readers. It introduces a dataset of Xhosa child speech samples and evaluates the performance of advanced ASR models on it. Although the technology is promising, the findings point to significant hurdles, particularly data scarcity and the distinct acoustic characteristics of children's voices. The work illustrates the potential of AI in education while acknowledging how difficult these technologies are to deploy effectively in low-resource contexts.
Key Applications
Automatic Speech Recognition (ASR) for reading assessment
Context: Assessment of children's reading skills in low-resource languages (Xhosa) for early grades in schools in South Africa.
Implementation: Developed a dataset of 14,971 recordings of children pronouncing Xhosa words, labeled by multiple markers and validated by an expert.
Outcomes: Improved assessment accuracy and tracking of children's reading progress; automation reduces human error and effort.
Challenges: Limited training data for low-resource languages; performance of ASR models can be affected by noise and pronunciation variability.
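One way to picture how an ASR transcript could feed a word-level reading score is sketched below. This is a minimal illustration, not the authors' method: the function names, the character-level edit distance, the `max_error_rate` threshold, and the example words are all assumptions introduced here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def is_read_correctly(target: str, recognized: str,
                      max_error_rate: float = 0.0) -> bool:
    """Hypothetical scoring rule (an assumption, not from the paper):
    accept the attempt when the character error rate against the
    target word does not exceed max_error_rate."""
    target = target.strip().lower()
    recognized = recognized.strip().lower()
    if not target:
        raise ValueError("target word must be non-empty")
    return edit_distance(target, recognized) / len(target) <= max_error_rate
```

With the default threshold of 0.0 only an exact match counts as correct. A looser threshold would tolerate small recognition errors, at the cost of also accepting some genuine mispronunciations.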
Implementation Barriers
Data Quality
Limited training data for low-resource languages reduces the effectiveness of ASR models. Classroom noise and variability in children's speech also make high-quality recordings difficult to capture.
Proposed Solutions: Use self-supervised learning models to reduce reliance on large labeled datasets, and apply consensus-based labeling to improve data quality and validation.
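Consensus-based labeling can be sketched as a majority vote over the labels that several markers assign to one recording. The snippet below is only an illustration under stated assumptions: the function name, the label strings, and the two-thirds agreement threshold are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str],
                    min_agreement: float = 2 / 3) -> Optional[str]:
    """Return the majority label if enough markers agree, else None.

    The min_agreement threshold is an assumed value for illustration.
    A None result marks the recording as unresolved, for example so
    that it can be passed to an expert for review.
    """
    if not labels:
        return None
    top_label, top_count = Counter(labels).most_common(1)[0]
    if top_count / len(labels) >= min_agreement:
        return top_label
    return None
```

Returning `None` rather than forcing a decision keeps disputed recordings out of the training set until someone resolves them, which is one plausible way to combine marker consensus with expert validation.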
Project Team
Sergio Chevtchenko
Researcher
Nikhil Navas
Researcher
Rafaella Vale
Researcher
Franco Ubaudi
Researcher
Sipumelele Lucwaba
Researcher
Cally Ardington
Researcher
Soheil Afshar
Researcher
Mark Antoniou
Researcher
Saeed Afshar
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Sergio Chevtchenko, Nikhil Navas, Rafaella Vale, Franco Ubaudi, Sipumelele Lucwaba, Cally Ardington, Soheil Afshar, Mark Antoniou, Saeed Afshar
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI