
An End-to-End Approach for Child Reading Assessment in the Xhosa Language

Project Overview

The document explores the use of generative AI in education through an end-to-end approach for assessing the reading ability of children who speak low-resource languages such as Xhosa. Literacy is critical to a child's future success, and AI can help automate and improve reading assessments. Central to the research is automatic speech recognition (ASR), applied here to detect the pronunciation difficulties of young readers. The study introduces a dataset of Xhosa child speech samples and evaluates advanced ASR models on it. The findings point to significant hurdles, particularly data scarcity and the distinct acoustic characteristics of children's voices. The technology therefore shows promise for education, but deploying it effectively in low-resource contexts remains difficult.

Key Applications

Automatic Speech Recognition (ASR) for reading assessment

Context: Assessment of children's reading skills in low-resource languages (Xhosa) for early grades in schools in South Africa.

Implementation: Developed a dataset of 14,971 recordings of children pronouncing Xhosa words, labeled by multiple markers and validated by an expert.

Outcomes: Improved assessment accuracy and tracking of children's reading progress; automation reduces human error and effort.

Challenges: Limited training data for low-resource languages; performance of ASR models can be affected by noise and pronunciation variability.
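As an illustration of how ASR output could feed a reading assessment, the sketch below compares the word a child was asked to read with the word an ASR model recognized, then reports the fraction read correctly. This is a minimal sketch under stated assumptions, not the authors' pipeline: the `WordAttempt` type, the `score_attempts` function, the exact-match rule, and the sample words are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordAttempt:
    target: str      # word the child was asked to read
    recognized: str  # hypothetical ASR transcript of the recording


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def score_attempts(attempts: List[WordAttempt]) -> float:
    """Return the fraction of attempts whose transcript matches the target.

    Assumption: exact match after normalization counts as a correct reading.
    A real system would likely need a more tolerant comparison.
    """
    if not attempts:
        return 0.0
    correct = sum(
        normalize(a.target) == normalize(a.recognized) for a in attempts
    )
    return correct / len(attempts)


# Illustrative data only; not taken from the study's dataset.
attempts = [
    WordAttempt("umama", "umama"),
    WordAttempt("inja", "inka"),
    WordAttempt("Utata", " utata "),
]
accuracy = score_attempts(attempts)
```

Tracking this score per child over time is one simple way the progress monitoring mentioned above could be approximated.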

Implementation Barriers

Data Quality

Limited training data for low-resource languages reduces the effectiveness of ASR models. Data collection is further hindered by classroom noise and variability in children's speech, both of which make high-quality recordings hard to capture.

Proposed Solutions: Use self-supervised learning models to reduce reliance on large labeled datasets, and apply consensus-based labeling to improve data quality and validation.
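Consensus-based labeling can be sketched as a majority vote across markers, with unresolved items deferred to an expert. The source says only that recordings were labeled by multiple markers and validated by an expert; the strict-majority rule, the `min_agreement` threshold, and the `consensus_label` function below are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional


def consensus_label(
    labels: List[str], min_agreement: float = 0.5
) -> Optional[str]:
    """Return the label chosen by more than `min_agreement` of the markers.

    Returns None when no label reaches that share, which signals that the
    recording should go to an expert (an assumed escalation policy).
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    if count / len(labels) > min_agreement:
        return label
    return None


# Three hypothetical markers judged one recording.
agreed = consensus_label(["correct", "correct", "incorrect"])
# Two markers disagree, so no strict majority exists.
disputed = consensus_label(["correct", "incorrect"])
```

Raising `min_agreement` trades more expert review for cleaner labels, which may matter when the labeled dataset is small.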

Project Team

Sergio Chevtchenko

Researcher

Nikhil Navas

Researcher

Rafaella Vale

Researcher

Franco Ubaudi

Researcher

Sipumelele Lucwaba

Researcher

Cally Ardington

Researcher

Soheil Afshar

Researcher

Mark Antoniou

Researcher

Saeed Afshar

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Sergio Chevtchenko, Nikhil Navas, Rafaella Vale, Franco Ubaudi, Sipumelele Lucwaba, Cally Ardington, Soheil Afshar, Mark Antoniou, Saeed Afshar

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
