
FT-Boosted SV: Towards Noise Robust Speaker Verification for English Speaking Classroom Environments

Project Overview

This project examines the use of generative AI in education through speaker verification (SV) systems built for classroom environments, where children's speech poses a particular challenge. Classrooms are noisy, and children's voices differ acoustically from adults' voices, so AI tools must be adapted to these conditions. By fine-tuning existing SV models on augmented datasets, the study reports substantial reductions in error rates under noisy classroom conditions. The findings suggest that properly adapted models can process children's speech more reliably, making AI tools more effective in real educational settings and supporting better communication and learning outcomes in classrooms.

Key Applications

Speaker verification systems using x-vector and ECAPA-TDNN models

Context: Educational settings, particularly in classrooms with children and teachers

Implementation: Fine-tuned pretrained models using augmented datasets of children's speech to improve robustness against classroom noise.

Outcomes: Significantly reduced the Equal Error Rate (EER) both on children's speech and under classroom conditions, improving the accuracy of speaker verification.

Challenges: Classroom noise (babble) and the distinct acoustic properties of children's speech compared to adults.
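
The Equal Error Rate named under Outcomes is the operating point at which the false acceptance rate (impostor trials accepted) equals the false rejection rate (genuine trials rejected). The following is a minimal sketch of how it can be estimated from two lists of verification scores. It is an illustration only, not the authors' evaluation code; the function name `equal_error_rate` and the example scores are assumptions made here.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER and its threshold from similarity scores.

    target_scores: scores of genuine (same-speaker) trials.
    nontarget_scores: scores of impostor (different-speaker) trials.
    Higher scores are assumed to mean "more likely the same speaker".
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one non-target score")

    best_gap = best_eer = best_thr = None
    # Every observed score is tried as a candidate decision threshold.
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scored at or above the threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # On a finite trial list the two rates rarely cross exactly,
            # so report their average at the closest point.
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


# Hypothetical scores, for illustration only.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.2, 0.1, 0.4])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

A lower EER means fewer errors of both kinds at the balanced threshold, which is why it is a common single-number summary for speaker verification.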

Implementation Barriers

Technical barrier

Limited availability of children's speech datasets for training speaker verification systems, compounded by classroom noise (babble noise) that complicates the speaker verification process.

Proposed Solutions: Utilizing data augmentation techniques, combining multiple children's datasets to create a more robust training set, and incorporating background noise and reverberation effects during the training phase to improve model robustness.
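
One way to picture the noise augmentation described above is to mix a background noise clip into a clean utterance at a chosen signal-to-noise ratio (SNR). The sketch below covers additive noise only, on plain lists of samples; reverberation, which would typically be simulated by convolving with a room impulse response, is not shown. The function name `mix_at_snr` and the synthetic signals are assumptions for illustration and do not reproduce the authors' pipeline.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR (dB)."""
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Loop the noise clip if it is shorter than the utterance, then trim.
    if len(noise) < len(speech):
        noise = noise * (len(speech) // len(noise) + 1)
    noise = noise[:len(speech)]

    speech_power = sum(x * x for x in speech) / len(speech)
    noise_power = sum(x * x for x in noise) / len(noise)
    if noise_power == 0:
        return list(speech)  # silent noise clip: nothing to add

    # Solve speech_power / (gain**2 * noise_power) = 10 ** (snr_db / 10) for gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins for a clean utterance and a babble noise clip.
clean = [math.sin(0.1 * i) for i in range(1000)]
babble = [math.cos(0.37 * i) for i in range(300)]
noisy = mix_at_snr(clean, babble, snr_db=10.0)
```

During training, each clean utterance could be paired with a randomly chosen noise clip and SNR so that the model sees many noise conditions; the particular SNR range would be a design choice not specified here.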

Project Team

Saba Tabatabaee

Researcher

Jing Liu

Researcher

Carol Espy-Wilson

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Saba Tabatabaee, Jing Liu, Carol Espy-Wilson

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
