
Learning Multimodal Cues of Children's Uncertainty

Project Overview

This project explores the application of generative AI in education through a study that uses multimodal AI systems to detect uncertainty in young children. Emphasizing non-verbal cues such as facial expressions, gestures, and auditory signals, the research aims to improve human-AI interaction in educational settings. To this end, the team developed a specially annotated dataset for training machine learning models that predict children's uncertainty from these multimodal inputs. The findings indicate that recognizing and understanding uncertainty in children can inform better educational strategies and the design of AI systems that better support learning. Overall, the study underscores the potential of generative AI to improve educational outcomes by fostering a deeper understanding of learners' emotional and cognitive states.

Key Applications

Multimodal machine learning model for predicting uncertainty

Context: Educational settings involving young children aged 4–5 years, specifically during a counting game designed to assess their understanding of numerical quantities.

Implementation: The model was trained on an annotated dataset that includes video recordings of children, capturing various multimodal cues of uncertainty during task performance.

Outcomes: The model demonstrated improved prediction of children's uncertainty based on various cues, which could inform better educational interventions and human-AI collaboration.

Challenges: Accurately detecting and interpreting subtle cues of uncertainty is difficult, and children's expressions of uncertainty vary with age and individual differences.
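The paper's actual architecture is not described here, so as a purely illustrative sketch, one simple way to combine per-modality signals is late fusion: score each cue channel separately, then merge the scores into a single uncertainty estimate. The function name, modality names, and weights below are all assumptions for illustration, not the authors' method.

```python
# Hypothetical late-fusion sketch: each modality (facial expression, gesture,
# speech) yields an uncertainty score in [0, 1]; we merge them with a
# weighted average. The weights are placeholders, not values from the paper.
def fuse_uncertainty(modality_scores, weights=None):
    """Combine per-modality uncertainty scores into one estimate in [0, 1]."""
    if weights is None:
        # Default: treat every available modality equally.
        weights = {m: 1.0 for m in modality_scores}
    total = sum(weights[m] for m in modality_scores)
    return sum(modality_scores[m] * weights[m] for m in modality_scores) / total

# Example: hypothetical per-cue scores for one child during the counting game.
cues = {"facial": 0.8, "gesture": 0.6, "speech": 0.7}
score = fuse_uncertainty(cues)
is_uncertain = score >= 0.5  # threshold is an arbitrary illustration
```

Late fusion keeps each cue channel independent, which makes it easy to drop a modality (e.g. when a child's face is off-camera) without retraining the other channels.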

Implementation Barriers

Technical barrier

High-dimensional data complexity makes it challenging to train models effectively.

Proposed Solutions: Use of ensemble learning and contrastive learning to better manage and interpret the multimodal input.
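To illustrate the contrastive-learning idea in general terms (the paper's specific formulation is not given here), a common objective is an InfoNCE-style loss: pull an anchor embedding toward a matching "positive" embedding and push it away from "negatives". The minimal pure-Python version below is a generic sketch, not the authors' implementation; the temperature value is an arbitrary assumption.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    """Generic InfoNCE-style contrastive loss (sketch, not the paper's loss).

    Low when the anchor is most similar to the positive; high when a
    negative is more similar. tau is a temperature hyperparameter.
    """
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / tau) for s in sims]
    return -math.log(exps[0] / sum(exps))

# Toy 2-D embeddings: an aligned positive gives a much lower loss than a
# mismatched one.
good = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In a multimodal setting, the anchor and positive might be embeddings of the same moment from two different modalities (e.g. video and audio), encouraging the model to align the channels; an ensemble would then combine the predictions of several such encoders.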

Ethical barrier

Concerns regarding privacy and consent for using children's video data in research.

Proposed Solutions: Only anonymized data is shared publicly, and all necessary consent forms are obtained from parents before participation.
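As one small illustration of the kind of step anonymization pipelines often include (the project's actual procedure is not described here), participant identifiers can be replaced with salted one-way hashes so that records remain linkable within the dataset without exposing the original IDs. The function and ID format below are hypothetical.

```python
import hashlib

def pseudonymize(participant_id, salt):
    """Replace a participant ID with a salted SHA-256 pseudonym (sketch).

    The salt is kept secret by the research team; without it, the original
    ID cannot be recovered or matched by brute force over known IDs.
    """
    digest = hashlib.sha256((salt + participant_id).encode("utf-8")).hexdigest()
    return "p_" + digest[:12]  # shortened for readability; length is arbitrary

# Hypothetical usage: the same ID always maps to the same pseudonym,
# so per-child records stay linkable after anonymization.
alias = pseudonymize("child_017", salt="project-secret")
```

Note that for video data this is only one layer: faces and voices would also need blurring or exclusion, and hashing IDs alone does not make footage anonymous.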

Project Team

Qi Cheng, Researcher
Mert İnan, Researcher
Rahma Mbarki, Researcher
Grace Grmek, Researcher
Theresa Choi, Researcher
Yiming Sun, Researcher
Kimele Persaud, Researcher
Jenny Wang, Researcher
Malihe Alikhani, Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Qi Cheng, Mert İnan, Rahma Mbarki, Grace Grmek, Theresa Choi, Yiming Sun, Kimele Persaud, Jenny Wang, Malihe Alikhani

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
