Augmenting Captions with Emotional Cues: An AR Interface for Real-Time Accessible Communication
Project Overview
This project presents an augmented reality (AR) captioning framework designed to support Deaf and Hard of Hearing (DHH) learners in STEM classrooms. Built on generative AI, the system enriches live transcriptions with emotional cues drawn from non-verbal signals, supplying context that plain captions omit. The findings suggest that these enriched captions improve comprehension of complex material, reduce cognitive load, and foster greater engagement and inclusivity, demonstrating how accessible, multimodal interfaces can improve learning outcomes for diverse learner needs.
Key Applications
AR captioning framework that integrates emotional and multimodal cues into live transcriptions.
Context: STEM classrooms for Deaf and Hard of Hearing (DHH) learners.
Implementation: Developed in Unity and deployed on an AR headset, the system captures vocal and visual inputs to generate enriched captions in real time.
Outcomes: Significantly enhances comprehension and reduces cognitive effort compared to standard captions.
Challenges: Participants reported issues with attention management and personalization, indicating the need for adaptive systems.
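The paper does not publish its implementation, and the system itself is built in Unity (C#). As a hedged illustration only, the Python sketch below shows one way a transcript segment and an inferred emotion label might be combined into an enriched caption; the names, the tag format, and the confidence threshold are assumptions, not details from the paper.

```python
from dataclasses import dataclass

@dataclass
class CaptionSegment:
    text: str          # transcribed speech from the vocal input
    emotion: str       # emotion label inferred from vocal/visual cues
    confidence: float  # classifier confidence in [0, 1]

def enrich_caption(seg: CaptionSegment, threshold: float = 0.6) -> str:
    """Append an emotional cue tag to the transcript only when the
    classifier is confident enough; otherwise fall back to plain text,
    so uncertain cues never clutter the caption."""
    if seg.confidence >= threshold:
        return f"{seg.text} [{seg.emotion}]"
    return seg.text

print(enrich_caption(CaptionSegment("Let's review the proof.", "excited", 0.82)))
# -> Let's review the proof. [excited]
```

Gating the cue on classifier confidence is one plausible way to keep low-quality emotion predictions from adding noise to the caption stream.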
Implementation Barriers
User Experience Barrier
Participants faced difficulty maintaining attention between multiple screen elements, especially when captions occluded the speaker’s face.
Proposed Solutions: The AR system spatially embeds captions near the speaker to reduce visual fragmentation and keep captions aligned with their context.
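The spatial-embedding idea can be sketched as a placement rule: anchor the caption just below the speaker's detected face so it stays in context without occluding it. This is an illustrative sketch, not the paper's method; the normalized-coordinate convention and the margin value are assumptions.

```python
def caption_anchor(face_box, margin=0.05):
    """Given a face bounding box (x, y, w, h) in normalized screen
    coordinates (origin top-left), return a caption anchor point that is
    horizontally centered on the speaker and placed just below the chin,
    clamped to the bottom of the screen."""
    x, y, w, h = face_box
    cx = x + w / 2                  # center on the speaker
    cy = min(y + h + margin, 1.0)   # below the face, never off-screen
    return cx, cy
```

Tying the caption to the face box keeps the learner's gaze near the speaker, which is one way to address the attention-splitting problem participants reported.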
Cognitive Load Barrier
Heavy visual embellishments in captions can be distracting, particularly for users with ADHD or sensory sensitivities.
Proposed Solutions: Flexible captioning systems that allow personalization and adjustable visual density are suggested to accommodate diverse user needs.
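A flexible captioning system of the kind proposed above could expose per-user display preferences. The sketch below is hypothetical (the preference fields and tag format are assumptions, not from the paper): it models a small settings object and strips emotional embellishments when the user selects a low visual density.

```python
from dataclasses import dataclass

@dataclass
class CaptionPrefs:
    """Hypothetical per-user caption settings; field names are
    illustrative, not taken from the paper."""
    show_emotion_tags: bool = True
    visual_density: str = "medium"   # "low" | "medium" | "high"
    font_scale: float = 1.0

def apply_density(caption: str, prefs: CaptionPrefs) -> str:
    """Drop trailing '[emotion]' embellishments at low density or when
    tags are disabled, e.g. for users with ADHD or sensory
    sensitivities; otherwise pass the enriched caption through."""
    if prefs.visual_density == "low" or not prefs.show_emotion_tags:
        return caption.split(" [")[0]
    return caption
```

For example, `apply_density("Great question! [enthusiastic]", CaptionPrefs(visual_density="low"))` would return the plain text without the cue tag.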
Project Team
Sunday David Ubur
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Sunday David Ubur
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI