
Augmenting Captions with Emotional Cues: An AR Interface for Real-Time Accessible Communication

Project Overview

This project examines how generative AI can make education more accessible, particularly for marginalized groups. Its central example is an augmented reality (AR) captioning framework designed to support Deaf and Hard of Hearing (DHH) learners in STEM fields. The system enriches live transcriptions with emotional cues, restoring context that is normally carried by non-verbal signals, with the aim of improving comprehension and reducing cognitive load. The findings suggest that emotionally enriched AR captions can improve accessibility, engagement, and learning outcomes for DHH learners.

Key Applications

AR captioning framework that integrates emotional and multimodal cues into live transcriptions.

Context: STEM classrooms for Deaf and Hard of Hearing (DHH) learners.

Implementation: Developed in Unity and deployed on an AR headset, the system captures vocal and visual inputs to generate enriched captions in real time.

Outcomes: Enriched captions significantly improved comprehension and reduced cognitive effort compared with standard captions.

Challenges: Participants reported issues with attention management and personalization, indicating the need for adaptive systems.
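
The enrichment step described above can be illustrated with a minimal sketch. It is not the project's implementation (which was built in Unity); the `CaptionSegment` type, the emotion labels, the confidence score, and the 0.6 threshold are all hypothetical, chosen only to show how an emotional cue might be attached to a transcribed line.

```python
from dataclasses import dataclass


@dataclass
class CaptionSegment:
    text: str          # transcribed speech for one caption line
    emotion: str       # hypothetical label, e.g. "excited" or "neutral"
    confidence: float  # hypothetical classifier confidence in [0, 1]


def enrich(segment: CaptionSegment, threshold: float = 0.6) -> str:
    """Prefix the caption with its emotional cue when the cue is
    non-neutral and the (assumed) classifier is confident enough."""
    if segment.emotion != "neutral" and segment.confidence >= threshold:
        return f"[{segment.emotion}] {segment.text}"
    # Fall back to a plain caption so uncertain cues do not mislead the reader.
    return segment.text
```

Dropping low-confidence cues is one possible design choice: a wrong emotional label could add to, rather than reduce, the reader's cognitive effort.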

Implementation Barriers

User Experience Barrier

Participants had difficulty dividing attention among multiple screen elements, especially when captions occluded the speaker’s face.

Proposed Solutions: The AR system aims to spatially embed captions to reduce visual fragmentation and improve context alignment.

Cognitive Load Barrier

Heavy visual embellishments in captions can be distracting, particularly for users with ADHD or sensory sensitivities.

Proposed Solutions: Flexible captioning systems that allow personalization and adjustable visual density are suggested to accommodate diverse user needs.
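
As a rough sketch of what adjustable visual density could look like, the example below filters caption embellishments by a user-selected level. The `Density` levels, the cue kinds ("label", "color", "animation"), and the mapping between them are assumptions made for illustration; the source does not specify how personalization would be implemented.

```python
from dataclasses import dataclass
from enum import IntEnum


class Density(IntEnum):
    TEXT_ONLY = 0  # plain captions, no embellishment
    LABELS = 1     # text plus short emotion labels
    FULL = 2       # labels plus color and animation


@dataclass
class Cue:
    kind: str   # hypothetical cue kind: "label", "color", or "animation"
    value: str


# Minimum density level at which each (assumed) cue kind is shown.
REQUIRED = {
    "label": Density.LABELS,
    "color": Density.FULL,
    "animation": Density.FULL,
}


def visible_cues(cues: list[Cue], density: Density) -> list[Cue]:
    """Keep only the cues permitted at the user's chosen density.
    Unknown cue kinds are treated as the heaviest embellishment."""
    return [c for c in cues if REQUIRED.get(c.kind, Density.FULL) <= density]
```

A user with sensory sensitivities could select `Density.TEXT_ONLY` or `Density.LABELS`, while others keep `Density.FULL`.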

Project Team

Sunday David Ubur

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Sunday David Ubur

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
