uTalk: Bridging the Gap Between Humans and AI

Project Overview

The paper explores the application of generative AI in education through uTalk, a system that combines Large Language Models (LLMs) such as ChatGPT with visual generative models to improve Human-Computer Interaction (HCI). Users converse with a digital avatar, which supports content creation and increases engagement in educational settings. uTalk uses Whisper for speech recognition and SadTalker to produce realistic video avatars, and it targets educational platforms, chatbots, and personal assistants. The key findings point to improved run-time efficiency alongside a more interactive and personalized learning environment.

Key Applications

uTalk - an interactive virtual avatar system for conversation and content generation.

Context: Educational platforms, chatbots, and personal assistants.

Implementation: The system integrates Whisper API for speech recognition, ChatGPT for generating responses, and SadTalker for producing talking head videos. It is hosted on Streamlit, allowing user interaction through audio or text inputs.

Outcomes: Enhanced user experience with realistic digital twins, a 27.69% improvement in run-time performance, and the ability to generate videos for educational content.

Challenges: Technical bottlenecks in video generation and optimization of run-time.
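
The Implementation entry above describes a chain of stages: speech recognition, response generation, then talking-head video generation, with audio or text accepted as input. A minimal sketch of that flow follows. It is an illustration under assumptions, not the authors' code: the class `AvatarPipeline`, its `respond` method, and the three stub functions are hypothetical names, and the stubs stand in for the real Whisper, ChatGPT, and SadTalker calls, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AvatarPipeline:
    """Hypothetical wrapper chaining the three stages named in the text."""
    transcribe: Callable[[bytes], str]    # stands in for Whisper
    generate_reply: Callable[[str], str]  # stands in for ChatGPT
    render_video: Callable[[str], str]    # stands in for SadTalker

    def respond(self, text: Optional[str] = None,
                audio: Optional[bytes] = None) -> dict:
        # Text input skips speech recognition; audio input is transcribed first.
        if text is None:
            if audio is None:
                raise ValueError("provide either text or audio")
            text = self.transcribe(audio)
        reply = self.generate_reply(text)
        video_path = self.render_video(reply)
        return {"prompt": text, "reply": reply, "video": video_path}


# Stub stages (assumptions) so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_reply(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_render(reply: str) -> str:
    return "avatar.mp4"


pipeline = AvatarPipeline(fake_transcribe, fake_reply, fake_render)
result = pipeline.respond(audio=b"hello")
print(result["reply"], result["video"])
```

Passing the stages in as callables keeps each one replaceable, so a slow stage can be profiled or swapped without touching the others.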

Implementation Barriers

Technical barrier

Bottlenecks in the performance of the SadTalker system, which affect video generation speed.

Proposed Solutions: Optimization of code, removal of redundant processes, and parallelization of operations within the framework.
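
To illustrate the parallelization idea in general terms, the sketch below runs independent units of work concurrently and checks that the result matches the sequential version. It assumes the work can be split into independent chunks; `process_chunk`, `run_sequential`, and `run_parallel` are hypothetical names, and nothing here reflects which SadTalker operations the authors actually parallelized.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def process_chunk(chunk_id: int) -> str:
    # Placeholder for one independent unit of work (an assumption).
    return f"chunk-{chunk_id}-done"


def run_sequential(chunk_ids: List[int]) -> List[str]:
    # Baseline: process the chunks one after another.
    return [process_chunk(i) for i in chunk_ids]


def run_parallel(chunk_ids: List[int], max_workers: int = 4) -> List[str]:
    # pool.map returns results in input order, so the output lines up
    # with the sequential version even if chunks finish out of order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_chunk, chunk_ids))


chunk_ids = list(range(8))
sequential_results = run_sequential(chunk_ids)
parallel_results = run_parallel(chunk_ids)
print(parallel_results == sequential_results)
```

Threads help mainly when the work is I/O-bound or releases the Python GIL; for CPU-bound pure-Python work, `ProcessPoolExecutor` from the same module is the usual alternative.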

Project Team

Hussam Azzuni

Researcher

Sharim Jamal

Researcher

Abdulmotaleb Elsaddik

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Hussam Azzuni, Sharim Jamal, Abdulmotaleb Elsaddik

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
