
The Art of Storytelling: Multi-Agent Generative AI for Dynamic Multimodal Narratives

Project Overview

This project presents an educational tool that uses Generative Artificial Intelligence (GenAI) to enrich storytelling for children. It combines Large Language Models (LLMs), Text-to-Speech (TTS), Text-to-Video (TTV), and Text-to-Music (TTM) to produce multimodal narratives suited to different learning styles. Children take an active part in the storytelling process, which supports their creative expression and cognitive development. The findings indicate that this kind of dynamic storytelling engages young learners and fosters critical thinking and imagination, pointing to GenAI's potential to make learning more interactive and effective.

Key Applications

Multi-agent system for storytelling

Context: Educational tool for children aged 7 to 12, focusing on storytelling and narrative creation.

Implementation: Uses LLMs for story generation, TTS for narration, TTV for animation, and TTM for background music. Stories are developed from user prompts, and each task is handled by a dedicated agent.

Outcomes: Enhances reading comprehension, supports creative expression, and improves retention and engagement through multimodal stimuli.

Challenges: Visual distortions in animations, long video generation times, and the need for a more comprehensive dataset for assessing child-appropriateness.
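
The Implementation description above can be sketched as a small pipeline in which one agent writes the story and separate agents derive narration, animation, and music from it. This is a minimal illustration under stated assumptions, not the authors' implementation: every class and function name here is hypothetical, and each agent body is a placeholder standing in for a real LLM, TTS, TTV, or TTM model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class StoryPackage:
    """Holds the story text and the media assets generated from it."""
    prompt: str
    story: str = ""
    assets: Dict[str, str] = field(default_factory=dict)


def story_agent(prompt: str) -> str:
    # Placeholder for an LLM call that expands the user prompt into a story.
    return f"Once upon a time, {prompt}."


def narration_agent(story: str) -> str:
    # Placeholder for a TTS call; returns a description instead of audio.
    return f"narration covering {len(story.split())} words"


def animation_agent(story: str) -> str:
    # Placeholder for a TTV call; returns a description instead of video.
    return f"animation for: {story[:20]}"


def music_agent(story: str) -> str:
    # Placeholder for a TTM call; returns a description instead of music.
    return "background music"


# Hypothetical registry mapping each modality to the agent that produces it.
MEDIA_AGENTS: Dict[str, Callable[[str], str]] = {
    "narration": narration_agent,
    "animation": animation_agent,
    "music": music_agent,
}


def run_pipeline(
    prompt: str,
    media_agents: Dict[str, Callable[[str], str]] = MEDIA_AGENTS,
) -> StoryPackage:
    """Generate the story first, then pass it to every media agent."""
    package = StoryPackage(prompt=prompt, story=story_agent(prompt))
    for name, agent in media_agents.items():
        package.assets[name] = agent(package.story)
    return package


if __name__ == "__main__":
    result = run_pipeline("a fox learns to paint")
    print(result.story)
    for name, asset in result.assets.items():
        print(f"{name}: {asset}")
```

Keeping the agents behind a simple name-to-function registry means a slow or unreliable modality, such as video, could be swapped out or skipped without touching the story step.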

Implementation Barriers

Technical

Visual inconsistencies in generated animations detract from the storytelling experience.

Proposed Solutions: Develop more robust models for animation generation and optimize rendering times.

Content Appropriateness

Ensuring that generated content is suitable for children and free from inappropriate material.

Proposed Solutions: Implement rigorous content moderation using LLMs and human review to filter out inappropriate stories.
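
One way to combine automated and human moderation is a two-threshold gate: clearly safe stories pass, clearly unsafe ones are rejected, and everything in between is routed to a human reviewer. The sketch below is an assumption-laden illustration only. The function names, the thresholds, and the keyword-based score are all hypothetical; in a real system the score would come from an LLM-based classifier rather than a word list.

```python
import string

# Hypothetical word list, used only so the example runs without a model.
FLAGGED_WORDS = {"blood", "weapon", "kill"}


def safety_risk_score(story: str) -> float:
    """Stand-in for an LLM safety classifier; returns a risk in [0, 1]."""
    words = [w.strip(string.punctuation) for w in story.lower().split()]
    hits = sum(1 for w in words if w in FLAGGED_WORDS)
    return min(1.0, hits / 2)


def moderate(
    story: str,
    approve_below: float = 0.2,  # assumed threshold, not from the source
    reject_above: float = 0.7,   # assumed threshold, not from the source
) -> str:
    """Approve, reject, or escalate a story based on its risk score."""
    score = safety_risk_score(story)
    if score < approve_below:
        return "approved"
    if score > reject_above:
        return "rejected"
    # Borderline cases go to a person instead of being decided automatically.
    return "needs_human_review"


if __name__ == "__main__":
    for text in [
        "The fox painted a rainbow.",
        "The knight drew his weapon.",
    ]:
        print(f"{moderate(text)}: {text}")
```

Widening the gap between the two thresholds sends more stories to human review, trading reviewer time for fewer automated mistakes.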

Resource Limitations

Long generation times for animations can hinder the user experience.

Proposed Solutions: Optimize the rendering pipeline and consider using more efficient algorithms for video generation.

Project Team

Samee Arif

Researcher

Taimoor Arif

Researcher

Muhammad Saad Haroon

Researcher

Aamina Jamal Khan

Researcher

Agha Ali Raza

Researcher

Awais Athar

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Samee Arif, Taimoor Arif, Muhammad Saad Haroon, Aamina Jamal Khan, Agha Ali Raza, Awais Athar

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
