
Generative AI Framework for 3D Object Generation in Augmented Reality

Project Overview

This project explores generative AI in education, focusing on augmented reality (AR) systems that create interactive 3D models to enhance learning. Using Vision Language Models (VLMs) and Large Language Models (LLMs), the framework converts inputs such as images and speech into 3D objects in real time, making complex subjects more accessible and engaging for learners. Reported achievements include generation times reduced to under 50 seconds, more accurate object detection and model generation, and effective handling of multilingual inputs. The findings point to gains in user engagement and system usability, while also highlighting the need to address remaining usability challenges and to ensure context-awareness in AR environments. The work also stresses continuous user feedback as a way to improve these educational tools, with the broader aim of democratizing 3D model creation and improving outcomes in education as well as in gaming, retail, and interior design.

Key Applications

AR Systems for Interactive 3D Model Generation and Visualization

Context: Applicable in educational settings for high school students and aspiring entrepreneurs, particularly in fields like Controlled Environmental Agriculture, as well as in gaming, retail, and interior design. These systems allow students to explore 3D representations of historical artifacts and scientific phenomena interactively.

Implementation: Integrates generative AI with augmented reality to create and visualize interactive 3D models in real time. Large language models (LLMs) and vision-language models (VLMs) process speech and image inputs, which supports personalized and engaging learning experiences.
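As a rough illustration of how speech and image inputs might be routed to different models before 3D generation, the sketch below dispatches on input type and returns a text prompt. Everything named here (`SpeechInput`, `ImageInput`, `build_prompt`, and the stub functions) is hypothetical and stands in for real model calls; the source does not describe the actual pipeline at this level of detail.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class SpeechInput:
    transcript: str  # assumption: speech has already been transcribed to text


@dataclass
class ImageInput:
    path: str  # assumption: the image is referenced by a file path


def build_prompt(
    item: Union[SpeechInput, ImageInput],
    describe_speech: Callable[[str], str],
    describe_image: Callable[[str], str],
) -> str:
    """Turn a speech or image input into a text prompt for a 3D generator."""
    if isinstance(item, SpeechInput):
        # A language model would interpret the spoken request here.
        return describe_speech(item.transcript)
    if isinstance(item, ImageInput):
        # A vision-language model would detect and describe the object here.
        return describe_image(item.path)
    raise TypeError(f"unsupported input type: {type(item).__name__}")


# Hypothetical stand-ins for the LLM and VLM calls.
def llm_stub(transcript: str) -> str:
    return transcript.strip().lower()


def vlm_stub(path: str) -> str:
    return f"object in {path}"
```

Keeping the model calls behind plain callables means either model can be swapped without touching the routing logic.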

Outcomes: Enhances user engagement and understanding of complex concepts through visualization, improves comprehension and retention of educational material, increases accessibility to 3D modeling, and reduces generation time to under 50 seconds.

Challenges: Latency, handling of multilingual inputs, inconsistent object detection, usability issues for infrequent users, and the need for tailored onboarding experiences.

Implementation Barriers

Technical Barrier

Generating high-quality 3D models in real time is GPU-intensive, and the resulting latency and computational demands hinder the deployment of AR systems.

Proposed Solutions: Utilizing optimized models and pre-generated object repositories to manage GPU load, reduce real-time generation demands, and improve responsiveness.
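One way to picture a pre-generated object repository is as a lookup that serves stored models first and falls back to on-demand generation only on a miss. The sketch below is a minimal illustration under that assumption; `ObjectRepository`, `normalize`, and `fake_generate` are hypothetical names, and the file paths are placeholders rather than anything from the source.

```python
from typing import Callable, Dict


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so 'Red  Chair' matches 'red chair'."""
    return " ".join(label.lower().split())


class ObjectRepository:
    """Serve pre-generated models when available; generate only on a miss."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate          # expensive, GPU-bound call (stubbed here)
        self._store: Dict[str, str] = {}   # normalized label -> model file path
        self.generation_calls = 0          # counts how often the fallback ran

    def preload(self, label: str, model_path: str) -> None:
        """Register a model that was generated ahead of time."""
        self._store[normalize(label)] = model_path

    def get(self, label: str) -> str:
        key = normalize(label)
        if key not in self._store:
            # Miss: pay the generation cost once, then keep the result.
            self.generation_calls += 1
            self._store[key] = self._generate(key)
        return self._store[key]


def fake_generate(label: str) -> str:
    """Hypothetical stand-in for a real 3D generation call."""
    return f"generated/{label.replace(' ', '_')}.glb"
```

Because a given label always maps to the same stored file, repeated requests also return the same model, which is relevant to the consistency concern raised under the usability barrier.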

Usability Barrier

Stochastic elements in generation produce inconsistent model outputs, which undermines user trust. Users may also find the system complex and need technical support to use it effectively.

Proposed Solutions: Providing a repository of pre-generated objects to ensure reliable output options, improving onboarding processes, and simplifying user interfaces.

Multilingual Barrier

Current generative AI solutions primarily support English, limiting accessibility for non-English speakers.

Proposed Solutions: Incorporating AI-driven translation layers for speech-to-text and text-to-speech to broaden usability.
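A translation layer of this kind can be sketched as a wrapper around an English-only handler: translate the request into English, run the handler, then translate the reply back. The code below assumes text has already been produced by speech-to-text; `make_multilingual`, `toy_translate`, and `toy_handler` are hypothetical, and the lookup table stands in for a real translation service.

```python
from typing import Callable, Dict, Tuple

# (text, source_lang, target_lang) -> translated text
Translator = Callable[[str, str, str], str]


def make_multilingual(
    handle_english: Callable[[str], str],
    translate: Translator,
) -> Callable[[str, str], str]:
    """Wrap an English-only handler so it accepts text in other languages."""

    def handle(text: str, lang: str) -> str:
        if lang == "en":
            return handle_english(text)
        english = translate(text, lang, "en")   # user language -> English
        reply = handle_english(english)         # existing English pipeline
        return translate(reply, "en", lang)     # English -> user language

    return handle


# Toy stand-ins: a tiny lookup table instead of a real translation model.
_TOY_TABLE: Dict[Tuple[str, str, str], str] = {
    ("una silla", "es", "en"): "a chair",
    ("Generating: a chair", "en", "es"): "Generando: una silla",
}


def toy_translate(text: str, source: str, target: str) -> str:
    # Unknown phrases pass through unchanged in this toy version.
    return _TOY_TABLE.get((text, source, target), text)


def toy_handler(prompt: str) -> str:
    return f"Generating: {prompt}"
```

The wrapper leaves the English pipeline untouched, so adding a language only requires translation support for it.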

Performance Barrier

Latency issues with large 3D model outputs can disrupt user experiences in AR applications.

Proposed Solutions: Optimizing model output sizes to reduce loading times while maintaining visual fidelity.
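One possible reading of "optimizing model output sizes" is to keep several variants of each model and serve the most detailed one that fits a size budget. The sketch below illustrates that selection step only; `MeshVariant`, `pick_variant`, and the example sizes are assumptions for illustration, not figures from the source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MeshVariant:
    name: str
    triangle_count: int  # proxy for visual fidelity
    size_bytes: int      # proxy for loading time


def pick_variant(
    variants: Sequence[MeshVariant], max_bytes: int
) -> Optional[MeshVariant]:
    """Return the most detailed variant within the size budget.

    If nothing fits, fall back to the smallest variant so the user still
    sees something; return None only when there are no variants at all.
    """
    if not variants:
        return None
    fitting = [v for v in variants if v.size_bytes <= max_bytes]
    if not fitting:
        return min(variants, key=lambda v: v.size_bytes)
    return max(fitting, key=lambda v: v.triangle_count)


MB = 1024 * 1024
EXAMPLE_VARIANTS = [
    MeshVariant("high", 200_000, 40 * MB),
    MeshVariant("medium", 50_000, 10 * MB),
    MeshVariant("low", 10_000, 2 * MB),
]
```

The budget could be chosen per device or network condition, trading fidelity for loading time only when needed.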

Project Team

Majid Behravan

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Majid Behravan

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
