Transcending Dimensions using Generative AI: Real-Time 3D Model Generation in Augmented Reality

Project Overview

The paper examines how generative AI can be combined with augmented reality (AR) to generate 3D models in real time, making 3D modeling accessible to users without technical skills. Using models such as Shap-E, the system turns 2D images into 3D representations, with potential applications in gaming, education, and retail. The research identifies two main challenges, object isolation and complex backgrounds, both of which reduce the effectiveness of the conversion. It also reports a marked improvement in user experience and engagement. The study concludes that detection algorithms and user interface design need further development before these tools can be adopted more broadly, particularly in educational contexts. Overall, the findings suggest that generative AI, when effectively integrated with AR, can enhance interactive learning and give educators and students a useful resource.

Key Applications

Real-time 3D model generation in augmented reality

Context: Applications in gaming, education, and AR-based e-commerce. Target audience includes designers, educators, and gamers.

Implementation: Combining generative AI with augmented reality systems to allow users to create, manipulate, and interact with 3D models using tools like Shap-E.

Outcomes: Increased accessibility to 3D modeling, fostering creativity and innovation among users without specialized skills. Users gave the system a System Usability Scale (SUS) score of 69.64, slightly above the commonly cited benchmark average of 68.

Challenges: Issues with accurately converting images with complex backgrounds and lighting conditions. Difficulty in isolating objects displayed on screens.
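
For readers unfamiliar with the SUS figure reported under Outcomes, the sketch below shows how a SUS score is conventionally computed from a ten-item questionnaire. It assumes the standard SUS scoring rule; this summary does not describe how the authors processed their questionnaire data. The function names and the example ratings are illustrative and are not study data.

```python
from statistics import mean


def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten 1-5 ratings."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute rating - 1.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - rating.
            total += 5 - rating
    # The summed contributions (0-40) are scaled to a 0-100 range.
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-participant SUS scores for a group."""
    return mean(sus_score(r) for r in all_responses)


# Made-up ratings for two hypothetical participants (NOT data from the paper).
example = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
print(mean_sus(example))
```

A group-level figure such as 69.64 would be the mean of per-participant scores computed this way.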

Implementation Barriers

Technical Barrier

Current AI models struggle with complex backgrounds and multiple objects, leading to inaccurate 3D conversions.

Proposed Solutions: Enhancements in object detection algorithms and image processing techniques to better handle complex scenes.
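
One simple form such preprocessing could take is to isolate a single detected object before the image reaches the image-to-3D model. The sketch below is an illustration under stated assumptions, not the authors' method. It assumes some object detector (not specified in this summary) has already produced labeled bounding boxes. The names `Detection`, `pick_target`, and `padded_crop_box` are hypothetical helpers introduced here.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector output: label, confidence, pixel bounding box."""
    label: str
    confidence: float
    box: tuple  # (left, top, right, bottom) in pixels


def pick_target(detections, min_confidence=0.5):
    """Return the most confident detection above the threshold, or None."""
    candidates = [d for d in detections if d.confidence >= min_confidence]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d.confidence)


def padded_crop_box(box, image_size, margin=0.1):
    """Expand a box by a fractional margin and clamp it to the image bounds."""
    left, top, right, bottom = box
    width, height = image_size
    pad_x = (right - left) * margin
    pad_y = (bottom - top) * margin
    return (
        max(0, round(left - pad_x)),
        max(0, round(top - pad_y)),
        min(width, round(right + pad_x)),
        min(height, round(bottom + pad_y)),
    )


# Made-up detections for a 640x480 camera frame (illustrative only).
detections = [
    Detection("chair", 0.91, (100, 50, 300, 450)),
    Detection("lamp", 0.42, (320, 20, 380, 200)),
]
target = pick_target(detections)
if target is not None:
    crop = padded_crop_box(target.box, image_size=(640, 480))
    print(target.label, crop)
```

Cropping to one padded box removes most of the background and any competing objects, which are the two failure sources named above. It does not address lighting, and it relies on the detector finding the intended object in the first place.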

User Experience Barrier

Some users, particularly infrequent ones, may find parts of the system complex, which leads to lower usability scores in that group.

Proposed Solutions: Simplifying user interfaces and providing additional support features.

Project Team

Majid Behravan

Researcher

Maryam Haghani

Researcher

Denis Gracanin

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Majid Behravan, Maryam Haghani, Denis Gracanin

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
