Transcending Dimensions using Generative AI: Real-Time 3D Model Generation in Augmented Reality
Project Overview
The paper examines how generative AI can be combined with augmented reality (AR) to generate 3D models in real time, making 3D modeling accessible to users without technical skills. Using models such as Shap-E, the system turns 2D images into immersive 3D representations, with potential applications in gaming, education, and retail. The research identifies two challenges that reduce the quality of the conversion: isolating the target object and handling complex backgrounds. It also reports a marked improvement in user experience and engagement. The study concludes that detection algorithms and user interface design need further development before these AI-driven tools can be adopted more broadly, particularly in educational contexts. Overall, the findings suggest that generative AI, when effectively integrated with AR, can enhance interactive learning and give educators and students a valuable resource.
Key Applications
Real-time 3D model generation in augmented reality
Context: Applications in gaming, education, and AR-based e-commerce. Target audience includes designers, educators, and gamers.
Implementation: Combining generative AI with augmented reality systems to allow users to create, manipulate, and interact with 3D models using tools like Shap-E.
Outcomes: Increased accessibility to 3D modeling, fostering creativity and innovation among users without specialized skills. Users gave the system a System Usability Scale (SUS) score of 69.64, slightly above the commonly cited benchmark average of 68.
Challenges: Issues with accurately converting images with complex backgrounds and lighting conditions. Difficulty in isolating objects displayed on screens.
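The SUS score cited under Outcomes comes from a standard ten-item questionnaire answered on a 1 to 5 scale. The sketch below shows the conventional scoring rule (odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5). The function names and the sample answers are hypothetical and are not data from the study.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's 10-item SUS questionnaire (answers 1..5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - answer
    # The raw sum ranges from 0 to 40; scale it to 0..100.
    return total * 2.5


def mean_sus(all_responses: Sequence[Sequence[int]]) -> float:
    """Average the per-participant SUS scores of a group."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Hypothetical answers, for illustration only.
neutral = [3] * 10
best_case = [5, 1] * 5
print(sus_score(neutral), sus_score(best_case), mean_sus([neutral, best_case]))
```

A study-level figure such as 69.64 would be the mean of these per-participant scores across all respondents.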
Implementation Barriers
Technical Barrier
Current AI models struggle with complex backgrounds and multiple objects, leading to inaccurate 3D conversions.
Proposed Solutions: Enhancements in object detection algorithms and image processing techniques to better handle complex scenes.
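One common way to reduce the effect of a complex background is to isolate the object before the image reaches the image-to-3D model. The sketch below is a minimal illustration of that idea, not the authors' method. It assumes a foreground mask is already available from some detection or segmentation step, which is not shown. The names `bounding_box` and `isolate_object`, the white fill color, and the `pad` parameter are all hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the True cells, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def isolate_object(image, mask, fill=(255, 255, 255), pad=0):
    """Crop the image to the masked object and blank out background pixels.

    image: rows of (r, g, b) tuples; mask: rows of booleans, same shape.
    The mask is assumed to come from an external detector (hypothetical).
    """
    box = bounding_box(mask)
    if box is None:
        return []  # nothing detected, nothing to crop
    top, left, bottom, right = box
    # Grow the box by `pad` pixels, clamped to the image borders.
    top = max(0, top - pad)
    left = max(0, left - pad)
    bottom = min(len(image) - 1, bottom + pad)
    right = min(len(image[0]) - 1, right + pad)
    return [
        [image[r][c] if mask[r][c] else fill for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


# Tiny 3x4 example: the "object" occupies two pixels in the middle row.
image = [[(r, c, 0) for c in range(4)] for r in range(3)]
mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, False, False, False],
]
cropped = isolate_object(image, mask, pad=1)
print(len(cropped), len(cropped[0]))
```

Replacing the background with a flat color gives the downstream model a single, clearly separated subject, which is the kind of input the challenges above suggest it handles best.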
User Experience Barrier
Infrequent users may find certain aspects of the system complex, which lowers their usability scores.
Proposed Solutions: Simplifying user interfaces and providing additional support features.
Project Team
Majid Behravan
Researcher
Maryam Haghani
Researcher
Denis Gracanin
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Majid Behravan, Maryam Haghani, Denis Gracanin
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI