LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education
Project Overview
This document explores the use of generative AI in education through LLaVA-Docent, a multimodal large language model (MLLM) developed to enrich art appreciation education. The work addresses significant challenges in the field, including outdated teaching methods and the absence of personalized feedback for students. LLaVA-Docent offers tailored interactions, scaffolding, and real-time feedback to deepen student engagement with and understanding of art. The findings underscore the effectiveness of pairing constructivist pedagogy with advanced AI technologies, showing that such approaches can significantly improve accessibility and educational outcomes in art education, particularly for K-12 learners. The research suggests that generative AI enables educators to create more dynamic and responsive learning environments that cater to individual student needs.
Key Applications
LLaVA-Docent, a multimodal large language model for art appreciation
Context: K-12 art education, particularly for novice or inexperienced art viewers in classrooms and museums.
Implementation: The model was developed using design and development research methodology, incorporating iterative feedback and expert consultation to create a data design framework for art appreciation education.
Outcomes: Enhanced engagement and personalized learning experiences in art appreciation, enabling students to connect artworks to personal experiences and fostering critical thinking.
Challenges: Limited availability of effective AI tools for art appreciation, the need for age-appropriate content, and potential cognitive overload for students.
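The data design framework mentioned above produces instruction-tuning samples for the model. As a minimal sketch, the helper below builds one training record in the conversation format used by the public LLaVA codebase; the field names follow that convention, and the example question and answer are illustrative, not taken from the paper's dataset.

```python
import json

def make_art_appreciation_sample(sample_id, image_file, question, answer):
    """Build one instruction-tuning record in the LLaVA conversation format.

    The "<image>" placeholder marks where visual tokens are injected during
    training. Field names follow the public LLaVA codebase convention; the
    content passed in below is illustrative only.
    """
    return {
        "id": sample_id,
        "image": image_file,
        "conversations": [
            {"from": "human", "value": f"<image>\n{question}"},
            {"from": "gpt", "value": answer},
        ],
    }

# Hypothetical example of an art-appreciation Q&A pair.
sample = make_art_appreciation_sample(
    "art-0001",
    "starry_night.jpg",
    "What feelings does the swirling sky in this painting evoke for you?",
    "The turbulent brushstrokes can evoke restlessness and awe, inviting "
    "viewers to connect the night sky to their own emotions.",
)
print(json.dumps(sample, indent=2))
```

A corpus of such records, curated for age-appropriate scaffolding questions, is what an instruction-tuning pipeline for a model like LLaVA-Docent would consume.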
Implementation Barriers
Technological Limitations
Conventional large language models lack a visual modality and education-specific adaptations, limiting their effectiveness for visually grounded tasks such as art appreciation. Integrating AI tools into existing curricula and ensuring effective interaction in real-world educational settings remain open challenges.
Proposed Solutions: Utilizing multimodal large language models (MLLMs) such as LLaVA, which process images alongside text. Conducting further studies and pilot tests in K-12 environments to validate the models' effectiveness in practice.
Privacy Concerns
Closed, proprietary models often require sending learner data to third-party providers, raising privacy concerns.
Proposed Solutions: Adopting open-source models that allow customization and on-device deployment to maintain privacy.
Project Team
Unggi Lee
Researcher
Minji Jeon
Researcher
Yunseo Lee
Researcher
Gyuri Byun
Researcher
Yoorim Son
Researcher
Jaeyoon Shin
Researcher
Hongkyu Ko
Researcher
Hyeoncheol Kim
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Unggi Lee, Minji Jeon, Yunseo Lee, Gyuri Byun, Yoorim Son, Jaeyoon Shin, Hongkyu Ko, Hyeoncheol Kim
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI