Speaking images. A novel framework for the automated self-description of artworks
Project Overview
The document outlines an approach to using generative AI in education through the automated self-description of artworks. By combining large language models (LLMs) with computer vision, the framework creates 'speaking images' that articulate their own content, making digital collections in art and cultural heritage more accessible. The document also addresses the biases and challenges that generative AI poses in educational contexts. The findings suggest that the technology can enrich learning by offering personalized interactions with art and a more inclusive environment for diverse learners, and that it may promote engagement and critical thinking among students.
Key Applications
Automated self-description of artworks
Context: Art education, targeting students and art historians
Implementation: Utilizes a pipeline of machine learning models including face detection, LLMs for text generation, and text-to-speech for creating video explanations of artworks.
Outcomes: Improved understanding of artworks through animated explanations, increased engagement with digital collections.
Challenges: Cultural biases in LLMs, accuracy of descriptions, technical limitations in animation, and ethical concerns regarding content moderation.
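The pipeline described under Implementation can be sketched as a chain of four stages. This is a minimal illustration under stated assumptions, not the authors' code: every name here (build_speaking_image, SpeakingImage, the fake_* stand-ins) is hypothetical, each stage is passed in as a plain function so that a real face detector, LLM, text-to-speech engine, or animation model could be swapped in, and the choice to animate the largest detected face is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Bounding box as (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class SpeakingImage:
    face_box: Box      # face chosen to "speak"
    script: str        # text produced by the language model stage
    audio: bytes       # speech produced by the text-to-speech stage
    video_path: str    # location of the rendered explanation video


def build_speaking_image(
    image_path: str,
    title: str,
    detect_faces: Callable[[str], List[Box]],
    generate_script: Callable[[str], str],
    synthesize_speech: Callable[[str], bytes],
    animate_face: Callable[[str, Box, bytes], str],
) -> Optional[SpeakingImage]:
    """Run the four stages in order; return None when no face is found."""
    faces = detect_faces(image_path)
    if not faces:
        return None  # nothing to animate in this artwork
    # Assumption for this sketch: the largest face is the speaker.
    face = max(faces, key=lambda box: box[2] * box[3])
    script = generate_script(title)
    audio = synthesize_speech(script)
    video_path = animate_face(image_path, face, audio)
    return SpeakingImage(face, script, audio, video_path)


# Hypothetical stand-ins so the sketch runs without any model or network.
def fake_detect(image_path: str) -> List[Box]:
    return [(10, 10, 40, 50), (100, 20, 80, 90)]


def fake_script(title: str) -> str:
    return f"I am '{title}'. Let me describe what you see in me."


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_animate(image_path: str, face: Box, audio: bytes) -> str:
    return image_path.rsplit(".", 1)[0] + ".mp4"


if __name__ == "__main__":
    result = build_speaking_image(
        "portrait.jpg", "Portrait of a Lady",
        fake_detect, fake_script, fake_tts, fake_animate,
    )
    print(result)
```

Passing the stages in as functions keeps the control flow visible and makes the early exit explicit: an artwork with no detectable face produces no video, which is where the face detection limitations listed under Challenges would surface.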
Implementation Barriers
Cultural Bias
Generative AI models may encode cultural biases from their training data, which can skew both the accuracy and the interpretive framing of generated content.
Proposed Solutions: Enhancing training datasets to reflect diverse cultural perspectives and implementing guardrails to prevent biased outputs.
Technical Limitations
Challenges with face detection in artworks, especially those with non-standard orientations or styles.
Proposed Solutions: Improving face detection algorithms, combining multiple models for better accuracy, and refining input prompts.
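One way to read "multiple models" together with "non-standard orientations" is a fallback loop: try each detector on the image at several rotations and stop at the first hit. The sketch below is an assumption about how that could look, not the method from the paper. The names detect_with_fallback, Detection, and the toy_* helpers are hypothetical, and the image is modelled as a small dictionary so the example runs without an imaging library.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence, Tuple

# Bounding box as (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    detector: str      # name of the detector that succeeded
    angle: int         # rotation (degrees) applied before detection
    boxes: List[Box]   # boxes in the coordinates of the rotated image


def detect_with_fallback(
    image: Any,
    detectors: Sequence[Tuple[str, Callable[[Any], List[Box]]]],
    rotate: Callable[[Any, int], Any],
    angles: Sequence[int] = (0, 90, 180, 270),
) -> Optional[Detection]:
    """Try every detector at every angle; return the first non-empty result."""
    for angle in angles:
        candidate = rotate(image, angle)
        for name, detect in detectors:
            boxes = detect(candidate)
            if boxes:
                return Detection(name, angle, boxes)
    return None  # no detector found a face at any tested orientation


# Toy stand-ins (hypothetical): an "image" records how far its face is
# tilted and whether it is painted in a stylized manner.
def toy_rotate(image: dict, angle: int) -> dict:
    return {**image, "tilt": (image["tilt"] - angle) % 360}


def toy_frontal_detector(image: dict) -> List[Box]:
    # Only finds upright, naturalistic faces.
    upright = image["tilt"] == 0
    return [(0, 0, 32, 32)] if upright and not image["stylized"] else []


def toy_stylized_detector(image: dict) -> List[Box]:
    # Tolerates stylized faces but still needs them upright.
    return [(0, 0, 32, 32)] if image["tilt"] == 0 else []


if __name__ == "__main__":
    artwork = {"tilt": 90, "stylized": True}
    found = detect_with_fallback(
        artwork,
        [("frontal", toy_frontal_detector), ("stylized", toy_stylized_detector)],
        toy_rotate,
    )
    print(found)
```

Because the returned boxes are expressed in the rotated image, a real implementation would need to map them back to the original orientation before animating the face.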
Ethical Concerns
The use of AI in creative expression raises questions about originality and authorship.
Proposed Solutions: Establishing guidelines for ethical use of generative AI in art and education, involving art historians in the curation process.
Project Team
Valentine Bernasconi
Researcher
Gustavo Marfia
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Valentine Bernasconi, Gustavo Marfia
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI