Gemini Pro Defeated by GPT-4V: Evidence from Education
Project Overview
This paper evaluates two generative AI models, GPT-4V and Gemini Pro, on scoring student-drawn models in science education using visual question answering (VQA) techniques. GPT-4V surpasses Gemini Pro in both scoring accuracy and image processing, which suggests it is better suited to multimodal educational tasks. The paper also examines the difficulties Gemini Pro encountered and points to the need for further development before such tools can be used effectively in educational settings. Overall, the findings indicate that generative AI could make assessments of student work more accurate and support student learning.
Key Applications
Visual Question Answering (VQA) for automatic scoring
Context: Science education, targeting teachers and researchers who assess student-drawn models.
Implementation: Student-drawn models were scored automatically with visual question answering techniques, using the NERIF (Notation-Enhanced Rubric Instruction for Few-shot Learning) prompting method.
Outcomes: GPT-4V scored student-drawn models more accurately than Gemini Pro, with a reported accuracy of 51% on the science assessment tasks.
Challenges: Gemini Pro struggled with fine-grained text recognition and often misclassified inputs as scientific posters.
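The scoring accuracy cited under Outcomes can be read as the share of student drawings for which the model's score matches the human-assigned score. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function name `scoring_accuracy`, the score labels, and the sample data are all hypothetical and chosen only for illustration.

```python
def scoring_accuracy(human_scores, model_scores):
    """Return the fraction of items where the model score equals the human score."""
    if len(human_scores) != len(model_scores):
        raise ValueError("score lists must have the same length")
    if not human_scores:
        raise ValueError("score lists must not be empty")
    matches = sum(h == m for h, m in zip(human_scores, model_scores))
    return matches / len(human_scores)


# Hypothetical scores for illustration only; these are not data from the paper.
human = ["Proficient", "Developing", "Beginning", "Developing"]
model = ["Proficient", "Beginning", "Beginning", "Proficient"]

accuracy = scoring_accuracy(human, model)
print(f"accuracy = {accuracy:.2f}")  # 2 of 4 scores match, so this prints 0.50
```

Exact-match accuracy treats every disagreement the same way, so a model that is off by one level counts as wrong just as one that is off by two.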
Implementation Barriers
Technical Barrier
Education scholars have limited access to the technologies and resources needed to use VQA techniques effectively.
Proposed Solutions: Make VQA techniques accessible through user-friendly interfaces similar to those of LLMs.
Project Team
Gyeong-Geon Lee
Researcher
Ehsan Latif
Researcher
Lehong Shi
Researcher
Xiaoming Zhai
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Gyeong-Geon Lee, Ehsan Latif, Lehong Shi, Xiaoming Zhai
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI