More Robots are Coming: Large Multimodal Models (ChatGPT) can Solve Visually Diverse Images of Parsons Problems
Project Overview
Generative AI, especially large multimodal models such as GPT-4V and Bard, is transforming the landscape of education, particularly in computing. These models can solve visual programming tasks such as Parsons problems: GPT-4V achieved a success rate of 96.7%, while Bard followed with 69.2%. This capability highlights both the potential of AI as a powerful educational tool and the challenge it poses to traditional assessment methods. As educators grapple with the implications of AI-assisted learning, they must rethink their approaches to academic integrity and student evaluation, and critically examine how assessments can adapt to accurately reflect student learning in an era where AI can both aid and complicate the educational process.
Key Applications
Large Multimodal Models (GPT-4V and Bard)
Context: Computing education, focusing on introductory programming courses.
Implementation: The models were tested on solving Parsons problems presented in various visual formats.
Outcomes: GPT-4V solved 96.7% of the problems, while Bard solved 69.2%. The study highlighted the capabilities of these models in analyzing and solving visual programming tasks.
Challenges: Bard struggled with visual extraction and had a refusal rate of 7.5%, while both models faced issues with hallucinations and generating incorrect code blocks.
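A Parsons problem gives the learner a working program's lines in scrambled order and asks them to arrange the fragments correctly. The paper's problems were presented as images in varied visual formats; the sketch below is our own hypothetical text-based illustration (the `sum_list` task and helper names are not from the study) of how such a problem can be posed and graded:

```python
# Hypothetical Parsons problem: scrambled fragments of a small program
# that the learner (or a model) must reorder into working code.
fragments = [
    "        total += n",
    "def sum_list(numbers):",
    "    return total",
    "    for n in numbers:",
    "    total = 0",
]

# One correct arrangement, expressed as indices into `fragments`.
solution_order = [1, 4, 3, 0, 2]

def assemble(order):
    """Join the fragments in the chosen order into a candidate program."""
    return "\n".join(fragments[i] for i in order)

def is_correct(order):
    """Grade an arrangement by executing it and testing its behavior."""
    namespace = {}
    try:
        exec(assemble(order), namespace)
        return namespace["sum_list"]([1, 2, 3]) == 6
    except Exception:
        # Wrong orderings typically fail to parse (indentation errors)
        # or produce a function with incorrect behavior.
        return False
```

Grading by execution rather than by exact line order also accepts alternative valid arrangements, which mirrors how Parsons problems are usually scored.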
Implementation Barriers
Technical and Ethical Barriers
The emergence of multimodal models that can solve visual programming problems challenges traditional assessment methods. There are also concerns over academic integrity, as students may misuse AI tools to complete assignments.
Proposed Solutions: Adoption of adversarial methods in assessments, rethinking evaluation strategies in light of AI capabilities, and implementing more complex visual problems to mitigate the risk of cheating while exploring alternative assessment methods.
Project Team
Irene Hou
Researcher
Owen Man
Researcher
Sophie Mettille
Researcher
Sebastian Gutierrez
Researcher
Kenneth Angelikas
Researcher
Stephen MacNeil
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Irene Hou, Owen Man, Sophie Mettille, Sebastian Gutierrez, Kenneth Angelikas, Stephen MacNeil
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI