Seeing the Forest and the Trees: Solving Visual Graph and Tree Based Data Structure Problems using Large Multimodal Models
Project Overview
The document explores the use of generative AI, specifically large multimodal models (LMMs), in computing education, emphasizing their ability to tackle complex visual problems involving graph and tree data structures. It identifies the promising potential of LMMs to enhance student learning while raising significant concerns about academic integrity and the effectiveness of current assessment practices. The findings show that LMMs handle tree-related tasks proficiently yet struggle to solve graph problems accurately, which calls for a reconsideration of conventional evaluation methods in educational settings. Overall, the document highlights both the transformative potential of generative AI in education and the challenges that must be addressed to ensure its effective integration.
Key Applications
Large Multimodal Models (LMMs) for solving graph and tree data structure problems.
Context: Computing education, specifically in courses covering data structures and algorithms.
Implementation: A benchmark dataset of 9,072 graph and tree data structure tasks was created to evaluate the performance of various LMMs, including GPT-4o and Gemini models.
Outcomes: LMMs demonstrated varying levels of success in solving visual problems, with GPT-4o achieving 87.6% accuracy on tree samples and Gemini 1.5 Flash achieving 56.2% accuracy on graph samples.
Challenges: Model performance varies with the structural complexity and aesthetic features of the visual input, and the models' capabilities raise concerns over academic dishonesty and the integrity of assessments.
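The per-category accuracies reported above can be computed from raw evaluation logs with a simple tally. This is a minimal sketch, not the paper's actual evaluation harness; the record fields (`category`, `correct`) are illustrative assumptions.

```python
from collections import defaultdict

def accuracy_by_category(results):
    """Compute per-category accuracy from a list of evaluation records.

    Each record is assumed to be a dict with a 'category' key
    (e.g. 'tree' or 'graph') and a boolean 'correct' flag --
    field names are illustrative, not the paper's schema.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in results:
        totals[record["category"]] += 1
        hits[record["category"]] += record["correct"]  # True counts as 1
    return {cat: hits[cat] / totals[cat] for cat in totals}

# Hypothetical example: three tree tasks and two graph tasks
sample = [
    {"category": "tree", "correct": True},
    {"category": "tree", "correct": True},
    {"category": "tree", "correct": False},
    {"category": "graph", "correct": True},
    {"category": "graph", "correct": False},
]
print(accuracy_by_category(sample))  # tree accuracy 2/3, graph accuracy 1/2
```

Aggregating this way per structure type (tree vs. graph) is what makes the gap between the two task families visible, rather than a single overall score.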
Implementation Barriers
Ethical
Concerns about academic integrity and the potential misuse of AI tools by students, as well as challenges in ensuring that assessments are not compromised by AI capabilities.
Proposed Solutions: Implementing human-proctored exams, using visual questions that challenge the capabilities of LLMs, and developing guidelines for the ethical use of AI in education.
Technical
Data leakage during model evaluation: models may have had prior exposure to published problems during training, which undermines the reliability of assessments.
Proposed Solutions: Creating unique benchmark datasets that do not exist online to ensure reliability in assessments, along with robust evaluation protocols to mitigate data leakage.
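One way to realize such unique, leakage-free benchmarks is to generate task instances procedurally, so each problem is novel and comes with a machine-checkable answer. The sketch below is an illustrative assumption (the function name, parameters, and task format are not from the paper): it produces a random binary-search-tree insertion task whose expected in-order traversal is simply the sorted values.

```python
import random

def make_bst_task(n=7, lo=1, hi=99, seed=None):
    """Generate a fresh BST-insertion task unlikely to exist online.

    Returns the insertion order plus the expected in-order traversal
    of the resulting BST (the values in sorted order), so a model's
    answer can be checked automatically. All parameters are
    illustrative defaults, not the paper's actual generator.
    """
    rng = random.Random(seed)
    values = rng.sample(range(lo, hi + 1), n)  # n distinct values
    return {"insert_order": values, "inorder": sorted(values)}

task = make_bst_task(seed=42)
print(task["insert_order"])
print(task["inorder"])
```

Because the ground truth is derived from the generated instance itself, the same protocol scales to thousands of unseen tasks, which is the kind of robust evaluation the proposed solution calls for.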
Pedagogical
The inadequacy of traditional assessment methods in evaluating student understanding in the context of advanced AI capabilities, necessitating a shift in assessment practices.
Proposed Solutions: Adopting innovative assessment models like ungrading or project-based learning to promote deeper engagement and better reflect student understanding.
Project Team
Sebastian Gutierrez
Researcher
Irene Hou
Researcher
Jihye Lee
Researcher
Kenneth Angelikas
Researcher
Owen Man
Researcher
Sophia Mettille
Researcher
James Prather
Researcher
Paul Denny
Researcher
Stephen MacNeil
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Sebastian Gutierrez, Irene Hou, Jihye Lee, Kenneth Angelikas, Owen Man, Sophia Mettille, James Prather, Paul Denny, Stephen MacNeil
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI