From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education
Project Overview
This project examines generative AI in education, focusing on the limitations of current Large Language Models (LLMs) for personalized instruction, particularly error diagnosis and feedback generation. It introduces MathCCS, a benchmark for systematic error analysis and personalized feedback that incorporates real-world math problems, expert annotations, and longitudinal student data. The accompanying framework integrates historical data with real-time insights to improve error classification and feedback generation. The ultimate goal is to bridge the gap between current AI capabilities and the specific needs of the educational sector, so that generative AI can more effectively support personalized learning for students.
Key Applications
Error Analysis and Feedback Generation Framework
Context: Personalized education for elementary-grade students, focusing on error analysis and real-time feedback generation in educational settings.
Implementation: Developed a multi-modal benchmark for error analysis that integrates real-world problems, expert annotations, and longitudinal data. Combined a Time Series Agent for historical analysis with an MLLM Agent for real-time feedback refinement, enhancing the contextuality of suggestions.
Outcomes: Improved error classification accuracy and more context-aware, personalized feedback generation; revealed significant gaps between current AI models and human educators.
Challenges: Current models achieve low classification accuracy (<30%) and provide poor suggestions (average scores <4/10). The integration of multi-agent systems is complex, and achieving coherent predictions remains a challenge.
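The two-agent design described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the class names, the four-way error taxonomy, and the template-based feedback are all assumptions, and a real system would invoke a multimodal LLM where the stub below fills in a template.

```python
from dataclasses import dataclass

# Hypothetical error taxonomy; the real MathCCS categories are richer.
ERROR_CATEGORIES = ["conceptual", "procedural", "careless", "unknown"]

@dataclass
class StudentRecord:
    """One graded attempt in a student's longitudinal history."""
    problem_id: str
    error_category: str

class TimeSeriesAgent:
    """Summarizes a student's historical error pattern as a distribution."""
    def profile(self, history: list[StudentRecord]) -> dict[str, float]:
        counts = {c: 0 for c in ERROR_CATEGORIES}
        for rec in history:
            counts[rec.error_category] = counts.get(rec.error_category, 0) + 1
        total = max(len(history), 1)
        return {c: n / total for c, n in counts.items()}

class MLLMAgent:
    """Refines real-time feedback using the historical profile.
    Stub: conditions a fixed template on the dominant historical error
    type; a real agent would call a multimodal LLM here."""
    def feedback(self, answer: str, profile: dict[str, float]) -> str:
        dominant = max(profile, key=lambda c: profile[c])
        return (f"Your answer '{answer}' may reflect a {dominant} error; "
                f"review the related concept before retrying.")

# Usage: the historical profile conditions the real-time feedback.
history = [
    StudentRecord("p1", "conceptual"),
    StudentRecord("p2", "conceptual"),
    StudentRecord("p3", "careless"),
]
profile = TimeSeriesAgent().profile(history)
print(MLLMAgent().feedback("x = 7", profile))
```

The point of the split is that the Time Series Agent compresses longitudinal data into a compact profile, so the feedback agent never has to re-read the full history at inference time.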
Implementation Barriers
Technical Barrier
Current AI models focus too heavily on binary correctness, limiting their ability to diagnose student errors.
Proposed Solutions: Develop more nuanced models that can analyze error patterns and provide constructive feedback.
Data Barrier
Limited use of historical context in educational tools, leading to insufficient understanding of learning patterns.
Proposed Solutions: Utilize longitudinal data to create user profiles and improve feedback relevance.
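One simple way to put longitudinal data to work, sketched under the assumption that each attempt is logged as a correct/error outcome ordered by time (the function and its window size are illustrative, not from the paper):

```python
from collections import deque

def rolling_error_rate(outcomes: list[int], window: int = 5) -> list[float]:
    """Rolling error rate over a student's attempt history.
    outcomes: 1 = error, 0 = correct, ordered by time.
    A persistent or rising rate flags a skill that recent
    feedback has not yet fixed."""
    buf = deque(maxlen=window)
    rates = []
    for outcome in outcomes:
        buf.append(outcome)
        rates.append(sum(buf) / len(buf))
    return rates

# Usage: the trailing rate marks skills for targeted review.
attempts = [0, 0, 1, 1, 1, 1, 0, 1]
rates = rolling_error_rate(attempts)
print(rates[-1])  # error rate over the most recent attempts
```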
Quality Barrier
Inadequate support for open-ended reasoning in assessments, leading to misclassification of student errors.
Proposed Solutions: Implement frameworks that can evaluate complex reasoning without oversimplifying responses.
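A rubric-based scorer is one way to evaluate open-ended reasoning without collapsing it to right/wrong. The sketch below is an assumption about how such a framework could aggregate partial credit; the keyword checks stand in for what would realistically be an LLM judge or expert annotation against the MathCCS rubric.

```python
from typing import Callable

def score_response(response: str,
                   rubric: list[tuple[str, float, Callable[[str], bool]]]) -> float:
    """Weighted rubric score in [0, 1]: partial credit for the
    reasoning steps present, instead of binary correctness."""
    total = sum(weight for _, weight, _ in rubric)
    earned = sum(weight for _, weight, check in rubric if check(response))
    return earned / total

# Illustrative rubric: (description, weight, check). The checks are
# toy keyword heuristics, not a real reasoning evaluator.
rubric = [
    ("states the setup", 1.0, lambda r: "let" in r.lower()),
    ("shows an intermediate step", 2.0, lambda r: "=" in r),
    ("gives a final answer", 1.0, lambda r: "answer" in r.lower()),
]

resp = "Let x be the unknown. 2x = 14, so x = 7. Answer: 7."
print(score_response(resp, rubric))  # → 1.0
```

Weighting the intermediate step most heavily reflects the section's point: the diagnosis lives in the reasoning, not in the final answer.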
Project Team
Yi-Fan Zhang
Researcher
Hang Li
Researcher
Dingjie Song
Researcher
Lichao Sun
Researcher
Tianlong Xu
Researcher
Qingsong Wen
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Yi-Fan Zhang, Hang Li, Dingjie Song, Lichao Sun, Tianlong Xu, Qingsong Wen
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI