
From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education

Project Overview

This project examines the role of generative AI in education, highlighting the limitations of current Large Language Models (LLMs) in personalized instruction, particularly error diagnosis and feedback generation. It introduces MathCCS, a benchmark for systematic error analysis and personalized feedback built from real-world problems, expert annotations, and longitudinal student data. The accompanying framework integrates historical data with real-time insights to improve error classification and feedback generation. The goal is to bridge the gap between current AI capabilities and the specific needs of the educational sector, so that generative AI can more effectively support personalized learning.

Key Applications

Error Analysis and Feedback Generation Framework

Context: Personalized education for elementary-school students, focusing on error analysis and real-time feedback generation.

Implementation: Developed a multi-modal benchmark for error analysis that integrates real-world problems, expert annotations, and longitudinal data. Combined a Time Series Agent for historical analysis with an MLLM Agent for real-time feedback refinement, making suggestions more context-aware.
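A minimal sketch of how such a two-agent pipeline could be wired together. All names, data structures, and the feedback text are hypothetical illustrations; in the actual framework the MLLM Agent would call a multimodal LLM rather than format a string.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

@dataclass
class ErrorRecord:
    timestamp: str
    category: str  # e.g. "concept gap", "careless slip" (illustrative labels)

@dataclass
class StudentHistory:
    records: List[ErrorRecord] = field(default_factory=list)

def time_series_agent(history: StudentHistory) -> dict:
    """Summarize the student's dominant error pattern from longitudinal records."""
    counts = Counter(r.category for r in history.records)
    category, freq = counts.most_common(1)[0]
    return {"dominant_error": category, "frequency": freq / len(history.records)}

def mllm_agent(problem: str, answer: str, profile: dict) -> str:
    """Stand-in for a multimodal-LLM call: refine feedback using the profile."""
    return (f"For '{problem}', the answer '{answer}' is incorrect. "
            f"Given a history of '{profile['dominant_error']}' errors "
            f"({profile['frequency']:.0%} of attempts), revisit that concept first.")

history = StudentHistory([
    ErrorRecord("2024-01-05", "concept gap"),
    ErrorRecord("2024-01-12", "concept gap"),
    ErrorRecord("2024-01-19", "careless slip"),
])
profile = time_series_agent(history)
print(mllm_agent("3/4 + 1/2", "4/6", profile))
```

The design point is the division of labor: the historical agent compresses longitudinal data into a compact profile, which then conditions the real-time agent's feedback.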

Outcomes: Improved error classification accuracy and more context-aware, personalized feedback; also highlighted significant gaps between current AI models and human educators.

Challenges: Current models achieve low classification accuracy (below 30%) and produce weak suggestions (average scores below 4/10). Integrating multiple agents is complex, and achieving coherent predictions remains difficult.

Implementation Barriers

Technical Barrier

Current AI models focus too heavily on binary correctness, limiting their ability to diagnose student errors.

Proposed Solutions: Develop more nuanced models that can analyze error patterns and provide constructive feedback.

Data Barrier

Limited use of historical context in educational tools, leading to insufficient understanding of learning patterns.

Proposed Solutions: Utilize longitudinal data to create user profiles and improve feedback relevance.
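One way longitudinal data could feed a user profile is a recency-weighted error rate per topic, so that recent mistakes count more than old ones when ranking where feedback is needed. This is a hypothetical sketch, not the paper's method; the half-life parameter and record format are assumptions.

```python
from datetime import date

def build_profile(attempts, today, half_life_days=30):
    """Aggregate (topic, date, correct) records into a recency-weighted
    error rate per topic, using exponential decay with the given half-life."""
    totals = {}
    for topic, when, correct in attempts:
        age = (today - when).days
        weight = 0.5 ** (age / half_life_days)  # older attempts count less
        errs, mass = totals.get(topic, (0.0, 0.0))
        totals[topic] = (errs + weight * (0.0 if correct else 1.0), mass + weight)
    return {topic: errs / mass for topic, (errs, mass) in totals.items()}

attempts = [
    ("fractions", date(2024, 1, 1), False),  # old mistake, largely decayed
    ("fractions", date(2024, 3, 1), True),   # recent success
    ("geometry",  date(2024, 3, 1), False),  # fresh mistake
]
profile = build_profile(attempts, today=date(2024, 3, 2))
weakest = max(profile, key=profile.get)
print(weakest)  # geometry: the fresh error dominates
```

Such a profile lets feedback generation prioritize topics where errors are both frequent and recent, rather than treating all past mistakes equally.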

Quality Barrier

Inadequate support for open-ended reasoning in assessments, leading to misclassification of student errors.

Proposed Solutions: Implement frameworks that can evaluate complex reasoning without oversimplifying responses.

Project Team

Yi-Fan Zhang

Researcher

Hang Li

Researcher

Dingjie Song

Researcher

Lichao Sun

Researcher

Tianlong Xu

Researcher

Qingsong Wen

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Yi-Fan Zhang, Hang Li, Dingjie Song, Lichao Sun, Tianlong Xu, Qingsong Wen

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
