Beyond human subjectivity and error: a novel AI grading system
Project Overview
This document explores the integration of generative AI in education, focusing on the development of an AI-driven system for Automatic Short Answer Grading (ASAG). The system fine-tunes a transformer model on a comprehensive dataset from diverse university courses to automate the grading of open-ended exam questions. Experimental findings indicate that ASAG surpasses human graders in consistency and accuracy, reducing subjectivity and error in the grading process. This suggests a meaningful improvement in grading fairness and efficiency and a promising alternative to traditional assessment workflows. Overall, the document underscores the potential of generative AI to streamline grading and support a more objective evaluation process.
Key Applications
Automatic Short Answer Grading (ASAG)
Context: Higher education, specifically for grading open-ended exam questions across various disciplines.
Implementation: The ASAG system was built by fine-tuning an open-source transformer model on a large dataset of exam data spanning diverse academic subjects, training it to evaluate student answers against reference answers.
Outcomes: The model demonstrated high accuracy, with a median absolute error 44% smaller than that of human graders, indicating improved consistency and fairness in grading.
Challenges: The model's performance may degrade on higher-complexity questions, and its explainability and robustness to unusual inputs require further exploration.
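The core grading step described above — scoring a student answer against a reference answer — can be sketched with a simple similarity-based stand-in. The paper uses a fine-tuned transformer; the bag-of-words cosine similarity below is only an illustrative placeholder, and the function names and point scale are assumptions, not the paper's method.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def grade_answer(student: str, reference: str, max_points: float = 10.0) -> float:
    """Map similarity onto a hypothetical 0..max_points scale."""
    return round(cosine_similarity(student, reference) * max_points, 1)

reference = "Photosynthesis converts light energy into chemical energy stored in glucose."
student = "Plants use photosynthesis to turn light energy into chemical energy in glucose."
print(grade_answer(student, reference))
```

A production system would replace the similarity function with the fine-tuned model's prediction; the surrounding scoring interface could stay the same.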
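The headline outcome — a median absolute error 44% smaller than that of human graders — is a mechanical comparison that can be reproduced on any set of grades. The numbers below are hypothetical illustrative values, not the paper's data; they only show how such a reduction figure is computed.

```python
from statistics import median

# Hypothetical grades for the same set of answers (not the paper's data):
# ground-truth ("official") grades, model predictions, and human grades.
official = [8.0, 5.0, 9.0, 3.0, 7.0]
model    = [7.5, 5.5, 9.0, 3.5, 6.5]
human    = [7.0, 6.5, 8.0, 4.5, 7.5]

def median_abs_error(pred, truth):
    """Median of per-answer absolute grading errors."""
    return median(abs(p - t) for p, t in zip(pred, truth))

mae_model = median_abs_error(model, official)
mae_human = median_abs_error(human, official)
reduction = 1 - mae_model / mae_human  # fraction by which the model's error is smaller
print(f"model MAE={mae_model}, human MAE={mae_human}, reduction={reduction:.0%}")
```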
Implementation Barriers
Technical Barrier
The ASAG system may struggle with questions of higher complexity and requires further fine-tuning to improve performance in such cases.
Proposed Solutions: Enriching the training dataset with a larger variety of high-complexity questions, and extending training to cover corner cases and unusual inputs.
Regulatory Barrier
Grading is a high-risk domain, with new regulations emerging regarding the use of AI in educational settings.
Proposed Solutions: Following a phased approach to implement AI in grading, starting with assisted grading that allows human oversight and review of AI-generated grades.
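An assisted-grading phase like the one proposed can be sketched as a simple review gate: AI-generated grades are accepted automatically only when the model appears reliable, and everything else is queued for human review. The policy, field names, and thresholds below are assumptions for illustration, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class GradedAnswer:
    answer_id: str
    ai_grade: float    # points awarded by the model
    confidence: float  # hypothetical model confidence in [0, 1]

def needs_human_review(item: GradedAnswer,
                       pass_mark: float = 5.0,
                       margin: float = 1.0,
                       min_confidence: float = 0.8) -> bool:
    """Route to a human when the model is unsure or the grade falls
    close to the pass/fail boundary (illustrative policy)."""
    near_boundary = abs(item.ai_grade - pass_mark) < margin
    return item.confidence < min_confidence or near_boundary

batch = [
    GradedAnswer("a1", ai_grade=9.0, confidence=0.95),  # clear case: auto-accept
    GradedAnswer("a2", ai_grade=5.5, confidence=0.90),  # near boundary: review
    GradedAnswer("a3", ai_grade=8.0, confidence=0.60),  # low confidence: review
]
review_queue = [a.answer_id for a in batch if needs_human_review(a)]
print(review_queue)
```

Over time, the review thresholds could be tightened or relaxed based on observed agreement between AI and human grades, matching the phased rollout described above.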
Project Team
Alexandra Gobrecht
Researcher
Felix Tuma
Researcher
Moritz Möller
Researcher
Thomas Zöller
Researcher
Mark Zakhvatkin
Researcher
Alexandra Wuttig
Researcher
Holger Sommerfeldt
Researcher
Sven Schütt
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Alexandra Gobrecht, Felix Tuma, Moritz Möller, Thomas Zöller, Mark Zakhvatkin, Alexandra Wuttig, Holger Sommerfeldt, Sven Schütt
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI