
Beyond human subjectivity and error: a novel AI grading system

Project Overview

The paper examines the use of generative AI in education and presents an AI-driven system for Automatic Short Answer Grading (ASAG). The system uses a fine-tuned transformer model, trained on a large dataset of exam data from diverse university courses, to automate the grading of open-ended exam questions. In the reported experiments, the model was more accurate and consistent than human graders, with a median absolute error 44% smaller than theirs, which suggests it can reduce the subjectivity and error inherent in human grading. The authors present this as a promising way to make assessment in educational settings fairer and more efficient.

Key Applications

Automatic Short Answer Grading (ASAG)

Context: Higher education, specifically for grading open-ended exam questions across various disciplines.

Implementation: The ASAG system was trained on a large dataset of exam data, fine-tuning an open-source transformer model on diverse academic subjects to evaluate student answers against reference answers.

Outcomes: The model demonstrated high accuracy, with a median absolute error 44% smaller than that of human graders, indicating improved consistency and fairness in grading.
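
To make the reported metric concrete, the following is a minimal sketch of how a median absolute error and a relative reduction between two graders could be computed. It is not the authors' evaluation code. The grade values are hypothetical, the function names are illustrative, and the way the reference grade is established is an assumption, since this summary does not describe it.

```python
from statistics import median


def median_absolute_error(predicted, reference):
    """Median of |predicted - reference| over paired grades."""
    if len(predicted) != len(reference):
        raise ValueError("grade lists must have the same length")
    return median(abs(p - r) for p, r in zip(predicted, reference))


def relative_reduction(model_error, human_error):
    """Fraction by which model_error is smaller than human_error."""
    return (human_error - model_error) / human_error


# Hypothetical grades in points out of 100; NOT data from the paper.
reference = [50, 80, 20, 100, 60]   # assumed reference grades
human = [60, 70, 40, 90, 60]        # grades from a human grader
model = [55, 80, 25, 95, 70]        # grades from the model

human_mae = median_absolute_error(human, reference)
model_mae = median_absolute_error(model, reference)
reduction = relative_reduction(model_mae, human_mae)

print(f"human: {human_mae}, model: {model_mae}, reduction: {reduction:.0%}")
```

On this toy data the reduction is 50%. The paper's figure of 44% comes from its own dataset and evaluation protocol, which this sketch does not reproduce.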

Challenges: The model's performance may decrease on higher-complexity questions, and its explainability and robustness against unusual inputs require further exploration.

Implementation Barriers

Technical Barrier

The ASAG system may struggle with questions of higher complexity and requires further fine-tuning to improve performance in such cases.

Proposed Solutions: Enriching the training dataset with a larger variety of high-complexity questions and extending training to cover corner cases and unusual inputs.

Regulatory Barrier

Grading is a high-risk domain, with new regulations emerging regarding the use of AI in educational settings.

Proposed Solutions: Following a phased approach to implement AI in grading, starting with assisted grading that allows human oversight and review of AI-generated grades.
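
As an illustration of what assisted grading with human oversight might look like in software, the sketch below treats every AI-generated grade as a proposal that only becomes final once a human reviewer confirms or overrides it. The class and method names (GradeProposal, confirm, override) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GradeProposal:
    """An AI-suggested grade that needs human review before it counts."""
    answer_id: str
    ai_grade: float                      # grade suggested by the model
    human_grade: Optional[float] = None  # set only if the reviewer overrides
    reviewed: bool = False

    def confirm(self) -> None:
        """Reviewer accepts the AI-suggested grade as is."""
        self.reviewed = True

    def override(self, grade: float) -> None:
        """Reviewer replaces the AI-suggested grade with their own."""
        self.human_grade = grade
        self.reviewed = True

    @property
    def final_grade(self) -> Optional[float]:
        """None until a human has reviewed the proposal."""
        if not self.reviewed:
            return None
        if self.human_grade is not None:
            return self.human_grade
        return self.ai_grade


accepted = GradeProposal("answer-1", ai_grade=0.7)
pending_before_review = accepted.final_grade
accepted.confirm()

corrected = GradeProposal("answer-2", ai_grade=0.9)
corrected.override(0.5)
```

The design choice here is that no grade is released without a human action, which matches the first phase described above. Later phases could relax this, but how they would do so is not specified in this summary.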

Project Team

Alexandra Gobrecht

Researcher

Felix Tuma

Researcher

Moritz Möller

Researcher

Thomas Zöller

Researcher

Mark Zakhvatkin

Researcher

Alexandra Wuttig

Researcher

Holger Sommerfeldt

Researcher

Sven Schütt

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Alexandra Gobrecht, Felix Tuma, Moritz Möller, Thomas Zöller, Mark Zakhvatkin, Alexandra Wuttig, Holger Sommerfeldt, Sven Schütt

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
