
Assessing Confidence in AI-Assisted Grading of Physics Exams through Psychometrics: An Exploratory Study

Project Overview

This study examines the use of generative AI in education, focusing on AI-assisted grading of exams at ETH Zurich. It applies psychometric methods such as Item Response Theory (IRT) to refine the grading rubrics used for AI-assisted evaluation. The findings indicate that AI can substantially reduce grading workload while correlating highly with human graders' results, which suggests that accuracy can be maintained. Human oversight remains important, particularly for complex problems where nuanced student responses can make AI grading unreliable. Generative AI is a promising aid for grading, but it needs careful implementation and monitoring to keep assessment outcomes fair and precise.

Key Applications

AI-assisted grading using IRT and GPT-4 models

Context: High-stakes physics exams for engineering students at ETH Zurich

Implementation: The AI graded student solutions based on prompts derived from a detailed rubric, with iterative refinement based on statistical evaluations of grading accuracy.

Outcomes: High correlation between AI and human grading results (R² ≈ 0.91), significant reduction in grading workload, and ability to maintain grading accuracy.
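As a rough illustration of how an agreement figure like the one reported above could be computed, the following minimal sketch calculates R² as the squared Pearson correlation between human and AI scores. This is an assumption about the statistic: the summary does not say how the authors computed it, although for a simple linear fit the two definitions coincide. The function name `r_squared` and the score lists are hypothetical and are not data from the study.

```python
def r_squared(human, ai):
    """Squared Pearson correlation between two equal-length score lists."""
    if len(human) != len(ai) or len(human) < 2:
        raise ValueError("need two score lists of equal length (at least 2 items)")
    n = len(human)
    mean_h = sum(human) / n
    mean_a = sum(ai) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((h - mean_h) * (a - mean_a) for h, a in zip(human, ai))
    var_h = sum((h - mean_h) ** 2 for h in human)
    var_a = sum((a - mean_a) ** 2 for a in ai)
    if var_h == 0 or var_a == 0:
        raise ValueError("scores must not all be identical")
    return (cov * cov) / (var_h * var_a)


# Made-up scores for illustration only (not data from the study).
human_scores = [10, 14, 7, 18, 12]
ai_scores = [11, 13, 8, 17, 12]
print(round(r_squared(human_scores, ai_scores), 3))
```

A value near 1 means the AI scores track the human scores closely in a linear sense. It does not rule out a systematic offset between the two, so a check of mean differences would still be needed.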

Challenges: AI reliability in complex problem solving and dependence on human oversight for cases of uncertainty.

Implementation Barriers

Technical Barrier

Inadequate determination of confidence levels in AI grading systems, leading to risks in grading accuracy and reliability.

Proposed Solutions: The implementation of human oversight and iterative refinement of grading rubrics using psychometric methods like IRT.
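For readers unfamiliar with IRT, a common formulation is the two-parameter logistic (2PL) model, shown here only as background. The summary does not state which IRT model the study used, so the choice of 2PL and the symbols below are assumptions for illustration.

```latex
% 2PL model: probability that respondent j earns credit on rubric item i
%   \theta_j : ability of respondent j
%   a_i      : discrimination of item i
%   b_i      : difficulty of item i
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + \exp\bigl(-a_i(\theta_j - b_i)\bigr)}
```

Under this kind of model, a rubric item with low discrimination separates stronger from weaker students poorly, which makes it a natural candidate for revision during rubric refinement.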

Operational Barrier

Self-selection bias in student participation could affect the generalizability of the findings.

Proposed Solutions: Ensuring a more representative sample of students in future studies.

Project Team

Gerd Kortemeyer

Researcher

Julian Nöhl

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Gerd Kortemeyer, Julian Nöhl

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
