Assessing instructor-AI cooperation for grading essay-type questions in an introductory sociology course
Project Overview
This project examines the integration of generative AI, particularly GPT models, into higher education, focusing on grading essay-type questions in an introductory sociology course. It emphasizes AI's potential to improve grading efficiency and fairness, positioning it as a complement to human evaluators rather than a replacement. The research shows how AI can flag inconsistencies in assessments, thereby supporting human grading. It also addresses critical challenges, including biases in both human and AI evaluations, underscoring the need for careful implementation. Overall, the findings suggest that generative AI can streamline educational assessment while preserving the essential human element in education.
Key Applications
AI-assisted grading of essay-type questions
Context: Grading handwritten exams in an introductory sociology course for university students.
Implementation: Human instructors graded exams based on template answers, while GPT models were used for transcribing and grading the students' handwritten responses.
Outcomes: Human and AI transcriptions were highly similar, and GPT scores correlated strongly with human scores, especially when template answers were provided. The AI can help reduce grading time and flag inconsistencies.
Challenges: Discrepancies between human and AI scoring, particularly when template answers are not provided; the AI tends to score more leniently, underscoring the need for careful use and human verification.
Implementation Barriers
Technical
Accurately processing handwritten responses, aligning AI models with grading standards, and selecting cost-effective models for specific tasks.
Proposed Solutions: Using template answers to anchor AI scoring and employing efficient models such as GPT-4o-mini for specific tasks.
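One way template answers might guide AI scoring is a prompt that embeds the template alongside the student response. The function below is a hypothetical sketch; its wording, names, and parameters are assumptions for illustration, not the prompt used in the study.

```python
def build_grading_prompt(question: str, template_answer: str,
                         student_answer: str, max_points: int) -> str:
    """Assemble a grading prompt anchored to a template answer.

    Hypothetical illustration of template-guided scoring; the wording
    is an assumption, not the study's actual prompt.
    """
    return (
        f"You are grading an essay question worth {max_points} points.\n\n"
        f"Question: {question}\n\n"
        f"Template answer (full credit): {template_answer}\n\n"
        f"Student answer: {student_answer}\n\n"
        f"Return only an integer score from 0 to {max_points}, based on "
        f"how closely the student answer covers the key ideas of the "
        f"template."
    )

# Example with made-up course content.
prompt = build_grading_prompt(
    question="What is a social norm?",
    template_answer="A shared expectation about appropriate behavior "
                    "within a group.",
    student_answer="Rules people follow because others expect them to.",
    max_points=5,
)
```

A cost-effective model such as GPT-4o-mini could then be called with this prompt; keeping the template in the prompt is what constrains the leniency noted above.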
Bias
Potential biases in human grading that AI might not eliminate, and AI's own biases in scoring.
Proposed Solutions: Combining AI assessments with human evaluations to create a more balanced and fair grading approach.
Project Team
Francisco Olivos
Researcher
Tobias Kamelski
Researcher
Sebastián Ascui-Gac
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Francisco Olivos, Tobias Kamelski, Sebastián Ascui-Gac
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI