
Assessing instructor-AI cooperation for grading essay-type questions in an introductory sociology course

Project Overview

This project examines the integration of generative AI, particularly GPT models, into higher education, focusing on their use for grading essay-type questions in an introductory sociology course. It emphasizes the potential of AI to improve grading efficiency and fairness, positioning it as a tool that complements human evaluators rather than replacing them. The research highlights how AI can identify inconsistencies in assessments, thereby supporting human grading processes. However, it also addresses critical challenges, including biases in both human and AI evaluations, underscoring the need for careful implementation. Overall, the findings suggest that generative AI can play a significant role in educational assessment, streamlining grading while maintaining the essential human element in education.

Key Applications

AI-assisted grading of essay-type questions

Context: Grading handwritten exams in an introductory sociology course for university students.

Implementation: Human instructors graded exams against template answers, while GPT models transcribed and graded the students' handwritten responses.

Outcomes: Human and AI transcriptions were highly similar, and GPT scores correlated strongly with human scores, especially when template answers were provided. The AI can help reduce grading time and flag inconsistencies.

Challenges: Discrepancies between human and AI scoring, particularly when template answers are not provided. The AI tends to score more leniently, highlighting the need for careful use and human verification.
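The workflow above compares human and AI scores and flags large disagreements for review. The following is a minimal sketch of that comparison step; the score data, the 10-point scale, and the 2-point discrepancy threshold are illustrative assumptions, not values from the study.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def flag_discrepancies(human, ai, threshold=2.0):
    """Return indices of responses where AI and human scores diverge
    by more than the threshold, for human re-checking."""
    return [i for i, (h, a) in enumerate(zip(human, ai)) if abs(a - h) > threshold]

# Hypothetical scores on a 10-point essay question (not data from the paper).
human_scores = [8, 6, 9, 5, 7, 4]
ai_scores = [8, 7, 9, 8, 7, 5]  # note the AI scoring more leniently overall

r = pearson(human_scores, ai_scores)
flagged = flag_discrepancies(human_scores, ai_scores)
print(f"correlation: {r:.2f}, flagged for review: {flagged}")
```

A high correlation alongside a short list of flagged responses is the pattern the paper describes: the AI broadly agrees with the human grader, and the flags direct human attention to the few exceptions.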

Implementation Barriers

Technical

Accurately processing handwritten responses, aligning AI models with grading standards, and selecting cost-effective models for specific tasks.

Proposed Solutions: Using template answers to guide AI scoring and employing models like GPT-4o-mini that are efficient for specific tasks.
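One way to guide AI scoring with a template answer is to embed the template and the transcribed response in the grading prompt. The sketch below shows that prompt construction only; the prompt wording, rubric scale, and function names are assumptions for illustration, not the authors' actual prompt.

```python
# Hypothetical grading prompt embedding a template answer. The filled-in
# prompt would be sent to an inexpensive model such as gpt-4o-mini.
GRADING_PROMPT = """You are grading an introductory sociology exam.

Template answer (full credit):
{template}

Student's transcribed response:
{response}

Score the response from 0 to {max_points} against the template.
Reply with the score only."""

def build_grading_prompt(template: str, response: str, max_points: int = 10) -> str:
    """Fill the grading prompt with the template answer and one response."""
    return GRADING_PROMPT.format(
        template=template, response=response, max_points=max_points
    )

prompt = build_grading_prompt(
    template="Socialization is the lifelong process of internalizing norms.",
    response="Socialization means learning society's rules as we grow up.",
)
print(prompt)
```

Anchoring the model to a template answer narrows its judgment to the rubric, which is consistent with the finding that agreement with human graders was strongest when template answers were provided.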

Bias

Potential biases in human grading that AI might not eliminate, and AI's own biases in scoring.

Proposed Solutions: Combining AI assessments with human evaluations to create a more balanced and fair grading approach.

Project Team

Francisco Olivos

Researcher

Tobias Kamelski

Researcher

Sebastián Ascui-Gac

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Francisco Olivos, Tobias Kamelski, Sebastián Ascui-Gac

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
