Whodunit: Classifying Code as Human Authored or GPT-4 Generated -- A case study on CodeChef problems
Project Overview
The document examines the integration of generative AI tools such as GPT-4 and GitHub Copilot into programming education, focusing on the academic-integrity concern that students may submit AI-generated code as their own work. To address this, the researchers developed a classifier that combines code stylometry with machine learning to distinguish human-written from AI-generated code. The classifier achieved an F1-score and AUC-ROC of 0.91, making it a practical resource for educators. Beyond flagging potential academic dishonesty, the tool helps educators understand the implications of AI in educational settings. The findings suggest that while generative AI poses challenges, it also offers opportunities to improve learning and assessment practices in programming education, helping educators navigate the evolving landscape of AI-assisted learning.
Key Applications
Classifier for distinguishing human-authored code from GPT-4 generated code
Context: Educational context in programming courses, targeting novice programmers and educators
Implementation: Developed a classifier using supervised machine learning (XGBoost) trained on a dataset of human-authored and GPT-4-generated Python solutions to CodeChef problems
Outcomes: Classifier achieved an F1-score and AUC-ROC score of 0.91, effectively distinguishing between human and AI-generated code
Challenges: High levels of plagiarism and contract cheating in programming courses, difficulty in detecting AI-generated code due to low similarity to human-authored code
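The approach above rests on code stylometry: extracting measurable style features from source code and feeding them to a supervised classifier such as XGBoost. The sketch below illustrates the feature-extraction step with a few simple, standard-library-only features (line counts, comment ratio, identifier length); these particular features and names are illustrative assumptions, not the paper's actual feature set.

```python
import re
import statistics

def stylometry_features(source: str) -> dict:
    """Extract a few illustrative stylometric features from Python source.

    These features (blank-line ratio, comment ratio, line and identifier
    lengths) are a hypothetical subset, not the paper's exact feature set.
    """
    lines = source.splitlines()
    nonblank = [ln for ln in lines if ln.strip()]
    comment_lines = [ln for ln in nonblank if ln.lstrip().startswith("#")]
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    return {
        "n_lines": len(lines),
        "blank_ratio": 1 - len(nonblank) / max(len(lines), 1),
        "comment_ratio": len(comment_lines) / max(len(nonblank), 1),
        "mean_line_len": statistics.mean(len(ln) for ln in nonblank) if nonblank else 0.0,
        "mean_ident_len": statistics.mean(map(len, identifiers)) if identifiers else 0.0,
    }

sample = "def add(a, b):\n    # sum two numbers\n    return a + b\n"
feats = stylometry_features(sample)
```

A feature vector like this, computed for each human-authored and GPT-4-generated solution, would then be used to train a gradient-boosted classifier (e.g. `xgboost.XGBClassifier`) on labeled examples.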
Implementation Barriers
Academic dishonesty
Concerns about students submitting AI-generated code as their own work, leading to increased plagiarism and contract cheating. Existing plagiarism detection tools may fail to detect AI-generated solutions due to their low similarity with student-authored code.
Proposed Solutions: Development of classifiers to detect AI-generated code, increasing awareness among educators about the implications of AI tools. Use of code stylometry and machine learning classifiers to effectively differentiate between human-authored and AI-generated code.
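The limitation of similarity-based plagiarism detectors described above can be sketched with a toy example: a copied student submission is nearly identical to the original, while an AI-generated solution to the same problem is structurally different and scores well below any copy-detection threshold. The similarity measure here (character-level `difflib` ratio) is a simplified stand-in for real tools like MOSS, used only to illustrate the point.

```python
import difflib

def similarity(code_a: str, code_b: str) -> float:
    """Character-level similarity in [0, 1] via difflib (a toy stand-in
    for real plagiarism-detection similarity measures)."""
    return difflib.SequenceMatcher(None, code_a, code_b).ratio()

# Two hypothetical student submissions and one AI-style solution
# to the same task (sum of 1..n).
student_a = "n=int(input())\nprint(sum(range(1,n+1)))\n"
student_b = "n=int(input())\nprint(sum(range(1,n+1)))\n"
ai_style = (
    "def total(limit: int) -> int:\n"
    "    return limit * (limit + 1) // 2\n"
    "\n"
    "print(total(int(input())))\n"
)

copy_score = similarity(student_a, student_b)  # near-identical copies score high
ai_score = similarity(student_a, ai_style)     # different structure scores low
```

Because the AI-generated solution's similarity to any student submission is low, a threshold tuned to catch copying never fires, which is why stylometry-based classifiers are proposed as a complement.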
Project Team
Oseremen Joy Idialu
Researcher
Noble Saji Mathews
Researcher
Rungroj Maipradit
Researcher
Joanne M. Atlee
Researcher
Mei Nagappan
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Oseremen Joy Idialu, Noble Saji Mathews, Rungroj Maipradit, Joanne M. Atlee, Mei Nagappan
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI