
Whodunit: Classifying Code as Human Authored or GPT-4 Generated -- A case study on CodeChef problems

Project Overview

This summary examines the integration of generative AI tools, such as GPT-4 and GitHub Copilot, into programming education and the academic-integrity concerns they raise, since students might submit AI-generated code as their own work. To address this, the researchers developed a classifier that combines code stylometry with machine learning to differentiate human-written from AI-generated code. The classifier achieved an F1-score and AUC-ROC of 0.91, making it a practical resource for educators. Beyond flagging potential academic dishonesty, the work deepens understanding of the implications of AI in educational settings. The findings suggest that while generative AI poses challenges, it also offers opportunities to improve learning and assessment practices, helping educators navigate the evolving landscape of AI-assisted learning.

Key Applications

Classifier for distinguishing human-authored code from GPT-4 generated code

Context: Educational context in programming courses, targeting novice programmers and educators

Implementation: Developed a classifier using supervised machine learning (XGBoost) trained on a dataset of human and AI-generated Python solutions from CodeChef

Outcomes: Classifier achieved an F1-score and AUC-ROC score of 0.91, effectively distinguishing between human and AI-generated code

Challenges: High levels of plagiarism and contract cheating in programming courses, difficulty in detecting AI-generated code due to low similarity to human-authored code

Implementation Barriers

Academic dishonesty

Concerns about students submitting AI-generated code as their own work, leading to increased plagiarism and contract cheating. Existing plagiarism detection tools may fail to detect AI-generated solutions due to their low similarity with student-authored code.

Proposed Solutions: Development of classifiers to detect AI-generated code, increasing awareness among educators about the implications of AI tools. Use of code stylometry and machine learning classifiers to effectively differentiate between human-authored and AI-generated code.
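The 0.91 F1-score reported for the classifier is the harmonic mean of precision and recall on the binary human-vs-AI labeling task. As a minimal sketch (the labels and predictions below are made up, not the paper's data), it can be computed from raw prediction counts:

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the binary F1-score, treating `positive` (here 1,
    i.e. "AI-generated") as the class of interest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: 1 = AI-generated, 0 = human-authored.
score = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

In practice one would use a library implementation (e.g. scikit-learn's `f1_score` and `roc_auc_score`) for both reported metrics; the AUC-ROC additionally requires the classifier's probability outputs rather than hard labels.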

Project Team

Oseremen Joy Idialu

Researcher

Noble Saji Mathews

Researcher

Rungroj Maipradit

Researcher

Joanne M. Atlee

Researcher

Mei Nagappan

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Oseremen Joy Idialu, Noble Saji Mathews, Rungroj Maipradit, Joanne M. Atlee, Mei Nagappan

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
