Is this Snippet Written by ChatGPT? An Empirical Study with a CodeBERT-Based Classifier

Project Overview

This paper examines the use of generative AI, specifically ChatGPT, in software engineering education through an empirical study of GPTSniffer, a tool for identifying AI-generated code. It notes the benefits of AI in educational settings, such as supporting students in coding tasks, and raises ethical concerns including plagiarism and the impact on skill development. In the study, GPTSniffer outperforms the existing classifiers GPTZero and OpenAI Text Classifier at distinguishing human-written from AI-generated code. Open challenges remain around the ethical deployment of AI in education and the need for specialized classifiers that detect AI-generated content. Overall, the findings suggest that generative AI can enhance learning, provided its implications are weighed carefully.

Key Applications

GPTSniffer

Context: Software engineering education, with students as the intended users.

Implementation: The tool was implemented using CodeBERT for classifying code snippets as human-written or AI-generated.

Outcomes: GPTSniffer achieved high accuracy in distinguishing human-written from AI-generated code, particularly when trained on paired snippets.

Challenges: The model struggles with code from domains entirely different from those it was trained on, and ethical concerns about AI use in education persist.
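
The setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names HUMAN, AI, Example, build_paired_dataset, load_classifier, and predict are hypothetical, and it assumes that a "pair" means one human-written and one AI-generated solution to the same task. It also assumes the Hugging Face transformers library and the public microsoft/codebert-base checkpoint; the training configuration used in the study is not stated here, so fine-tuning is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Label convention assumed for this sketch only.
HUMAN, AI = 0, 1


@dataclass
class Example:
    code: str
    label: int


def build_paired_dataset(pairs: List[Tuple[str, str]]) -> List[Example]:
    """Turn (human_snippet, ai_snippet) pairs into labeled examples.

    Assumption: both snippets in a pair solve the same task.
    """
    examples: List[Example] = []
    for human_code, ai_code in pairs:
        examples.append(Example(human_code, HUMAN))
        examples.append(Example(ai_code, AI))
    return examples


def load_classifier(model_name: str = "microsoft/codebert-base"):
    """Load CodeBERT with a two-class classification head.

    The head is randomly initialized, so predictions are meaningless
    until the model has been fine-tuned on labeled snippets.
    """
    # Imported lazily so the data helper works without the heavy dependency.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=2
    )
    return tokenizer, model


def predict(tokenizer, model, code: str) -> int:
    """Return HUMAN or AI for a single snippet."""
    import torch

    inputs = tokenizer(code, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())
```

Keeping both members of a pair in the same data split (train or test) would be a sensible design choice under this assumption, since otherwise the model could see one solution to a task during training and be tested on the other.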

Implementation Barriers

Ethical

Concerns about students using AI-generated code to complete assignments without understanding the material.

Proposed Solutions: Some universities have regulated or banned the use of AI tools like ChatGPT to maintain academic integrity.

Technical

Existing detection tools, such as GPTZero, are not effective at distinguishing human-written from AI-generated code.

Proposed Solutions: Developing specific classifiers, like GPTSniffer, that are fine-tuned for code detection.

Project Team

Phuong T. Nguyen

Researcher

Juri Di Rocco

Researcher

Claudio Di Sipio

Researcher

Riccardo Rubei

Researcher

Davide Di Ruscio

Researcher

Massimiliano Di Penta

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Phuong T. Nguyen, Juri Di Rocco, Claudio Di Sipio, Riccardo Rubei, Davide Di Ruscio, Massimiliano Di Penta

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
