
Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text

Project Overview

The paper addresses the detection of AI-generated text, motivated by the growing use of generative AI, specifically large language models (LLMs), in education. Differentiating human-written from AI-generated content matters in educational settings and in journalism, where authenticity and trustworthiness are at stake, and it is central to maintaining academic integrity and preventing misuse. The authors present a RoBERTa-BiLSTM classifier that identifies the source of a text and reaches an accuracy of 80.83%. The result highlights how challenging detection remains as AI-generated content becomes more prevalent, and it points to the need for robust detection systems so that the advantages of generative AI in education can be harnessed without compromising ethical standards or the quality of academic work.

Key Applications

RoBERTa-BiLSTM classifier for detecting AI-generated text

Context: Educational and academic contexts, targeting educators, students, and content creators.

Implementation: Developed a hybrid classifier combining RoBERTa and BiLSTM to classify text as AI-generated or human-generated.
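The hybrid design described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name RobertaBiLSTMClassifier, the helper load_roberta, the hidden sizes, the mean-pooling step, and the use of the roberta-base checkpoint are all hypothetical choices that the summary does not specify.

```python
import torch
import torch.nn as nn


class RobertaBiLSTMClassifier(nn.Module):
    """Encoder -> BiLSTM -> linear head (assumed layout, not the paper's code)."""

    def __init__(self, encoder, encoder_dim=768, lstm_hidden=256, num_classes=2):
        super().__init__()
        self.encoder = encoder  # any module returning .last_hidden_state
        self.bilstm = nn.LSTM(
            encoder_dim, lstm_hidden, batch_first=True, bidirectional=True
        )
        self.head = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings: (batch, seq_len, encoder_dim)
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # BiLSTM over the token sequence: (batch, seq_len, 2 * lstm_hidden)
        lstm_out, _ = self.bilstm(hidden)
        # Mean-pool over non-padding positions (pooling choice is an assumption)
        mask = attention_mask.unsqueeze(-1).to(lstm_out.dtype)
        pooled = (lstm_out * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        # Logits for the two classes, human-written vs. AI-generated
        return self.head(pooled)


def load_roberta(name="roberta-base"):
    """Hypothetical helper: load a pretrained RoBERTa encoder via transformers."""
    from transformers import AutoModel  # imported lazily; needs `transformers`
    return AutoModel.from_pretrained(name)
```

With the transformers package installed, RobertaBiLSTMClassifier(load_roberta()) would give a trainable two-class model whose inputs come from the matching tokenizer. Padding tokens still pass through the BiLSTM in this sketch; packing the sequences would avoid that at the cost of extra code.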

Outcomes: Achieved an accuracy of 80.83% and ranked 46th out of 125 in the SemEval-2024 Task 8 competition, providing effective detection of machine-generated text.

Challenges: Struggled to distinguish sentences generated by AI models trained on similar linguistic patterns.

Implementation Barriers

Technical Barrier

Difficulty in distinguishing between AI-generated text and human-generated text due to similar linguistic patterns in certain models.

Proposed Solutions: Improving model training by incorporating diverse datasets and enhancing feature extraction techniques.

Project Team

Jainit Sushil Bafna

Researcher

Hardik Mittal

Researcher

Suyash Sethia

Researcher

Manish Shrivastava

Researcher

Radhika Mamidi

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Jainit Sushil Bafna, Hardik Mittal, Suyash Sethia, Manish Shrivastava, Radhika Mamidi

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
