
Simulation as Reality? The Effectiveness of LLM-Generated Data in Open-ended Question Assessment

Project Overview

This project examines the transformative role of generative AI, especially Large Language Models (LLMs), in education, focusing on automated assessment of open-ended questions. It highlights the use of synthetic data produced by LLMs to train AI assessment tools, which show encouraging results in controlled settings. However, the findings reveal substantial performance drops when these tools are deployed in real-world educational contexts, underscoring the need for a hybrid approach that integrates both synthetic and real-world data. The project ultimately emphasizes the potential of generative AI to improve educational assessment while acknowledging the challenges that must be addressed to realize its full benefits.

Key Applications

DeBERTa-based automated assessment agent using synthetic data for open-ended questions

Context: Higher education, specifically for university students answering open-ended scientific exam questions

Implementation: The assessment agent was trained on synthetic data generated from various sources and evaluated in both controlled and real-world settings.

Outcomes: The agent showed improved performance over state-of-the-art models (GPT-4o) in controlled environments and maintained effectiveness in real-world applications.

Challenges: Performance gaps emerged between synthetic data training and real-world applications, highlighting challenges in evaluating nuanced and subjective responses.

Implementation Barriers

Technical Barrier

The performance disparity between AI assessments in controlled environments versus real-world settings due to lack of real-world noise and biases in synthetic data.

Proposed Solutions: Incorporate a mix of synthetic and real-world data in the training process and adjust for real-world noise patterns.
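The proposed solution above could be sketched as follows. This is a minimal illustration, not the authors' implementation: the `mix_training_data` and `add_noise` helpers, the mixing ratio, and the character-level noise injection are all assumptions made for the example.

```python
import random

def mix_training_data(synthetic, real, real_fraction=0.3, seed=42):
    """Blend synthetic and real annotated examples into one training set.

    `synthetic` and `real` are lists of (answer_text, score) pairs.
    `real_fraction` controls roughly what share of the final set comes
    from real data, approximating a hybrid synthetic/real training mix.
    """
    rng = random.Random(seed)
    # Number of real examples needed so that real data makes up
    # approximately `real_fraction` of the combined set.
    n_real = min(len(real), int(len(synthetic) * real_fraction / (1 - real_fraction)))
    mixed = list(synthetic) + rng.sample(real, n_real)
    rng.shuffle(mixed)
    return mixed

def add_noise(text, rng, p=0.05):
    """Inject simple character-level typos into synthetic answers, a crude
    stand-in for 'adjusting for real-world noise patterns' in student writing.
    """
    chars = list(text)
    for i in range(len(chars)):
        if chars[i].isalpha() and rng.random() < p:
            chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
    return "".join(chars)
```

In practice the blended set would then be passed to whatever fine-tuning loop trains the assessment model; the sketch only covers the data-mixing step.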

Resource Barrier

The need for substantial human resources for annotating real-world data, which is time-consuming and costly.

Proposed Solutions: Utilize crowdsourcing for coding open-ended responses and leverage AI for initial assessments.
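One way to realize the "AI for initial assessments, humans for the rest" workflow is to route only low-confidence machine scores to human (e.g. crowdsourced) coders. A hypothetical sketch, where the `triage` helper, the prediction format, and the 0.8 confidence threshold are all assumptions:

```python
def triage(predictions, threshold=0.8):
    """Split model predictions into auto-accepted scores and items
    routed to human annotation.

    `predictions` is a list of dicts: {"id": ..., "score": ..., "confidence": ...}.
    High-confidence scores are accepted as-is; the rest go to human coders,
    reducing the volume of costly manual annotation.
    """
    auto, needs_human = [], []
    for p in predictions:
        (auto if p["confidence"] >= threshold else needs_human).append(p)
    return auto, needs_human
```

The threshold trades off annotation cost against assessment accuracy: raising it sends more items to humans, lowering it trusts the model more.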

Project Team

Long Zhang

Researcher

Meng Zhang

Researcher

Wei Lin Wang

Researcher

Yu Luo

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Long Zhang, Meng Zhang, Wei Lin Wang, Yu Luo

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
