Single-Agent vs. Multi-Agent LLM Strategies for Automated Student Reflection Assessment

Project Overview

This project examines the integration of Large Language Models (LLMs) in educational contexts, particularly for automating the assessment of student reflections. It addresses the limitations of conventional reflection assessment methods and introduces an approach that uses LLMs to convert qualitative reflections into quantitative scores, streamlining the evaluation process. The findings indicate that LLM-derived scores are effective predictors of academic performance, notably for identifying at-risk students, which strengthens educational analytics. The experiments show that LLM assistance both reduces educator workload and enables timely, targeted support for students, fostering a more efficient educational environment. Overall, the research underscores the potential of generative AI to refine assessment practices and support student success.

Key Applications

Automated assessment of student reflections using LLMs

Context: Higher education, specifically an Information Science course at Kyushu University in which students submit reflective writing

Implementation: The study compared two assessment strategies (single-agent and multi-agent) and two prompting techniques (zero-shot and few-shot) for scoring student reflections and predicting academic performance; a minimal code sketch appears after this list.

Outcomes: The LLM-assisted assessment achieved a high match rate with human evaluations, improved identification of at-risk students, and produced accurate grade predictions.

Challenges: Ensuring that LLM assessments are as consistent and reliable as human evaluations.
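The paper's code is not reproduced here, but a minimal sketch of the two strategies, assuming the OpenAI Python SDK, a hypothetical 1-to-5 rubric, and invented prompt wording, could look like this:

```python
from statistics import median

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical rubric; the actual scoring criteria are defined in the paper.
RUBRIC = (
    "Score the student reflection from 1 (superficial) to 5 (deeply "
    "reflective), considering specificity, self-awareness, and concrete "
    "plans for improvement. Respond with the integer score only."
)

def score_single_agent(reflection: str,
                       model: str = "gpt-4o-mini-2024-07-18") -> int:
    """Single-agent, zero-shot scoring: one model call per reflection."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic output aids consistency checks
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": reflection},
        ],
    )
    return int(response.choices[0].message.content.strip())

def score_multi_agent(reflection: str, n_agents: int = 3) -> int:
    """Multi-agent variant, simplified here to independent scorers whose
    votes are aggregated; the paper's agents may coordinate differently."""
    scores = [score_single_agent(reflection) for _ in range(n_agents)]
    return round(median(scores))
```

With temperature 0 the independent scorers above will usually agree; giving each agent a distinct rubric emphasis, or a nonzero temperature, is what makes the aggregation meaningful.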

Implementation Barriers

Technical

The open-text nature of student reflections leads to variability in content and expression styles, making manual assessment complex.

Proposed Solutions: LLMs offer a scalable way to analyze reflections, and robust prompting strategies, such as few-shot examples, further improve assessment quality.
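As a sketch of the few-shot variant, one can prepend human-scored example reflections before the target so the model anchors its scale; the examples below are invented for illustration, and RUBRIC refers to the sketch above:

```python
# Invented exemplars; real few-shot examples would come from the
# human-labeled reflections used in the study.
FEW_SHOT_EXAMPLES = [
    ("Today we learned about sorting. It was fine.", 1),
    ("I struggled with the partition step of quicksort, so I traced it "
     "by hand and plan to re-implement it before next week's quiz.", 4),
]

def build_few_shot_messages(reflection: str) -> list[dict]:
    """Interleave labeled examples before the reflection to be scored."""
    messages = [{"role": "system", "content": RUBRIC}]
    for text, score in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": str(score)})
    messages.append({"role": "user", "content": reflection})
    return messages
```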

Validation

Verifying that LLM-generated assessments are consistent with human evaluations is essential before the approach can be trusted in practice.

Proposed Solutions: Validating LLM assessments against human labels provides a check on their accuracy and reliability.
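A minimal way to quantify that validation is to compute agreement statistics between LLM scores and human labels on the same reflections; the choice of exact match rate plus Cohen's kappa below is an assumption for illustration, not necessarily the paper's metric:

```python
from sklearn.metrics import cohen_kappa_score

def agreement(human_scores: list[int], llm_scores: list[int]) -> dict:
    """Compare LLM-assigned scores against human labels."""
    matches = sum(h == m for h, m in zip(human_scores, llm_scores))
    return {
        "match_rate": matches / len(human_scores),  # exact agreement
        "cohen_kappa": cohen_kappa_score(human_scores, llm_scores),
    }
```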

Project Team

Gen Li, Researcher

Li Chen, Researcher

Cheng Tang, Researcher

Valdemar Švábenský, Researcher

Daisuke Deguchi, Researcher

Takayoshi Yamashita, Researcher

Atsushi Shimada, Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Gen Li, Li Chen, Cheng Tang, Valdemar Švábenský, Daisuke Deguchi, Takayoshi Yamashita, Atsushi Shimada

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
