Single-Agent vs. Multi-Agent LLM Strategies for Automated Student Reflection Assessment
Project Overview
This document examines the integration of Large Language Models (LLMs) in educational contexts, particularly for automating the assessment of student reflections. It addresses the limitations of conventional reflection assessment methods and introduces an approach that uses LLMs to convert qualitative reflections into quantitative scores. The findings indicate that LLM-generated scores are effective predictors of academic performance, notably for identifying at-risk students, which strengthens educational analytics. The experiments show that LLM-assisted assessment not only reduces educators' workload but also enables timely, targeted support for students. Overall, the research underscores the potential of generative AI to refine assessment practices and support student success in educational settings.
Key Applications
Automated assessment of student reflections using LLMs
Context: Higher education, specifically in an Information Science course at Kyushu University, targeting students providing reflective writing
Implementation: The study employed two assessment strategies (single-agent and multi-agent) and two prompting techniques (zero-shot and few-shot) to evaluate student reflections and predict academic performance.
Outcomes: The LLM-assisted assessment showed a high match rate with human evaluations, improved identification of at-risk students, and accurate grade predictions.
Challenges: Ensuring the consistency and reliability of LLM assessments compared to human evaluations was a challenge.
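The single-agent and multi-agent strategies above can be sketched in miniature: a single agent issues one score per reflection, while multiple agents score independently and their outputs are combined. The consensus rule below (majority vote with a mean fallback) and the integer rubric scale are illustrative assumptions, not the study's exact protocol.

```python
from collections import Counter

def single_agent_score(scores: list[int]) -> int:
    # Single-agent strategy: one LLM call yields one rubric score.
    return scores[0]

def multi_agent_score(scores: list[int]) -> int:
    # Multi-agent strategy: several agents score the same reflection
    # independently, then a consensus is formed. Majority vote with a
    # rounded-mean fallback is an assumed aggregation rule.
    most_common, count = Counter(scores).most_common(1)[0]
    if count > len(scores) // 2:
        return most_common
    return round(sum(scores) / len(scores))
```

For example, `multi_agent_score([3, 3, 4])` resolves to 3 by majority, while `multi_agent_score([2, 3, 4])` falls back to the mean. In a full pipeline, each element of `scores` would come from a separate LLM call (e.g., to gpt-4o-mini).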
Implementation Barriers
Technical
The open-text nature of student reflections leads to variability in content and expression styles, making manual assessment complex.
Proposed Solutions: Using LLMs provides a scalable solution to analyze reflections; implementing robust prompting strategies enhances assessment quality.
Validation
Verifying the consistency of LLM-generated assessments against human evaluations is essential for establishing effectiveness.
Proposed Solutions: Incorporating human labels to validate LLM assessments ensures accuracy and reliability.
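Validation against human labels can be quantified with the match rate reported in this work, optionally alongside a chance-corrected measure such as Cohen's kappa (a standard agreement statistic, not necessarily the one used in the paper). A minimal sketch in pure Python:

```python
def match_rate(human: list[int], llm: list[int]) -> float:
    # Fraction of reflections where the LLM score equals the human label.
    assert len(human) == len(llm) and human
    return sum(h == m for h, m in zip(human, llm)) / len(human)

def cohens_kappa(human: list[int], llm: list[int]) -> float:
    # Chance-corrected agreement: (p_o - p_e) / (1 - p_e), where p_o is
    # observed agreement and p_e is agreement expected by chance.
    n = len(human)
    labels = set(human) | set(llm)
    p_o = sum(h == m for h, m in zip(human, llm)) / n
    p_e = sum((human.count(c) / n) * (llm.count(c) / n) for c in labels)
    return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)
```

A high match rate with a low kappa would signal that agreement is driven by an imbalanced label distribution, which is why reporting both is common practice.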
Project Team
Gen Li
Researcher
Li Chen
Researcher
Cheng Tang
Researcher
Valdemar Švábenský
Researcher
Daisuke Deguchi
Researcher
Takayoshi Yamashita
Researcher
Atsushi Shimada
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Gen Li, Li Chen, Cheng Tang, Valdemar Švábenský, Daisuke Deguchi, Takayoshi Yamashita, Atsushi Shimada
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI