ACE-RLHF: Automated Code Evaluation and Socratic Feedback Generation Tool using Large Language Models and Reinforcement Learning with Human Feedback

Project Overview

This document explores the use of generative AI in education through the development and evaluation of an Automated Code Evaluation and Socratic Feedback Generation Tool (ACE-RLHF), which combines Large Language Models (LLMs) with Reinforcement Learning with Human Feedback (RLHF). The tool is designed to enhance programming education by delivering personalized feedback and fostering active learning through Socratic questioning. In evaluation it outperformed existing methods and improved student engagement and comprehension of programming concepts. The document also addresses open challenges, namely hallucinated output and the trade-off between model complexity and accuracy. Overall, the findings suggest that generative AI tools like ACE-RLHF can improve educational outcomes, but that further refinement is needed to mitigate these challenges.

Key Applications

ACE-RLHF: Automated Code Evaluation and Socratic Feedback Generation Tool

Context: Programming education, for students learning to code.

Implementation: Developed using LLMs fine-tuned with RLHF for generating feedback and guiding students through Socratic questioning.

Outcomes: Improved accuracy in code feedback, enhanced student engagement, and active learning.

Challenges: LLMs may produce factually incorrect answers (hallucinations), and added model complexity does not always translate into better accuracy.

Implementation Barriers

Technical Barrier

LLMs often generate hallucinated or incorrect answers, which can mislead students. The trade-off between model complexity and accuracy may limit practical applications.

Proposed Solutions: Implement more complex reward models, such as ensemble reward models, to improve accuracy, while preferring simpler architectures unless the added complexity yields a significant performance improvement.
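As an illustration of the ensemble idea mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes each reward model is a function that scores a (code, feedback) pair, and it combines the scores as the mean minus a penalty for disagreement between models; that aggregation rule is one common choice and is an assumption here. The three scorers are hypothetical keyword heuristics standing in for learned reward models.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence

# A reward model maps (student_code, feedback) to a score.
# In a real RLHF setup these would be learned models; here they are toy stand-ins.
RewardModel = Callable[[str, str], float]


def reward_is_question(code: str, feedback: str) -> float:
    """Hypothetical scorer: Socratic feedback should end with a question."""
    return 1.0 if feedback.strip().endswith("?") else 0.0


def reward_no_direct_fix(code: str, feedback: str) -> float:
    """Hypothetical scorer: penalize feedback that dictates the fix outright."""
    lowered = feedback.lower()
    return 0.0 if ("replace" in lowered or "change line" in lowered) else 1.0


def reward_brevity(code: str, feedback: str) -> float:
    """Hypothetical scorer: prefer short feedback (40 words or fewer)."""
    return 1.0 if len(feedback.split()) <= 40 else 0.5


def ensemble_reward(
    models: Sequence[RewardModel],
    code: str,
    feedback: str,
    penalty: float = 1.0,
) -> float:
    """Mean score across models, minus a penalty for disagreement between them.

    The disagreement term (population standard deviation) is an assumed
    design choice: it lowers the reward when the individual models are unsure.
    """
    scores = [model(code, feedback) for model in models]
    return mean(scores) - penalty * pstdev(scores)


MODELS = [reward_is_question, reward_no_direct_fix, reward_brevity]
STUDENT_CODE = "def average(xs):\n    return sum(xs) / len(xs)\n"

socratic = "What happens to your division when the list is empty?"
direct = "Replace line 2 with a check for an empty list."

socratic_score = ensemble_reward(MODELS, STUDENT_CODE, socratic)
direct_score = ensemble_reward(MODELS, STUDENT_CODE, direct)
```

With these toy scorers, the question-style feedback receives a higher ensemble reward than the feedback that dictates the fix. Setting `penalty` to 0 reduces the ensemble to a plain average, which is the simpler baseline to compare against before accepting the extra complexity.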

Project Team

Tasnia Rahman

Researcher

Sathish A. P. Kumar

Researcher

Sumit Jha

Researcher

Arvind Ramanathan

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Tasnia Rahman, Sathish A. P. Kumar, Sumit Jha, Arvind Ramanathan

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
