Debugging Without Error Messages: How LLM Prompting Strategy Affects Programming Error Explanation Effectiveness
Project Overview
This document examines the role of generative AI, particularly large language models (LLMs) such as GPT-3.5, in education, with a focus on explaining programming errors. Traditional error messages often frustrate novice programmers because they lack clarity; given appropriate context, LLMs can produce more comprehensible explanations. The research evaluates several prompting strategies, including baseline, one-shot, and fine-tuning, to determine their effectiveness in improving the quality of error explanations. The findings indicate that while LLMs can generate valuable feedback, the choice of prompting strategy does not significantly influence the accuracy of the responses; it does, however, affect their conciseness. Overall, the study underscores the potential of generative AI in education, particularly for making technical concepts more accessible to learners.
Key Applications
LLM-generated error message explanations
Context: Educational context focused on programming error explanations for novice programmers using TigerJython, a pedagogical programming environment.
Implementation: Applied several prompting strategies (baseline, one-shot, fine-tuning) with GPT-3.5 to evaluate the effectiveness of the resulting error explanations; a minimal sketch of the baseline and one-shot conditions follows this list.
Outcomes: Found that 2-3 useful explanations were generated for every misleading response, and fine-tuning reduced extraneous information.
Challenges: Maintaining a diverse dataset of programming errors for effective fine-tuning.
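The sketch below illustrates how the baseline and one-shot conditions might be set up with the OpenAI Python SDK. The prompt wording, the worked example, and the model name are illustrative assumptions, not the authors' exact materials.

```python
# Sketch: baseline vs. one-shot prompting for error explanations.
# Assumptions: OpenAI Python SDK (openai>=1.0); the system prompt,
# worked example, and model name are illustrative, not the paper's.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = "Explain this TigerJython error message to a novice programmer."

def baseline_explanation(code: str, error: str) -> str:
    """Baseline: send only the student's code and the raw error message."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": f"Code:\n{code}\n\nError:\n{error}"},
        ],
    )
    return response.choices[0].message.content

def one_shot_explanation(code: str, error: str) -> str:
    """One-shot: prepend a single worked example before the real query."""
    example_query = "Code:\nprint('hi'\n\nError:\nSyntaxError: '(' was never closed"
    example_answer = (
        "You opened a parenthesis in the print call but never closed it. "
        "Add a ')' at the end of line 1."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": example_query},
            {"role": "assistant", "content": example_answer},
            {"role": "user", "content": f"Code:\n{code}\n\nError:\n{error}"},
        ],
    )
    return response.choices[0].message.content
```

The only difference between the two conditions is the injected example exchange; fine-tuning, by contrast, bakes such examples into the model's weights rather than into the prompt.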
Implementation Barriers
Data diversity
Programming errors follow a long-tail distribution, which makes it difficult to collect diverse examples for fine-tuning the model.
Proposed Solutions: Ensure training datasets cover a broad range of programming errors so the model does not overfit to common mistakes (see the sketch below).
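One way to act on this is to cap how many examples of each error type enter the training file, so rare errors are not drowned out by common ones. This is a minimal sketch, assuming a hypothetical records list of (error_type, code, error, explanation) tuples; the JSONL layout follows OpenAI's chat fine-tuning format.

```python
# Sketch: building a fine-tuning set that counters the long-tail
# distribution of programming errors. Assumptions: `records` is a
# hypothetical list of (error_type, code, error, explanation) tuples;
# the per-line JSON layout matches OpenAI's chat fine-tuning format.
import json
from collections import defaultdict

MAX_PER_ERROR_TYPE = 20  # cap common errors so rare ones still get weight

def write_training_file(records, path="train.jsonl"):
    seen = defaultdict(int)
    with open(path, "w") as f:
        for error_type, code, error, explanation in records:
            if seen[error_type] >= MAX_PER_ERROR_TYPE:
                continue  # skip surplus examples of over-represented errors
            seen[error_type] += 1
            f.write(json.dumps({
                "messages": [
                    {"role": "system",
                     "content": "Explain this TigerJython error to a novice."},
                    {"role": "user",
                     "content": f"Code:\n{code}\n\nError:\n{error}"},
                    {"role": "assistant", "content": explanation},
                ]
            }) + "\n")
```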
Cognitive load
Extraneous information in LLM responses can increase cognitive load for students, especially novices.
Proposed Solutions: Train or prompt the model to produce concise, relevant explanations that minimize distraction (see the sketch below).
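A minimal sketch of one way to steer the model toward brevity, assuming the OpenAI Python SDK; the prompt wording and the sentence and token limits are illustrative choices, not the paper's configuration.

```python
# Sketch: constraining response length to reduce cognitive load.
# Assumptions: OpenAI Python SDK; the prompt text and the limits
# below are illustrative, not the paper's exact configuration.
from openai import OpenAI

client = OpenAI()

CONCISE_SYSTEM = (
    "Explain this TigerJython error to a novice programmer in at most "
    "three short sentences. Name the cause and the fix; do not restate "
    "the code or add unrelated advice."
)

def concise_explanation(code: str, error: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": CONCISE_SYSTEM},
            {"role": "user", "content": f"Code:\n{code}\n\nError:\n{error}"},
        ],
        max_tokens=120,  # hard ceiling as a backstop for the instruction
    )
    return response.choices[0].message.content
```

Pairing an explicit brevity instruction with a max_tokens ceiling gives a backstop in case the model ignores the prompt.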
Project Team
Audrey Salmon
Researcher
Katie Hammer
Researcher
Eddie Antonio Santos
Researcher
Brett A. Becker
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Audrey Salmon, Katie Hammer, Eddie Antonio Santos, Brett A. Becker
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI