Open, Small, Rigmarole -- Evaluating Llama 3.2 3B's Feedback for Programming Exercises
Project Overview
This document examines the use of generative AI (GenAI) in education, focusing on small, open large language models (LLMs) such as Llama 3.2 3B that are employed to provide formative feedback to novice programming learners. It highlights the advantages of smaller models, including stronger data protection and easier access for users, and investigates both the quality of the feedback these models produce and their limitations. Key challenges concern the accuracy, consistency, and clarity of the generated feedback, all of which can hinder learning for beginner programmers. Overall, the findings suggest that while GenAI has the potential to support educational practice, its output must be reviewed carefully to ensure it effectively meets learners' needs.
Key Applications
Llama 3.2 for formative programming feedback
Context: Introductory programming course for university students, specifically targeting novice learners in Java programming.
Implementation: The Llama 3.2 model was used to generate feedback on student submissions for programming tasks. The feedback was qualitatively analyzed based on a set of predefined categories.
Outcomes: The study found that Llama 3.2 identified common programming errors and offered code suggestions, but its feedback often lacked accuracy, with the majority of feedback being only partially correct.
Challenges: Generated feedback often included inconsistencies, redundancies, and inaccuracies, making it difficult for novice programmers to understand and apply the suggestions.
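The implementation described above can be sketched as a small pipeline: assemble a feedback prompt from the task description and the student's submission, then send it to a locally hosted Llama 3.2 model. The prompt wording, the Ollama endpoint, and the model tag below are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of requesting formative feedback from a locally hosted
# Llama 3.2 model via Ollama's REST API. The prompt text, endpoint URL,
# and model tag are assumptions for illustration only.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint
MODEL = "llama3.2:3b"  # hypothetical local model tag


def build_prompt(task: str, submission: str) -> str:
    """Assemble a feedback prompt from the task text and the student's code."""
    return (
        "You are a tutor for novice Java programmers.\n"
        f"Task:\n{task}\n\n"
        f"Student submission:\n{submission}\n\n"
        "Give short, formative feedback: name any errors and suggest fixes."
    )


def request_feedback(task: str, submission: str) -> str:
    """Send the assembled prompt to the local model and return its reply."""
    payload = json.dumps({
        "model": MODEL,
        "prompt": build_prompt(task, submission),
        "stream": False,  # request a single complete response
    }).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

In the study, responses gathered this way were then qualitatively coded against the predefined feedback categories; that manual analysis step is not shown here.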
Implementation Barriers
Feedback Quality
The feedback generated by Llama 3.2 is often inconsistent and contains inaccuracies, leading to confusion among students. Students have expressed concerns regarding the usability and clarity of the feedback provided by the AI, including the presence of errors and misleading advice.
Proposed Solutions: Future research is needed to refine models and improve the reliability of feedback mechanisms. Enhancing the training of models and including user testing phases to better align feedback with student needs could help mitigate these issues.
Project Team
Imen Azaiz
Researcher
Natalie Kiesler
Researcher
Sven Strickroth
Researcher
Anni Zhang
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Imen Azaiz, Natalie Kiesler, Sven Strickroth, Anni Zhang
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI