
Open, Small, Rigmarole -- Evaluating Llama 3.2 3B's Feedback for Programming Exercises

Project Overview

This document examines the use of Generative AI (GenAI) in education through the lens of small, open large language models (LLMs) such as Llama 3.2, which are employed to provide formative feedback to novice programming learners. It highlights the advantages of smaller models, such as stronger data protection and broader accessibility, and investigates the quality of the feedback they produce along with its limitations. Key challenges concern the accuracy, consistency, and clarity of the generated feedback, each of which can hinder the learning process for beginner programmers. Overall, the findings suggest that while GenAI has the potential to support educational practice, the model's output requires careful scrutiny to ensure it actually meets learners' needs.

Key Applications

Llama 3.2 for formative programming feedback

Context: Introductory programming course for university students, specifically targeting novice learners in Java programming.

Implementation: The Llama 3.2 model was used to generate feedback on student submissions for programming tasks. The feedback was qualitatively analyzed based on a set of predefined categories.
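The generation step described above can be sketched as follows. This is a minimal illustration, not the study's actual setup: the prompt wording, the Ollama endpoint (`http://localhost:11434/api/generate`), and the model tag `llama3.2:3b` are assumptions chosen to show how a locally hosted small model could be queried for feedback on a Java submission.

```python
import json
import urllib.request

# Hypothetical prompt template -- the study's exact prompt is not reproduced here.
PROMPT_TEMPLATE = (
    "You are a tutor for novice Java programmers.\n"
    "Task description:\n{task}\n\n"
    "Student submission:\n{code}\n\n"
    "Give short, formative feedback: name any errors and suggest improvements."
)

def build_feedback_prompt(task: str, code: str) -> str:
    """Assemble the prompt sent to the model for one student submission."""
    return PROMPT_TEMPLATE.format(task=task, code=code)

def request_feedback(task: str, code: str,
                     url: str = "http://localhost:11434/api/generate",
                     model: str = "llama3.2:3b") -> str:
    """Query a locally served Llama 3.2 3B instance (here via Ollama's REST API)."""
    payload = json.dumps({
        "model": model,
        "prompt": build_feedback_prompt(task, code),
        "stream": False,  # return the full response at once instead of streaming
    }).encode("utf-8")
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Running the model locally in this way is what enables the data-protection advantage mentioned above: student code never leaves the institution's infrastructure.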

Outcomes: The study found that while Llama 3.2 can identify common programming errors and offer code suggestions, its feedback often lacks accuracy, with the majority being only partially correct.

Challenges: Generated feedback often included inconsistencies, redundancies, and inaccuracies, making it difficult for novice programmers to understand and apply the suggestions.

Implementation Barriers

Feedback Quality

The feedback generated by Llama 3.2 is often inconsistent and contains inaccuracies, leading to confusion among students. Students have expressed concerns regarding the usability and clarity of the feedback provided by the AI, including the presence of errors and misleading advice.

Proposed Solutions: Future research is needed to refine models and improve the reliability of feedback mechanisms. Enhancing the training of models and including user testing phases to better align feedback with student needs could help mitigate these issues.

Project Team

Imen Azaiz

Researcher

Natalie Kiesler

Researcher

Sven Strickroth

Researcher

Anni Zhang

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Imen Azaiz, Natalie Kiesler, Sven Strickroth, Anni Zhang

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
