AI-Tutoring in Software Engineering Education
Project Overview
This document examines the application of GPT-3.5-Turbo as an AI tutor within the Artemis Automated Programming Assessment System in software engineering education. The study focuses on student interactions, experiences, and outcomes related to AI-driven feedback. Findings reveal diverse student interaction patterns, categorized as "Iterative Ivy" and "Hybrid Harry," and mixed student perceptions regarding the AI tutor's effectiveness. While offering benefits like timely feedback and scalability, the use of generative AI in this context also presents challenges, including the provision of generic feedback and student concerns about their learning progress. The research highlights the potential of AI tutors in education while underscoring the need to address limitations and optimize implementation strategies for improved student outcomes.
Key Applications
AI-Powered Programming Assistant and Debugging Tool (GPT-based)
Context: Programming education and general programming exercises, spanning introductory programming courses (e.g., C language) to broader software engineering tasks. Includes student assessment, bug fixing, code generation, program repair, and code summarization. Targeted at undergraduate students and general higher education.
Implementation: Utilizes GPT-3.5-Turbo, ChatGPT, and GPT-4 models to assist students with programming tasks: integrating the AI into platforms such as Artemis to provide feedback on code, generating hints, and offering debugging assistance. The AI analyzes code, identifies bugs, suggests fixes, generates code, repairs programs, and summarizes code. The system takes the student's code, the exercise description, and a sample solution (if applicable) as input and returns feedback in a pop-up window or chat interface; a minimal sketch of this request flow follows the Challenges item below. This application also covers using the AI to solve programming bugs and analyzing its bug-fixing performance.
Outcomes: Provides timely, personalized, and scalable feedback; helps with logical and semantic issues in code; and offers debugging assistance, bug prediction, and bug explanation to help solve programming problems. The AI can provide surprisingly helpful hints, solve a significant share of bugs (31 out of 40 in one benchmark), and often infer the original intent behind what a correct version of a program should look like.
Challenges: Generic feedback, potential over-reliance on the AI, occasional hallucinations (incorrect or irrelevant feedback), API downtime, and limitations imposed by token limits. It is not a complete solution and should be treated as an additional debugging tool; performance in some areas (e.g., software testing) can be inconsistent.
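The paper does not publish the exact prompt, but the request flow described above (student code, exercise description, and sample solution in; textual feedback out) can be sketched roughly as follows. The prompt wording and the request_feedback helper are illustrative assumptions, not the actual Artemis integration.

    # Illustrative sketch only -- not the actual Artemis/AI-Tutor code.
    from openai import OpenAI  # assumes the openai Python SDK, v1 or later

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    SYSTEM_PROMPT = (
        "You are a programming tutor. Point out bugs and misconceptions in "
        "the student's code, but never reveal the sample solution."
    )

    def request_feedback(student_code: str, exercise: str, sample_solution: str) -> str:
        """Hypothetical helper: one feedback request per submission."""
        user_prompt = (
            f"Exercise description:\n{exercise}\n\n"
            f"Sample solution (for your reference only):\n{sample_solution}\n\n"
            f"Student submission:\n{student_code}\n\n"
            "Explain what is wrong and hint at how to fix it."
        )
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # the model used in the study
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": user_prompt},
            ],
        )
        return response.choices[0].message.content

Passing the sample solution to the model while instructing it not to reveal it is what would let the tutor judge correctness without handing over the answer.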
AI-Powered Hint Generation (ChatGPT)
Context: Elementary and intermediate Algebra, and general programming exercises.
Implementation: Uses ChatGPT to generate hints for solving problems and compares the efficacy of hints authored by human tutors with hints generated by ChatGPT; a minimal sketch of such a hint request follows the Challenges item below.
Outcomes: Can produce high-quality hints (e.g., 79% passed a manual quality test).
Challenges: N/A
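For illustration, a ChatGPT hint request of the kind compared in that study might look like the sketch below; the prompt wording and the generate_hint helper are assumptions, not the study's actual setup.

    # Illustrative sketch: generating a single tutoring hint with ChatGPT.
    from openai import OpenAI

    client = OpenAI()

    def generate_hint(problem: str, student_attempt: str) -> str:
        """Hypothetical helper: one short next-step hint, not the answer."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system",
                 "content": "You are a tutor. Give one short hint for the "
                            "student's next step. Do not state the final answer."},
                {"role": "user",
                 "content": f"Problem: {problem}\nStudent's attempt: {student_attempt}"},
            ],
        )
        return response.choices[0].message.content

    print(generate_hint("Solve 2x + 6 = 14 for x.", "x = 10"))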
AI-Powered Educational Chatbot
Context: Helping students with their coursework and providing assistance without revealing the solution.
Implementation: Combines ChatGPT with educational content libraries (e.g., Quizlet) and uses GPT-4 models to build the chatbots; a minimal sketch follows the Challenges item below.
Outcomes: N/A
Challenges: N/A
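As a rough sketch of such a chatbot, the snippet below pairs a retrieved piece of course material with a system prompt that forbids handing over complete solutions. The lookup_course_material stand-in and all prompt wording are assumptions; the source describes no concrete API.

    # Illustrative sketch: a coursework chatbot that withholds full solutions.
    from openai import OpenAI

    client = OpenAI()

    def lookup_course_material(question: str) -> str:
        """Hypothetical stand-in for querying an educational content library."""
        return "Flashcard: a stack is a last-in, first-out (LIFO) data structure."

    def answer(question: str) -> str:
        context = lookup_course_material(question)
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "You help students with coursework. Use the supplied "
                            "course material, guide with questions and hints, and "
                            "never hand over a complete solution."},
                {"role": "user",
                 "content": f"Course material:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return response.choices[0].message.content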
Implementation Barriers
Generic Feedback & Code-Specific Guidance
The AI-Tutor's responses were often perceived as too generic and lacking code-specific guidance; it also occasionally revealed solutions or produced incorrect or irrelevant feedback.
Proposed Solutions: Refine responses toward more detailed, code-specific guidance; supplement written feedback with concrete code examples; improve prompt engineering; integrate the AI feedback with unit test results so students are told when their solution meets all criteria (a sketch of this integration follows below); and ensure the model adheres to its tutoring guidelines (e.g., never revealing the full solution).
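A minimal sketch of the unit-test integration proposed above, assuming test results arrive as (name, passed, message) tuples; this is an illustration, not the Artemis implementation.

    # Illustrative sketch: ground feedback in unit test results so the tutor
    # can be code-specific and can confirm when all criteria are met.
    from typing import Optional

    def build_feedback_prompt(student_code: str,
                              test_results: list) -> Optional[str]:
        """Hypothetical helper; test_results holds (name, passed, message) tuples."""
        failed = [(name, msg) for name, passed, msg in test_results if not passed]
        if not failed:
            return None  # all tests pass: tell the student directly, skip the model
        failures = "\n".join(f"- {name}: {msg}" for name, msg in failed)
        return (
            f"Student submission:\n{student_code}\n\n"
            f"Failing unit tests:\n{failures}\n\n"
            "For each failure, point to the relevant part of the code and give "
            "a concrete hint, without writing the corrected code in full."
        )

Short-circuiting when every test passes both saves an API call and gives the student an unambiguous signal that all criteria are met.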
Lack of Interactivity
Students desired more interactive capabilities, such as asking follow-up questions.
Proposed Solutions: Move to a chatbot-based system so students can ask follow-up questions (a minimal sketch follows below).
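A chatbot-based variant mainly needs to resend the accumulated conversation on every turn, as in this minimal illustration (not the proposed system's design):

    # Illustrative sketch: keep the conversation history so students can ask
    # follow-up questions about earlier feedback.
    from openai import OpenAI

    client = OpenAI()
    history = [{"role": "system", "content": "You are a programming tutor."}]

    def ask(question: str) -> str:
        history.append({"role": "user", "content": question})
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=history,  # the full history gives the model prior context
        )
        reply = response.choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        return reply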
Interface Concerns
The interface did not allow users to view previous feedback.
Proposed Solutions: Modify the interface so users can view previous feedback (a sketch of one way to store this history follows below).
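Showing previous feedback is mostly a persistence question; a minimal in-memory sketch (all names hypothetical) could look like this:

    # Illustrative sketch: persist each feedback item so the UI can show a
    # history instead of a single ephemeral pop-up.
    from dataclasses import dataclass, field
    from datetime import datetime

    @dataclass
    class FeedbackEntry:
        submission_id: str
        feedback: str
        created_at: datetime = field(default_factory=datetime.now)

    feedback_log: list = []

    def record_feedback(submission_id: str, feedback: str) -> None:
        feedback_log.append(FeedbackEntry(submission_id, feedback))

    def previous_feedback(submission_id: str) -> list:
        return [e for e in feedback_log if e.submission_id == submission_id]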
Over-reliance and Learning Inhibition
Some students feared over-reliance on the AI-Tutor, potentially hindering their learning progress.
Proposed Solutions: Encourage students to use the AI-Tutor without fear, reiterating that the tool is meant to supplement, not replace, traditional learning methods.
API Dependency/Downtime
Downtimes in the API could jeopardize the tutor's functionality (one common mitigation is sketched below).
Proposed Solutions: N/A
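The paper lists no remedy here; a standard engineering mitigation, shown purely as an illustration, is to retry transient failures with exponential backoff and to surface a clear "tutor unavailable" notice once retries are exhausted.

    # Illustrative sketch (not from the paper): retry with exponential backoff
    # so brief API downtime degrades gracefully instead of failing outright.
    import time

    def call_with_retries(make_request, attempts: int = 4):
        for attempt in range(attempts):
            try:
                return make_request()
            except Exception:
                if attempt == attempts - 1:
                    raise  # give up; the UI should show "tutor unavailable"
                time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ...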
Token Limit/Context Limits
The context limitation imposed by the token limit can affect the quality of feedback, especially for more complex or lengthy code submissions.
Proposed Solutions: Explore ways to manage this limitation effectively, for example by trimming long submissions to fit the context window (sketched below).
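One simple way to manage the limit, sketched below under the rough assumption of about four characters per token, is to reserve space for the model's reply and truncate oversized submissions; a production system would use a real tokenizer (e.g., tiktoken) and smarter summarization.

    # Illustrative sketch: keep the prompt within the model's context window.
    # The 4-characters-per-token estimate is a heuristic, not an exact tokenizer.
    MAX_TOKENS = 4096          # gpt-3.5-turbo context size at the time of the study
    RESERVED_FOR_ANSWER = 512  # leave room for the model's reply

    def approx_tokens(text: str) -> int:
        return len(text) // 4

    def fit_to_context(exercise: str, student_code: str) -> str:
        budget = MAX_TOKENS - RESERVED_FOR_ANSWER - approx_tokens(exercise)
        if approx_tokens(student_code) > budget:
            student_code = student_code[: budget * 4]  # naive tail truncation
        return student_code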
Impersonal Nature
The impersonal nature of an AI-Tutor might make the learning experience less engaging and less adaptive to individual student needs.
Proposed Solutions: N/A
Project Team
Eduard Frankford
Researcher
Clemens Sauerwein
Researcher
Patrick Bassner
Researcher
Stephan Krusche
Researcher
Ruth Breu
Researcher
Contact Information
For more information about this project or to discuss potential collaboration opportunities, please contact:
Eduard Frankford
Source Publication: View Original Paper