Skip to main content Skip to navigation

ChatGPT as a Solver and Grader of Programming Exams written in Spanish

Project Overview

The document examines the role of generative AI, particularly ChatGPT, in the context of education, focusing on its application in solving and grading programming exams for a BSc degree in Computer Science conducted in Spanish. The findings reveal that ChatGPT demonstrates reasonable proficiency in addressing simple coding tasks but struggles with more complex programming challenges and accurately evaluating solutions. The research highlights the necessity for further studies by introducing a novel corpus of programming tasks and prompts, indicating a potential pathway for enhancing AI's capabilities in educational settings. Overall, while generative AI shows promise in supporting educational assessments, its current limitations underscore the need for ongoing development and refinement to fully harness its potential in the academic landscape.

Key Applications

ChatGPT as a problem solver and grader for programming exams

Context: Evaluating programming exams from a BSc degree in Computer Science, targeting first-year students.

Implementation: ChatGPT was tasked with solving real exam questions and grading student solutions using OpenAI’s API with two types of prompts.

Outcomes: ChatGPT scored above the passing threshold for the exam but struggled with complex problems and grading accuracy. A new corpus of programming tasks was created.

Challenges: Inaccurate grading and limited effectiveness on complex tasks.

Implementation Barriers

Performance Limitation

ChatGPT struggles with complex programming tasks and has difficulty evaluating solutions accurately.

Proposed Solutions: Future work could explore better prompting strategies and the use of more advanced models like GPT-4.

Language Barrier

The evaluation was limited to Spanish, which may affect the model's performance compared to English.

Proposed Solutions: Further research is needed to assess LLMs in various languages to understand performance discrepancies.

Project Team

Pablo Saborido-Fernández

Researcher

Marcos Fernández-Pichel

Researcher

David E. Losada

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Pablo Saborido-Fernández, Marcos Fernández-Pichel, David E. Losada

Source Publication: View Original PaperLink opens in a new window

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: Openai

Let us know you agree to cookies