Assessing GPT Performance in a Proof-Based University-Level Course Under Blind Grading

Project Overview

The document examines how generative AI, in particular large language models (LLMs) such as GPT-4o and o1-preview, performs in a university algorithms course. The models' solutions to complex, proof-based questions were graded blind. o1-preview outperforms GPT-4o, but both models struggle with logical reasoning and produce unjustified claims and misleading arguments. This points to the need for robust assessment strategies that account for the strengths and weaknesses of AI.

The document also discusses broader applications of generative AI in education, including automated grading systems, personalized learning experiences, and AI-driven tutoring. It notes potential benefits such as increased engagement and more efficient learning, alongside challenges such as algorithmic bias and the need to train educators to incorporate AI into their teaching. Overall, the findings suggest that generative AI has transformative potential in education, but that its implementation and assessment need careful consideration to maximize effectiveness and address its limitations.

Key Applications

Automated assessment and personalized learning

Context: Higher education institutions, online learning platforms, and K-12 settings that aim to improve engagement and learning outcomes, serving both students and educators.

Implementation: Leveraging AI algorithms for automated grading of student submissions, blind grading of AI-generated solutions, and providing personalized feedback and learning paths tailored to individual student needs and learning styles.

Outcomes: Increased efficiency in grading, allowing instructors to focus on personalized support for students. Improved student motivation and academic performance through customized learning experiences. Enhanced understanding of subject material and support for diverse learning needs.

Challenges: Potential bias in grading algorithms and the need for transparency in AI decision-making processes. Data privacy concerns and the need for robust student data management systems. Dependence on technology and the need for human oversight to ensure quality education.
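The blind grading mentioned under Implementation can be illustrated with a minimal sketch. It assumes a pool of solutions from mixed sources that graders should not be able to tell apart; the function names, the `S001`-style IDs, and the source labels are hypothetical and are not taken from the paper.

```python
import random


def blind_submissions(submissions, seed=0):
    """Shuffle submissions and replace source labels with opaque IDs.

    `submissions` is a list of (source, text) pairs (hypothetical format).
    Returns the anonymized list shown to graders and a private key that
    maps each blind ID back to its source.
    """
    rng = random.Random(seed)
    order = list(range(len(submissions)))
    rng.shuffle(order)  # hide any ordering that could reveal the source

    blinded, key = [], {}
    for n, idx in enumerate(order, start=1):
        source, text = submissions[idx]
        blind_id = f"S{n:03d}"
        blinded.append((blind_id, text))
        key[blind_id] = source
    return blinded, key


def unblind_scores(scores, key):
    """Group scores (blind ID -> points) by original source after grading."""
    by_source = {}
    for blind_id, points in scores.items():
        by_source.setdefault(key[blind_id], []).append(points)
    return by_source
```

Graders would see only the `blinded` list; the `key` stays with whoever coordinates the grading and is applied only after all scores are recorded.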

AI-driven tutoring systems

Context: Supplementary educational platforms for K-12 and higher education students needing additional help outside the classroom.

Implementation: Integrating AI tutors that adapt to student learning styles and pace, providing personalized support to enhance subject understanding.

Outcomes: Enhanced understanding of subject material and support for diverse learning needs.

Challenges: Dependence on technology and the need for human oversight to ensure quality education.

Implementation Barriers

Technical Barrier

AI models frequently produce unjustified claims and misleading arguments in their solutions. Additionally, bias in AI algorithms can lead to unfair assessment and learning experiences for students.

Proposed Solutions: Implement grading rubrics that penalize unjustified reasoning and reward transparent logical argumentation. Audit AI systems regularly for bias, and improve them continuously by training on diverse data sets.
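One way such a rubric might work is sketched below. This is only an illustration: the step format, the point values, and the penalty size are assumptions, not the rubric used in the course.

```python
def score_solution(steps, points_per_step=2, unjustified_penalty=1):
    """Score a proof as a list of steps (hypothetical format).

    Each step is a dict with boolean fields 'correct' and 'justified'.
    A correct step earns `points_per_step`; a correct claim stated without
    justification loses `unjustified_penalty`. Incorrect steps earn nothing.
    """
    total = 0
    for step in steps:
        if step["correct"]:
            total += points_per_step
            if not step["justified"]:
                total -= unjustified_penalty
    return max(total, 0)  # never report a negative score


example = [
    {"correct": True, "justified": True},    # full credit
    {"correct": True, "justified": False},   # right claim, no argument
    {"correct": False, "justified": False},  # wrong claim
]
print(score_solution(example))
```

Under these assumed values, a solution that reaches the right claims without arguing for them scores clearly below one that justifies each step.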

Educational Policy Barrier

Current assessment strategies may not be effective in distinguishing between human and AI-generated work.

Proposed Solutions: Develop new assessment methods that adapt to the capabilities of AI while retaining academic integrity.

Resource Barrier

Insufficient training for educators on how to effectively use AI tools in the classroom.

Proposed Solutions: Establish comprehensive professional development programs that focus on AI integration in teaching.

Ethical Barrier

Concerns over data privacy and security in using AI systems that collect student data.

Proposed Solutions: Implement strict data protection policies and transparent data usage guidelines.

Project Team

Ming Ding

Researcher

Rasmus Kyng

Researcher

Federico Solda

Researcher

Weixuan Yuan

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Ming Ding, Rasmus Kyng, Federico Solda, Weixuan Yuan

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
