
AI and the FCI: Can ChatGPT Project an Understanding of Introductory Physics?

Project Overview

The document examines the role of generative AI, specifically ChatGPT, in introductory physics education by assessing its grasp of fundamental concepts such as kinematics and Newtonian dynamics through the Force Concept Inventory (FCI). It compares the performance of two versions of ChatGPT, 3.5 and 4: ChatGPT 3.5 performs at a level akin to a B-level university student, while ChatGPT 4 shows substantial improvement, often providing expert-level responses. The evaluation underscores the potential of generative AI as a powerful educational tool while raising important considerations about educational integrity and assessment practices. The findings suggest that integrating advanced AI into educational settings could enrich learning experiences, foster deeper understanding of complex subjects, and transform traditional teaching methods.

Key Applications

Using ChatGPT as an assessment and response simulation tool

Context: Students in first-semester university physics courses and instructors preparing teaching materials, with ChatGPT used both to assess understanding and to simulate novice student responses.

Implementation: ChatGPT 3.5 and ChatGPT 4 were administered a modified Force Concept Inventory (FCI) and prompted to respond as novice students, in order to predict common misconceptions and reveal insights into student understanding and typical errors.

Outcomes: ChatGPT 3.5 scored 65% and ChatGPT 4 scored 95% on the FCI, showing varying levels of understanding. While ChatGPT 3.5 matched novice student responses on some items, it struggled with spatial reasoning and some conceptual questions. ChatGPT 4 performed significantly better on assessments but often provided expert-level responses rather than reflecting true novice reasoning.

Challenges: ChatGPT struggled to accurately represent the reasoning of true novices, often providing expert-level responses instead. Additionally, the models faced challenges with spatial reasoning and some specific conceptual questions.
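The administration procedure described above can be sketched in Python. This is a minimal illustration, not the authors' actual code: the question text is a placeholder (real FCI items are copyrighted and not reproduced here), and all function and prompt names are hypothetical. The final comment assumes the OpenAI chat completions API as the delivery mechanism.

```python
# Sketch of administering an FCI-style multiple-choice item to a chat model
# under two personas (expert vs. simulated novice) and scoring the replies.
# All names and the example item are illustrative placeholders.

EXPERT_PROMPT = "Answer the following physics question. Reply with a single letter."
NOVICE_PROMPT = (
    "Answer the following physics question as a first-semester student who "
    "has not yet studied Newtonian mechanics. Reply with a single letter."
)

def format_item(stem: str, choices: dict[str, str], persona_prompt: str) -> str:
    """Combine a persona instruction, a question stem, and lettered choices."""
    options = "\n".join(f"{letter}. {text}" for letter, text in sorted(choices.items()))
    return f"{persona_prompt}\n\n{stem}\n{options}"

def score(responses: dict[int, str], answer_key: dict[int, str]) -> float:
    """Fraction of items where the model's reply starts with the keyed letter."""
    correct = sum(
        1 for i, reply in responses.items()
        if reply.strip().upper().startswith(answer_key[i])
    )
    return correct / len(answer_key)

# Placeholder item (not from the FCI):
item = format_item(
    "A ball is thrown straight up. At the top of its flight, the net force on it is:",
    {"A": "zero", "B": "directed downward", "C": "directed upward"},
    EXPERT_PROMPT,
)
# The prompt string `item` would then be sent to the model, e.g.:
# client.chat.completions.create(model="gpt-4",
#                                messages=[{"role": "user", "content": item}])
```

Keeping prompt construction and scoring separate from the API call makes the scoring logic testable offline and lets the same item set be re-run under either persona by swapping the prompt constant.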

Implementation Barriers

Technological limitations

ChatGPT's inability to process visual information limits its performance on questions requiring spatial reasoning.

Proposed Solutions: Future AI models may need to incorporate multimodal capabilities to address visual reasoning.

Educational integrity concerns

The potential for students to misuse AI tools like ChatGPT in assessments raises concerns about cheating.

Proposed Solutions: Educators may need to adapt assessment strategies to account for AI use, such as designing questions that require deeper understanding or application.

Project Team

Colin G. West

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Colin G. West

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
