GPT-4's assessment of its performance in a USMLE-based case study

Project Overview

The paper examines the use of generative AI, specifically GPT-4, in healthcare education by analyzing its performance on USMLE questions. It investigates how feedback affects the model's confidence and accuracy when answering these questions. The findings show that feedback can alter the model's confidence levels but does not reliably improve its accuracy. This underscores the need for a deeper understanding of large language models (LLMs) in high-stakes domains such as healthcare, where the interplay between AI confidence and user trust is crucial for effective decision-making. Overall, the study highlights the potential of generative AI in educational settings, the complexity of integrating it, and the importance of fostering warranted trust in AI-generated responses.

Key Applications

Assessment of GPT-4 using USMLE questions

Context: Medical education, targeting medical students and educators

Implementation: GPT-4 was prompted with USMLE questions, and its confidence and accuracy were evaluated with and without feedback.

Outcomes: The model achieved 88% accuracy with feedback and 92% without feedback. Feedback influenced confidence levels but did not necessarily improve accuracy.

Challenges: The model exhibited variability in confidence levels, particularly when feedback was provided, leading to potential overconfidence or miscalibration.
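
The comparison of accuracy and confidence across the two conditions can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the record layout (a correctness flag paired with a confidence on a 0 to 1 scale), the function name `summarize`, and the sample values are all assumptions made for the example.

```python
from statistics import mean


def summarize(records):
    """Return accuracy, mean confidence, and their gap for one condition.

    Each record is a hypothetical (correct, confidence) pair, where
    `correct` is a bool and `confidence` is a float in [0, 1].
    A positive gap means the model is, on average, more confident
    than it is accurate (overconfidence); a negative gap means the opposite.
    """
    accuracy = mean(1.0 if correct else 0.0 for correct, _ in records)
    mean_confidence = mean(confidence for _, confidence in records)
    return {
        "accuracy": accuracy,
        "mean_confidence": mean_confidence,
        "gap": mean_confidence - accuracy,
    }


# Made-up illustrative data, NOT results from the paper.
with_feedback = [(True, 0.9), (False, 0.8), (True, 0.7), (True, 1.0)]
without_feedback = [(True, 0.9), (True, 0.8), (True, 0.9), (False, 0.6)]

for label, records in [("with feedback", with_feedback),
                       ("without feedback", without_feedback)]:
    print(label, summarize(records))
```

Reporting the gap alongside accuracy makes it visible when a change in confidence is not matched by a change in correctness, which is the pattern the Outcomes and Challenges entries describe.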

Implementation Barriers

Ethical

Concerns regarding informed consent, algorithmic fairness, and data privacy in AI applications in healthcare.

Proposed Solutions: The document suggests developing ethical frameworks and best practices to guide the integration of AI in clinical decision-making.

Technical

Variability in model confidence based on feedback, which may lead to overconfidence or underconfidence.

Proposed Solutions: Future research should explore better calibration methods and feedback mechanisms to enhance AI effectiveness.
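
One common way to quantify miscalibration is the expected calibration error (ECE), which bins answers by stated confidence and compares each bin's average confidence with its accuracy. The summary does not say which calibration measure, if any, the paper uses, so the sketch below is only an assumed illustration; the function name, the equal-width binning, and the default of 10 bins are choices made for the example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Compute ECE with equal-width confidence bins.

    `confidences` are floats in [0, 1]; `correct` are bools of the same length.
    Bins are [k/n_bins, (k+1)/n_bins), with a confidence of 1.0 placed in the
    last bin. This layout is an assumption for illustration.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one answer is required")

    bins = [[] for _ in range(n_bins)]
    for confidence, is_correct in zip(confidences, correct):
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, is_correct))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        bin_confidence = sum(c for c, _ in members) / len(members)
        bin_accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's |confidence - accuracy| by its share of answers.
        ece += (len(members) / total) * abs(bin_confidence - bin_accuracy)
    return ece
```

A value of 0 means stated confidence matches observed accuracy in every bin; larger values indicate overconfidence, underconfidence, or a mix of both.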

Project Team

Uttam Dhakal

Researcher

Aniket Kumar Singh

Researcher

Suman Devkota

Researcher

Yogesh Sapkota

Researcher

Bishal Lamichhane

Researcher

Suprinsa Paudyal

Researcher

Chandra Dhakal

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Uttam Dhakal, Aniket Kumar Singh, Suman Devkota, Yogesh Sapkota, Bishal Lamichhane, Suprinsa Paudyal, Chandra Dhakal

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
