
Exploring the Effectiveness of GPT Models in Test-Taking: A Case Study of the Driver's License Knowledge Test

Project Overview

This project examines how generative AI, specifically GPT models, can support educational outcomes, focusing on test-taking through a case study based on the California Driver’s Handbook. Providing relevant contextual information substantially improves answer accuracy: the GPT-3 model scored 96% on the test when given relevant context, compared with 82% without it. The results also show that context length and format affect performance, and that GPT models remain prone to hallucinating or misinterpreting information. Together, these findings point to the potential of generative AI in educational settings and to the need for further work on its effectiveness and reliability.

Key Applications

GPT-3 model for question-answering

Context: Assessing knowledge for the California Driver’s License test, targeting learners preparing for the driving knowledge test.

Implementation: The methodology involved preprocessing contextual information, embedding queries and contexts, and generating answers using the GPT-3 model with varying prompt lengths and formats.

Outcomes: With context, the model achieved a passing score of 96%, up from 82% without context, a gain of 14 percentage points.

Challenges: The model struggled with certain questions, indicating limitations in context understanding and sensitivity to text formatting.
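
The implementation steps above (embed the query and the contexts, pick the most relevant context, then build a prompt for the model) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: `embed` is a toy bag-of-words stand-in for whatever embedding model was actually used, the function names are hypothetical, and the passages are invented placeholders rather than quotations from the handbook. No call to GPT-3 is made; the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_contexts(question, passages, k=1):
    """Return the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, choices, contexts):
    """Place the retrieved context first, then the multiple-choice question."""
    lines = ["Context:"]
    lines += contexts
    lines += ["", "Question: " + question]
    lines += [f"{label}. {text}" for label, text in choices]
    lines += ["Answer:"]
    return "\n".join(lines)


# Invented placeholder passages (not taken from the handbook).
passages = [
    "A red traffic signal means stop before the crosswalk.",
    "Turn on headlights when driving at night or in poor visibility.",
    "Seat belts are required for the driver and all passengers.",
]
question = "What should you do at a red traffic signal?"
choices = [("A", "Stop"), ("B", "Speed up"), ("C", "Sound the horn")]

selected = top_contexts(question, passages, k=1)
prompt = build_prompt(question, choices, selected)
print(prompt)
```

Varying `k` changes how much context reaches the prompt, which is one simple way to experiment with the prompt lengths mentioned above.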

Implementation Barriers

Technical Limitation

The model's performance is influenced by its reliance on the context provided, and it can hallucinate incorrect answers. Its effectiveness is also limited by text formatting issues.

Proposed Solutions: Implementing strategies to optimize context integration and address text formatting issues to enhance accuracy.
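
As one illustration of addressing text formatting issues, context text could be normalized before it is placed in a prompt. The sketch below is an assumption about what such cleanup might look like, not a step described in the paper: the function name `normalize_context` and the specific rules (Unicode folding, rejoining words split across line breaks, collapsing whitespace) are hypothetical choices.

```python
import re
import unicodedata


def normalize_context(raw):
    """Clean extracted text before prompting (illustrative rules only)."""
    # Fold compatibility characters such as ligatures and non-breaking spaces.
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across a line break: "inter-\nsection" -> "intersection".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat blank lines as paragraph breaks and single newlines as soft wraps.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


raw = "Stop at the inter-\nsection   when the\nlight is red.\n\n\nYield to   pedestrians."
clean = normalize_context(raw)
print(clean)
```

The rejoining rule is deliberately simple and would also merge a genuine hyphenated compound that happens to break at a line end, so it would need checking against the actual source text.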

Data Dependency

The effectiveness of the model is limited by the quality and relevance of the contextual information provided.

Proposed Solutions: Curating high-quality, structured contextual data specifically tailored to the questions being asked.

Project Team

Saba Rahimi

Researcher

Tucker Balch

Researcher

Manuela Veloso

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Saba Rahimi, Tucker Balch, Manuela Veloso

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
