
AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy

Project Overview

The document examines the role of generative AI, particularly Large Language Models (LLMs), in enhancing educational outcomes through improved human judgment forecasting. It evaluates two LLM assistants: one prompted to give accurate, superforecasting-style advice and another that produces less reliable, biased predictions. Findings reveal that both assistants significantly improve prediction accuracy relative to a control group given a less capable baseline model, with the superforecasting assistant performing best. This underscores the potential of LLMs to augment human decision-making in complex educational tasks and suggests that hybrid human-AI collaboration can lead to better outcomes. The study highlights the transformative capabilities of generative AI in education, emphasizing its applications in improving predictive accuracy and supporting informed decision-making.

Key Applications

LLM Assistants for Human Judgment Forecasting

Context: Participants in a forecasting study answered questions spanning domains such as finance, geopolitics, and technology.

Implementation: Participants were assigned to answer forecasting questions with the help of one of two LLM assistants: a superforecasting-prompted assistant or a noisy, biased one.

Outcomes: Participants using LLM assistants showed a 24-28% improvement in prediction accuracy compared to the control group.

Challenges: The noisy assistant's performance raised concerns that outlier forecasts could degrade the accuracy of aggregated predictions (an illustrative sketch of this sensitivity follows below).
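Forecasting studies in this tradition typically score probabilistic predictions with the Brier score, and aggregate accuracy can be sensitive to how individual forecasts are combined. The sketch below is purely illustrative and is not taken from the paper: the forecast values, outcomes, and group labels are hypothetical. It only shows how a Brier-score comparison between groups and a mean-versus-median aggregation might be computed.

```python
# Illustrative sketch only: a Brier-score comparison between two groups and
# a simple aggregation example. All values are hypothetical, not study data.
import statistics


def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 resolved outcome."""
    return (forecast - outcome) ** 2


# Hypothetical (probability forecast, resolved outcome) pairs per group.
control_forecasts = [(0.70, 1), (0.40, 0), (0.55, 1)]
treatment_forecasts = [(0.75, 1), (0.35, 0), (0.60, 1)]

control_score = statistics.mean(brier_score(p, y) for p, y in control_forecasts)
treatment_score = statistics.mean(brier_score(p, y) for p, y in treatment_forecasts)

# Lower Brier scores are better; relative improvement is one way to express
# a percentage gain like those reported above.
improvement = (control_score - treatment_score) / control_score
print(f"Relative improvement: {improvement:.0%}")

# Aggregation: one extreme ("noisy") forecast moves the mean far more than
# the median, which is why outliers can distort aggregate accuracy.
individual_forecasts = [0.62, 0.58, 0.65, 0.05]  # last value is an outlier
print("mean:  ", statistics.mean(individual_forecasts))
print("median:", statistics.median(individual_forecasts))
```

Because the Brier score is a squared error on probabilities, a single extreme forecast shifts a mean aggregate far more than a median one, which is the concern noted above about the noisy assistant.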

Implementation Barriers

Implementation Barrier

The effectiveness of LLMs may vary based on participant familiarity with forecasting tasks, leading to inconsistent outcomes.

Proposed Solution: Future research should explore tailored training for users to maximize the benefits of LLM augmentation.

Implementation Barrier

Current LLMs may not provide consistent accuracy across various forecasting scenarios, especially in complex tasks.

Proposed Solution: Continued improvements in LLM technology and training data diversity may enhance accuracy and reliability.

Project Team

Philipp Schoenegger

Researcher

Peter S. Park

Researcher

Ezra Karger

Researcher

Sean Trott

Researcher

Philip E. Tetlock

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Philipp Schoenegger, Peter S. Park, Ezra Karger, Sean Trott, Philip E. Tetlock

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
