
Human-AI Learning Performance in Multi-Armed Bandits

Project Overview

This project examines the role of generative AI in education through human-agent collaboration on decision-making tasks, using a multi-armed bandit framework. AI agents can improve human learning by offering tailored suggestions, particularly in collaborative settings where both the human and the agent are still adapting their strategies. The findings indicate that human-agent teams often outperform either the human or the agent working alone. Achieving this depends on aligning the agent's strategy with the human's exploration tendencies, not only on how well the agent performs by itself.

Key Applications

Multi-armed bandit problem with human-agent collaboration

Context: Educational settings where humans are learning decision-making tasks with AI assistance; target audience includes students or participants in learning scenarios.

Implementation: Participants engaged in a user study where they played a game with multiple slot machines, receiving suggestions from different AI agents. The study analyzed how these suggestions impacted their decision-making.

Outcomes: Human-agent teams often performed better than individuals alone, with the right agent enhancing performance significantly. Teams could outperform the best individual performer.

Challenges: Agent performance did not correlate with team performance in a straightforward manner; a drop in agent performance could sometimes improve team performance, indicating the complexity of the interaction.
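
The game setup described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the study's implementation: the source does not say which agents or human model were used, so the UCB1 suggester, the simulated human who follows a suggestion with probability follow_prob and otherwise acts epsilon-greedily, and all names (BernoulliBandit, UCB1Agent, simulated_human_choice, run_episode) are hypothetical.

```python
import math
import random


class BernoulliBandit:
    """Slot machines that each pay 1 with a fixed, hidden probability."""

    def __init__(self, probs, rng):
        self.probs = probs
        self.rng = rng

    def pull(self, arm):
        return 1 if self.rng.random() < self.probs[arm] else 0


class UCB1Agent:
    """Assumed suggester: recommends the arm with the highest UCB1 score."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.totals = [0.0] * n_arms

    def suggest(self):
        # Recommend every arm once before using the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        t = sum(self.counts)
        scores = [
            self.totals[a] / self.counts[a]
            + math.sqrt(2 * math.log(t) / self.counts[a])
            for a in range(len(self.counts))
        ]
        return scores.index(max(scores))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward


def simulated_human_choice(suggestion, counts, totals, follow_prob, epsilon, rng):
    """Hypothetical human: sometimes follows the suggestion, else epsilon-greedy."""
    if rng.random() < follow_prob:
        return suggestion
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    means = [totals[a] / counts[a] if counts[a] else 0.0 for a in range(len(counts))]
    return means.index(max(means))


def run_episode(probs, horizon, follow_prob, epsilon=0.1, seed=0):
    """Play one game and return the team's total reward."""
    rng = random.Random(seed)
    bandit = BernoulliBandit(probs, rng)
    agent = UCB1Agent(len(probs))
    counts = [0] * len(probs)   # the simulated human's own pull counts
    totals = [0.0] * len(probs)  # the simulated human's own reward totals
    total_reward = 0
    for _ in range(horizon):
        suggestion = agent.suggest()
        arm = simulated_human_choice(suggestion, counts, totals, follow_prob, epsilon, rng)
        reward = bandit.pull(arm)
        # Both the agent and the human observe the outcome of the pulled arm.
        agent.update(arm, reward)
        counts[arm] += 1
        totals[arm] += reward
        total_reward += reward
    return total_reward
```

Calling run_episode([0.2, 0.5, 0.8], horizon=200, follow_prob=0.5) plays one 200-round game with three machines. Sweeping follow_prob between 0 and 1, or swapping in a different suggester, is one way to probe how team reward relates to the agent's standalone quality in this toy setting. Any such result describes the simulation only, not the user study.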

Implementation Barriers

Conceptual Barrier

The assumption that better agent performance in isolation directly leads to better human-agent team performance is flawed.

Proposed Solutions: Future work should explore models of human internal states to optimize agent suggestions and enhance collaborative learning.

Project Team

Ravi Pandya

Researcher

Sandy H. Huang

Researcher

Dylan Hadfield-Menell

Researcher

Anca D. Dragan

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
