Leveraging LLMs to Assess Tutor Moves in Real-Life Dialogues: A Feasibility Study

Project Overview

This study examines whether generative AI, specifically large language models (LLMs), can evaluate tutoring practice in real-world educational settings. It analyzes transcripts of sessions in which college students tutor middle schoolers in mathematics. The findings indicate that LLMs can accurately identify two tutoring moves, offering praise and responding to student errors, and can judge whether tutors follow best practices when making them. The study also identifies limitations of LLMs in educational contexts, including their black-box nature, potential hallucinations, and difficulty with nuanced evaluation of tutor feedback. It concludes that LLMs are a promising tool for assessing tutoring effectiveness, provided they are implemented with care.

Key Applications

Using large language models (LLMs) like GPT-4, GPT-4o, GPT-4-turbo, Gemini-1.5-pro, and LearnLM to assess tutor moves

Context: Remote math tutoring sessions between college students and middle school students

Implementation: Analyzed 50 audio transcriptions of tutoring sessions to assess tutor effectiveness in providing praise and responding to errors using various LLMs.

Outcomes: Models achieved high accuracy in detecting tutoring moves (e.g., 94-98% accuracy for praise) and evaluating adherence to best practices (e.g., 83-89% accuracy).

Challenges: LLMs are black boxes, can hallucinate, and struggle with nuanced evaluation of tutor feedback.
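The accuracy figures under Outcomes are presumably agreement rates between model labels and human labels. The sketch below shows one way such a rate could be computed. The function name and the toy labels are illustrative assumptions, not data or code from the study.

```python
from typing import Sequence


def accuracy(predicted: Sequence[bool], reference: Sequence[bool]) -> float:
    """Return the fraction of items where the model label matches the human label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty label set")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Toy labels, invented for illustration: True means "tutor gave praise".
human_labels = [True, True, False, True, False]
model_labels = [True, True, False, False, False]

print(accuracy(model_labels, human_labels))
```

Here the model disagrees with the human coder on one of five transcripts, so the function reports the share of matching labels.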

Implementation Barriers

Technical

LLMs exhibit black-box behavior, raising concerns about transparency and reliability.

Proposed Solutions: Implementing prompt engineering techniques and using self-consistency methods to improve accuracy and reliability.
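The summary does not say how self-consistency was applied. A common form is to sample the model several times on the same input and keep the majority label. The sketch below illustrates that idea under stated assumptions: `classify` is a hypothetical stand-in for an LLM call, and the stub classifier replays fixed outputs so the example runs without any API.

```python
import itertools
from collections import Counter
from typing import Callable, Iterable


def self_consistent_label(
    classify: Callable[[str], str], utterance: str, n_samples: int = 5
) -> str:
    """Call the classifier n_samples times and return the most frequent label."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    votes = Counter(classify(utterance) for _ in range(n_samples))
    label, _count = votes.most_common(1)[0]
    return label


def make_stub_classifier(outputs: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays the given outputs in a cycle."""
    replay = itertools.cycle(outputs)

    def classify(utterance: str) -> str:
        return next(replay)

    return classify


# The stub disagrees with itself once in every three calls.
stub = make_stub_classifier(["praise", "no_praise", "praise"])
print(self_consistent_label(stub, "Great job working through that!", n_samples=3))
```

An odd `n_samples` avoids ties for a two-label task. With `Counter.most_common`, a tie resolves to the label encountered first.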

Ethical

Using AI in education raises ethical concerns, including the potential for biased evaluations.

Proposed Solutions: Addressing ethical implications through careful design and implementation of AI systems in educational contexts.

Project Team

Danielle R. Thomas

Researcher

Conrad Borchers

Researcher

Jionghao Lin

Researcher

Sanjit Kakarla

Researcher

Shambhavi Bhushan

Researcher

Erin Gatz

Researcher

Shivang Gupta

Researcher

Ralph Abboud

Researcher

Kenneth R. Koedinger

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Danielle R. Thomas, Conrad Borchers, Jionghao Lin, Sanjit Kakarla, Shambhavi Bhushan, Erin Gatz, Shivang Gupta, Ralph Abboud, Kenneth R. Koedinger

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
