Leveraging LLMs to Assess Tutor Moves in Real-Life Dialogues: A Feasibility Study
Project Overview
This feasibility study examines whether generative AI, specifically large language models (LLMs), can evaluate tutoring practices in real-world educational settings. It analyzes transcripts of sessions in which college students tutor middle schoolers in mathematics. The findings indicate that LLMs can identify effective tutoring strategies, such as offering praise and responding appropriately to student errors, with high accuracy, which suggests they could support the assessment of tutoring effectiveness. The study also addresses the challenges and limitations of using LLMs in educational contexts: the technology is promising, but its implementation needs careful consideration. Overall, the approach has implications for personalizing and improving tutoring practice.
Key Applications
Using large language models (LLMs) like GPT-4, GPT-4o, GPT-4-turbo, Gemini-1.5-pro, and LearnLM to assess tutor moves
Context: Remote math tutoring sessions between college students and middle school students
Implementation: Analyzed 50 audio transcriptions of tutoring sessions to assess tutor effectiveness in providing praise and responding to errors using various LLMs.
Outcomes: Models achieved high accuracy in detecting tutoring moves (e.g., 94-98% accuracy for praise) and evaluating adherence to best practices (e.g., 83-89% accuracy).
Challenges: LLMs have limitations, including their black-box nature, possible hallucinations, and difficulty evaluating the nuances of tutor feedback.
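The accuracy figures reported under Outcomes compare model judgments with reference labels. As a minimal sketch of how such a figure could be computed, assuming one human-assigned label per tutor utterance, the snippet below counts agreements. The function name, the label strings, and the sample data are hypothetical illustrations and are not taken from the study.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Return the fraction of predicted labels that match the reference labels."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one label is required")
    # Count positions where the model label agrees with the human label.
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical labels for four tutor utterances (illustrative only).
llm_labels = ["praise", "no_praise", "praise", "no_praise"]
human_labels = ["praise", "no_praise", "no_praise", "no_praise"]

print(f"Agreement with human labels: {accuracy(llm_labels, human_labels):.0%}")
```

Plain accuracy is only one possible agreement measure. When one label is much more common than the other, a chance-corrected statistic may be more informative.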
Implementation Barriers
Technical
LLMs exhibit black-box behavior, raising concerns about transparency and reliability.
Proposed Solutions: Implementing prompt engineering techniques and using self-consistency methods to improve accuracy and reliability.
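Self-consistency is commonly implemented by querying the model several times for the same input and keeping the most frequent answer. The sketch below illustrates that general idea only. It is not the authors' implementation: the `classify` callable stands in for any LLM call, and the stub classifier, label names, and sample count are assumptions made for illustration.

```python
import itertools
from collections import Counter
from typing import Callable, Iterable


def self_consistent_label(
    classify: Callable[[str], str], utterance: str, n_samples: int = 5
) -> str:
    """Query a (possibly non-deterministic) classifier n_samples times
    and return the label it produced most often."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    votes = Counter(classify(utterance) for _ in range(n_samples))
    label, _count = votes.most_common(1)[0]
    return label


def make_stub_classifier(responses: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM call: cycles through canned labels."""
    cycle = itertools.cycle(responses)
    return lambda utterance: next(cycle)


# The stub disagrees with itself on every third call; the majority vote smooths that out.
stub = make_stub_classifier(["praise", "praise", "no_praise"])
print(self_consistent_label(stub, "Great job working through that problem!"))
```

In practice the stub would be replaced by a function that sends the prompt to a model with sampling enabled. More samples reduce the effect of occasional inconsistent outputs at the cost of extra model calls.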
Ethical
Ethical considerations regarding the use of AI in education and the potential for biased evaluations.
Proposed Solutions: Addressing ethical implications through careful design and implementation of AI systems in educational contexts.
Project Team
Danielle R. Thomas
Researcher
Conrad Borchers
Researcher
Jionghao Lin
Researcher
Sanjit Kakarla
Researcher
Shambhavi Bhushan
Researcher
Erin Gatz
Researcher
Shivang Gupta
Researcher
Ralph Abboud
Researcher
Kenneth R. Koedinger
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Danielle R. Thomas, Conrad Borchers, Jionghao Lin, Sanjit Kakarla, Shambhavi Bhushan, Erin Gatz, Shivang Gupta, Ralph Abboud, Kenneth R. Koedinger
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI