
A Large-Scale Real-World Evaluation of LLM-Based Virtual Teaching Assistant

Project Overview

This project presents the deployment and evaluation of a Large Language Model (LLM)-based Virtual Teaching Assistant (VTA) in a graduate-level introductory AI programming course with 477 students, focusing on its impact on student learning and engagement. The VTA provided instant feedback and facilitated course-related question-asking; over the semester, student perceptions of its helpfulness, trustworthiness, and comfort improved, especially among students who were initially reluctant to approach human instructors. The study also identifies challenges: the VTA was perceived as less reliable and less deep in its support than human instructors, and it occasionally suffered from slow response times and inaccurate answers. Overall, the findings highlight the potential of generative AI to enhance educational experiences while underscoring the need for ongoing improvements to address these limitations.

Key Applications

Large Language Model-based Virtual Teaching Assistant (VTA)

Context: Graduate-level introductory AI programming course

Implementation: The VTA was integrated into the course to provide instant feedback and answer student inquiries through a web interface. It utilized LLMs to generate contextually relevant responses based on course materials.
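The retrieve-then-generate loop described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the document names (`COURSE_DOCS`), the word-overlap retriever, and the prompt template are all assumptions standing in for the course's actual materials and retrieval pipeline.

```python
# Hypothetical sketch of a VTA answering loop: retrieve relevant course
# material, then ground the LLM prompt in it. Names are illustrative.

COURSE_DOCS = [
    "Backpropagation computes gradients layer by layer using the chain rule.",
    "A perceptron is a linear classifier updated with the perceptron rule.",
    "Assignment 2 asks you to implement k-means clustering in NumPy.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by simple word overlap with the question (toy retriever)."""
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(question: str, docs: list[str]) -> str:
    """Assemble the context-grounded prompt that would be sent to the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs))
    return f"Answer using only the course material below.\n{context}\nQ: {question}"

prompt = build_prompt("How does backpropagation work?", COURSE_DOCS)
```

In a real deployment the word-overlap retriever would be replaced by the system's actual search index, and `prompt` would be sent to the LLM behind the web interface.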

Outcomes: Improved student engagement, greater comfort in asking questions, and gains over time in the VTA's perceived helpfulness and trustworthiness, particularly among students initially reluctant to approach human instructors.

Challenges: Limitations in reliability and depth of support compared to human instructors, slow response times, and occasional inaccuracies in answers.

Implementation Barriers

Technological

Students perceived response times as slow, which may relate to the lack of output streaming and the way complete responses were delivered all at once.

Proposed Solutions: Incorporating streaming functionality to provide partial responses as they are generated to improve user experience.
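The streaming idea can be illustrated with a minimal sketch. This is not the course's actual stack: `fake_llm_stream` is a stand-in for a streaming LLM API, and the "client" is just string assembly; the point is that each chunk can be forwarded the moment it arrives rather than after the full answer is ready.

```python
from typing import Iterator

def fake_llm_stream(answer: str, chunk_size: int = 8) -> Iterator[str]:
    """Stand-in for a streaming LLM API: yields the answer in small chunks."""
    for i in range(0, len(answer), chunk_size):
        yield answer[i : i + chunk_size]

def stream_to_client(chunks: Iterator[str]) -> str:
    """Forward each chunk as it arrives; return the assembled full text."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)  # in a web app: write to an SSE/WebSocket here
    return "".join(parts)

full = stream_to_client(fake_llm_stream("Gradient descent minimizes the loss."))
```

With this pattern, the first characters reach the student after the first chunk is generated, so perceived latency drops even though total generation time is unchanged.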

Perceptual

Students found the VTA less reliable than human instructors, impacting their trust in the system.

Proposed Solutions: Improving the underlying LLM architecture and enhancing instruction-following capabilities to boost trustworthiness.

Content-related

Difficulty in retrieving course-related content and issues with hallucinated or incorrect answers.

Proposed Solutions: Adopting hybrid retrieval strategies and expanding the document candidate pool to improve content retrieval accuracy.
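One common hybrid retrieval strategy of the kind proposed above combines a lexical ranking with a vector ranking via reciprocal rank fusion (RRF). The sketch below uses toy scorers (word overlap for lexical, character-bigram Jaccard as a stand-in for dense embeddings); these are illustrative assumptions, not the paper's method.

```python
# Hybrid retrieval sketch: fuse a lexical ranking and a (toy) vector ranking
# with reciprocal rank fusion. Scoring functions are simplified stand-ins.

def lexical_rank(query: str, docs: list[str]) -> list[int]:
    """Rank doc indices by word overlap with the query (BM25 stand-in)."""
    q = set(query.lower().split())
    return sorted(range(len(docs)),
                  key=lambda i: -len(q & set(docs[i].lower().split())))

def bigrams(text: str) -> set[str]:
    t = text.lower()
    return {t[i:i + 2] for i in range(len(t) - 1)}

def vector_rank(query: str, docs: list[str]) -> list[int]:
    """Rank doc indices by bigram Jaccard similarity (embedding stand-in)."""
    q = bigrams(query)
    return sorted(range(len(docs)),
                  key=lambda i: -len(q & bigrams(docs[i])) / (len(q | bigrams(docs[i])) or 1))

def rrf(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Reciprocal rank fusion: score(d) = sum over rankings of 1/(k + rank)."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

docs = ["Lecture 3 covers convolutional neural networks.",
        "The final project report is due in week 14.",
        "CNNs apply learned filters over image patches."]
order = rrf([lexical_rank("convolutional neural networks", docs),
             vector_rank("convolutional neural networks", docs)])
```

Fusing the two rankings lets exact-term matches (course jargon, assignment names) and semantic matches each contribute, which is why hybrid schemes tend to retrieve course content more reliably than either signal alone.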

Project Team

Sunjun Kweon

Researcher

Sooyohn Nam

Researcher

Hyunseung Lim

Researcher

Hwajung Hong

Researcher

Edward Choi

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Sunjun Kweon, Sooyohn Nam, Hyunseung Lim, Hwajung Hong, Edward Choi

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
