
Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching

Project Overview

This project examines the role of generative AI in education, applying natural language processing (NLP) and large language models (LLMs) to the analysis of student evaluations of teaching (SETs). It presents a workflow named EECS (Extract, Embed, Cluster, Summarize) that automates the qualitative analysis process, producing a thematic codebook from large collections of SET comments. The approach improves the efficiency of qualitative research and offers deeper insight into teaching effectiveness while reducing the time and labor such analyses traditionally require. The findings suggest that the methodology not only streamlines evaluation in educational contexts but also holds promise for other types of qualitative data, positioning generative AI as a valuable tool in educational research.

Key Applications

EECS workflow for analyzing student evaluations of teaching

Context: University setting, focusing on large-enrollment courses and administrative records.

Implementation: Automated process using NLP and LLMs to extract, embed, cluster, and summarize qualitative feedback from SETs (see the sketch after this list).

Outcomes: Generated a thematic codebook that mirrors traditional qualitative analysis, enabling efficient data processing and a deeper understanding of student feedback.

Challenges: Ensuring the accuracy of generated themes, handling ambiguities in qualitative data, scaling the process for larger datasets, and maintaining researcher oversight.
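The page summarizes the paper without code, but the four EECS stages map naturally onto standard Python tooling. The sketch below is a minimal illustration, not the authors' implementation: it assumes OpenAI's Python client for embeddings and summaries and scikit-learn's KMeans for clustering, and the embedding model, prompt wording, and cluster count are placeholder choices (only gpt-4o-mini-2024-07-18 comes from this page's metadata).

```python
"""Minimal EECS (Extract, Embed, Cluster, Summarize) sketch.

Assumptions (not from the paper): OpenAI's Python client for embeddings and
summaries, scikit-learn KMeans for clustering, and a fixed cluster count.
"""
from collections import defaultdict

import numpy as np
from openai import OpenAI
from sklearn.cluster import KMeans

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract(raw_records: list[dict]) -> list[str]:
    """Extract: pull the free-text comment field out of each SET record."""
    return [r["comment"].strip() for r in raw_records if r.get("comment")]


def embed(comments: list[str]) -> np.ndarray:
    """Embed: turn each comment into a dense vector."""
    resp = client.embeddings.create(
        model="text-embedding-3-small",  # placeholder embedding model
        input=comments,
    )
    return np.array([item.embedding for item in resp.data])


def cluster(vectors: np.ndarray, k: int = 8) -> np.ndarray:
    """Cluster: group similar comments; k is a tunable placeholder."""
    return KMeans(n_clusters=k, random_state=0, n_init="auto").fit_predict(vectors)


def summarize(comments: list[str], labels: np.ndarray) -> dict[int, str]:
    """Summarize: ask an LLM to name the theme shared by each cluster."""
    grouped = defaultdict(list)
    for comment, label in zip(comments, labels):
        grouped[label].append(comment)

    codebook = {}
    for label, group in grouped.items():
        resp = client.chat.completions.create(
            model="gpt-4o-mini-2024-07-18",  # model version listed on this page
            messages=[{
                "role": "user",
                "content": "Give a short qualitative code (a few words) that "
                           "captures the common theme of these student "
                           "comments:\n- " + "\n- ".join(group[:20]),
            }],
        )
        codebook[label] = resp.choices[0].message.content.strip()
    return codebook
```

In practice, a researcher would review and refine the resulting codebook, consistent with the need to maintain researcher oversight noted above.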

Implementation Barriers

Technical barrier

Implementing NLP and LLM techniques may require specialized knowledge and resources. In addition, the codes generated by AI may be interpreted differently by different researchers, leading to potential misalignment with research goals.

Proposed Solutions: Providing training for researchers on using generative AI tools, developing user-friendly software solutions, and involving human experts to review and refine the generated codes to ensure proper contextual understanding.

Data quality barrier

Qualitative data can often be ill-structured or poorly formatted, making analysis challenging.

Proposed Solutions: Implementing pre-processing steps to clean and standardize data before applying NLP techniques.
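As a concrete illustration of such pre-processing, the helper below normalizes whitespace, drops empty or placeholder responses, and removes exact duplicates before the comments are analyzed. The specific rules are assumptions made for this sketch, not steps reported by the authors.

```python
import re


def preprocess(comments: list[str]) -> list[str]:
    """Clean and standardize raw SET comments before NLP analysis.

    The rules here (whitespace normalization, dropping trivial or
    placeholder answers, deduplication) are illustrative assumptions.
    """
    placeholder_answers = {"n/a", "na", "none", "no comment", "nothing"}
    cleaned, seen = [], set()
    for text in comments:
        text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace
        if len(text) < 3 or text.lower() in placeholder_answers:
            continue                               # drop empty/placeholder responses
        if text.lower() in seen:
            continue                               # drop exact duplicates
        seen.add(text.lower())
        cleaned.append(text)
    return cleaned
```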

Project Team

Andrew Katz

Researcher

Mitchell Gerhardt

Researcher

Michelle Soledad

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Andrew Katz, Mitchell Gerhardt, Michelle Soledad

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
