Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching
Project Overview
This project examines how generative AI can support educational research, specifically by applying natural language processing (NLP) and large language models (LLMs) to the analysis of student evaluations of teaching (SETs). It presents a workflow named EECS (Extract, Embed, Cluster, Summarize) that automates much of the qualitative analysis process, producing a thematic codebook from large SET datasets. The approach improves the efficiency and consistency of qualitative analysis, surfaces deeper insights into teaching effectiveness, and reduces the time and labor such analyses traditionally require. The findings suggest that the methodology not only streamlines evaluation in educational contexts but also holds promise for other types of qualitative data, positioning generative AI as a practical tool in educational research.
Key Applications
EECS workflow for analyzing student evaluations of teaching
Context: University setting, focusing on large enrollment courses and administrative records.
Implementation: Automated process using NLP and LLMs to extract, embed, cluster, and summarize qualitative feedback from SETs (a minimal sketch of this pipeline appears after this list).
Outcomes: Generated a thematic codebook that mirrors traditional qualitative analysis, enabling efficient data processing and a deeper understanding of student feedback.
Challenges: Ensuring the accuracy of generated themes, handling ambiguities in qualitative data, scaling the process for larger datasets, and maintaining researcher oversight.
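As a rough illustration of the workflow described above, the sketch below walks through one Extract, Embed, Cluster, Summarize pass in Python. The specific libraries (sentence-transformers, scikit-learn, the OpenAI client), the embedding model name, the example comments, and the cluster count are illustrative assumptions rather than the authors' exact implementation; only the LLM version matches the one listed under Contact Information.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from openai import OpenAI

# Hypothetical SET comments; a real run would load thousands from survey exports.
comments = [
    "The instructor explained difficult concepts clearly and was very approachable.",
    "Lectures felt rushed. More worked examples would help.",
    "Grading on projects was inconsistent and feedback came back late.",
]

# 1. Extract: split each comment into candidate excerpts (simple sentence splits here).
excerpts = [s.strip() for c in comments for s in c.split(".") if s.strip()]

# 2. Embed: map each excerpt to a dense vector.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = encoder.encode(excerpts)

# 3. Cluster: group semantically similar excerpts.
n_clusters = min(3, len(excerpts))  # tuned per dataset in practice
labels = KMeans(n_clusters=n_clusters, random_state=0).fit_predict(vectors)

# 4. Summarize: ask an LLM to name each cluster, yielding draft codebook entries.
client = OpenAI()  # expects OPENAI_API_KEY in the environment
codebook = {}
for k in range(n_clusters):
    members = [e for e, lab in zip(excerpts, labels) if lab == k][:30]
    prompt = (
        "These student-evaluation excerpts share one theme. "
        "Give a short code name and a one-sentence description:\n- "
        + "\n- ".join(members)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini-2024-07-18",
        messages=[{"role": "user", "content": prompt}],
    )
    codebook[k] = response.choices[0].message.content

for k, entry in codebook.items():
    print(f"Cluster {k}: {entry}")
```

In practice, researchers would review, merge, and refine the resulting cluster summaries before treating them as codebook entries, keeping the human oversight noted under Challenges.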
Implementation Barriers
Technical barrier
Implementing NLP and LLM techniques can require specialized knowledge and resources. In addition, AI-generated codes may be interpreted in different ways, leading to potential misalignment with research goals.
Proposed Solutions: Providing training for researchers on using generative AI tools, developing user-friendly software solutions, and involving human experts to review and refine the generated codes to ensure proper contextual understanding.
Data quality barrier
Qualitative data can often be ill-structured or poorly formatted, making analysis challenging.
Proposed Solutions: Implementing pre-processing steps to clean and standardize data before applying NLP techniques.
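The sketch below shows what such pre-processing might look like for raw SET comments. The specific cleaning rules (unicode normalization, whitespace collapsing, dropping non-substantive responses, deduplication) are illustrative assumptions, not the authors' pipeline.

```python
import re
import unicodedata

def clean_comments(raw_comments):
    """Normalize, filter, and deduplicate free-text survey responses."""
    cleaned = []
    seen = set()
    for text in raw_comments:
        if not isinstance(text, str):
            continue
        # Normalize unicode (smart quotes, accents) and collapse whitespace.
        text = unicodedata.normalize("NFKC", text)
        text = re.sub(r"\s+", " ", text).strip()
        # Drop empty or non-substantive responses.
        if len(text) < 5 or text.lower() in {"n/a", "none", "no comment"}:
            continue
        # Skip verbatim repeats (e.g., duplicated rows in survey exports).
        key = text.lower()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned

print(clean_comments(["  Great  lectures! ", "N/A", "Great lectures!"]))
```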
Project Team
Andrew Katz
Researcher
Mitchell Gerhardt
Researcher
Michelle Soledad
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Andrew Katz, Mitchell Gerhardt, Michelle Soledad
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI