Connecting Feedback to Choice: Understanding Educator Preferences in GenAI vs. Human-Created Lesson Plans in K-12 Education -- A Comparative Analysis
Project Overview
This study explores the integration of generative AI (GenAI) in K-12 math education, specifically examining educator preferences for lesson plans created by AI models like GPT-4 and LLaMA-2-13b compared to human-designed plans. The research evaluates lesson plans across different instructional measures, revealing that human-authored plans are generally preferred, particularly in elementary education. However, AI-generated plans demonstrate increasing proficiency, excelling in areas like cool-down tasks and structured learning activities, especially in high school settings. The findings emphasize a collaborative approach, suggesting that GenAI can serve as a valuable tool to assist educators in curriculum development, rather than a complete replacement. The study highlights both the potential benefits and challenges of incorporating AI into the educational landscape.
Key Applications
AI-Powered Lesson Plan Generation
Context: K-12 math education
Implementation: Utilizes AI models (GPT-4, LLaMA-2-13b) for lesson plan creation. Implementation involves prompt engineering (for GPT-4) and fine-tuning on expert-designed lesson plans (for LLaMA-2-13b).
Outcomes: Competitive with human-authored plans on cool-down tasks and structured learning activities, particularly in high school settings, where AI-generated plans also excelled in content depth and structured content delivery.
Challenges: Less favored than human-created plans overall, especially in elementary education. Lacked the nuanced differentiation, real-world contextualization, and student discourse facilitation of human-authored plans. Fine-tuned models were also slightly less favored than human-created content overall.
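The prompt-engineering step mentioned under Implementation can be pictured with a minimal sketch. This is not the study's actual prompt or pipeline, which the summary does not provide. The `LessonRequest` class, the `build_prompt` function, the prompt wording, and the "warm-up" section are all hypothetical, chosen only to show how a request for a structured lesson plan might be assembled before being sent to a model. No model API is called.

```python
from dataclasses import dataclass


@dataclass
class LessonRequest:
    """Hypothetical container for the inputs a lesson-plan prompt might need."""
    grade_band: str  # e.g. "elementary" or "high school"
    topic: str       # e.g. "comparing fractions"
    # Section names are assumptions; only "cool-down" and "learning
    # activities" are mentioned in the summary above.
    sections: tuple = ("warm-up", "learning activities", "cool-down")


def build_prompt(req: LessonRequest) -> str:
    """Assemble a plain-text prompt; the wording is illustrative only."""
    section_lines = "\n".join(f"- {name}" for name in req.sections)
    return (
        "You are an experienced K-12 math curriculum designer.\n"
        f"Write a lesson plan for {req.grade_band} students on: {req.topic}.\n"
        f"Include these sections:\n{section_lines}\n"
        "Add differentiation notes, a real-world context, "
        "and prompts that encourage student discussion."
    )


if __name__ == "__main__":
    print(build_prompt(LessonRequest("elementary", "comparing fractions")))
```

The last instruction in the sketch asks explicitly for differentiation, real-world context, and student discourse, the three areas where educators found AI-generated plans weakest. Whether such instructions close the gap is not something this summary reports.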
Human-Created Lesson Plans
Context: K-12 math education
Implementation: Lesson plans created by experienced curriculum designers.
Outcomes: Generally favored, especially in elementary education. Valued for nuanced differentiation, real-world contextualization, student discourse facilitation, and adaptability.
Challenges: Time-consuming and cognitively demanding to create. Early-career educators face considerable challenges in designing lesson plans that balance content rigor, pedagogical effectiveness, and classroom adaptability.
Implementation Barriers
Lack of domain-specific knowledge and factual accuracy
GenAI models lack domain-specific pedagogical knowledge, leading to content that may be factually incorrect, pedagogically misaligned, or culturally insensitive. Educators report issues such as insufficient depth in subject matter and frequent errors in STEM-related content.
Proposed Solutions: Fine-tuning models on domain-specific, subject-specific datasets and incorporating educator feedback loops to improve AI-generated lesson plans. Integrating AI with professional curriculum design expertise.
Unpredictability and lack of adherence to educational standards
The unpredictability of AI-generated lesson plans raises concerns about their reliability, relevance, and adherence to educational standards.
Proposed Solutions: Rigorous evaluation of AI-generated outputs to ensure pedagogical soundness and contextual relevance. Refining GenAI models to better align with teacher preferences by incorporating user engagement data into model training.
Lack of nuanced differentiation, real-world contextualization, and student discourse facilitation
AI-generated plans often lack the nuanced differentiation, real-world contextualization, and student discourse facilitation of human-authored plans, particularly in elementary education.
Proposed Solutions: Refining GenAI models to incorporate user engagement data into model training, emphasizing elements such as scaffolded support, differentiated instruction, and interactive learning strategies.
Generalizability limitations
The study's findings may not be universally applicable because of the geographic, demographic, and contextual makeup of the educator sample. Pedagogical priorities may differ across regions, school types, and educational systems, which could influence educator preferences for AI-generated content. Educators' biases, familiarity with AI tools, and subjective experiences may also have shaped their evaluations.
Proposed Solutions: Expand participant diversity across regions, subjects, and teaching styles. Conduct longitudinal studies to track AI’s long-term impact on student learning outcomes.
Project Team
Shawon Sarkar, Researcher
Min Sun, Researcher
Alex Liu, Researcher
Zewei Tian, Researcher
Lief Esbenshade, Researcher
Jian He, Researcher
Zachary Zhang, Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Shawon Sarkar, Min Sun, Alex Liu, Zewei Tian, Lief Esbenshade, Jian He, Zachary Zhang
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gemini-2.0-flash-lite