Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach
Project Overview
The document explores the role of generative AI in education through the development of LearnLM-Tutor, a conversational AI designed to provide personalized educational support. It identifies two needs in optimizing AI for educational contexts: translating pedagogical intuitions into effective AI prompts, and developing robust evaluation practices. It also addresses the mixed reception of generative AI in the field. AI tutors can offer learners a safe space for inquiry and a source of motivation, but they carry significant risks that must be managed, including biased outputs, privacy concerns, and the potential erosion of learner autonomy. The document outlines policies for the responsible development of AI tutors, emphasizing safety evaluations and continuous refinement to ensure that AI-generated content is pedagogically sound. It also details an evaluation-driven approach that assesses tutor quality through rating experiments, focusing on clear feedback, engagement, and adaptability to diverse student needs. The findings suggest that generative AI holds promise for enhancing educational experiences, but that responsible implementation, collaboration with educators, and ongoing evaluation are crucial to maximizing its positive impact and minimizing potential harms.
Key Applications
LearnLM-Tutor
Context: AI tutoring for learners in various educational settings, including programming courses at Arizona State University, academic support, and interactions with educational videos. The AI engages learners through conversational methods, providing support tailored to their specific needs and learning contexts.
Implementation: LearnLM-Tutor was developed through participatory methods, incorporating supervised fine-tuning with pedagogically informed data. The development process involved user feedback and safety evaluations to enhance its effectiveness and adaptability in educational scenarios.
Outcomes: The implementation led to improved learner engagement, satisfaction, and pedagogical effectiveness, with AI tutor interactions rated higher on clarity, engagement, and adaptability. The tutor is also reported to foster a supportive learning environment.
Challenges: Mitigating the risk of biased outputs, preventing the tutor from giving direct answers prematurely, managing learners' potential emotional dependency on the AI, and accurately assessing the complexity of human-like pedagogical qualities.
Implementation Barriers
Technical Barrier
Difficulty in translating pedagogical intuitions into effective AI prompts and in accurately simulating human-like pedagogical interactions.
Proposed Solutions: Developing a comprehensive set of pedagogical benchmarks, collaborating with educators to inform AI development, utilizing comprehensive evaluation criteria, and making iterative improvements based on learner feedback.
Evaluation Barrier
Lack of established evaluation practices for assessing AI in educational contexts.
Proposed Solutions: Creating a suite of pedagogical benchmarks that include both quantitative and qualitative measures to assess AI tutors.
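As a minimal sketch of the quantitative side of such a benchmark, the snippet below averages rater scores per rubric dimension and compares two tutors. The source does not specify a scoring scale, data format, or analysis method, so the 1-5 scale, the tutor names, the records, and the function names are all hypothetical placeholders.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rating records: (tutor, rubric dimension, rater score on a 1-5 scale).
# Names, dimensions, and scores are illustrative placeholders, not data from the paper.
ratings = [
    ("tutor_a", "clarity", 4),
    ("tutor_a", "clarity", 5),
    ("tutor_a", "engagement", 3),
    ("tutor_b", "clarity", 3),
    ("tutor_b", "clarity", 3),
    ("tutor_b", "engagement", 4),
]


def mean_scores(records):
    """Average the rater scores for each (tutor, dimension) pair."""
    grouped = defaultdict(list)
    for tutor, dimension, score in records:
        grouped[(tutor, dimension)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def compare(records, tutor_x, tutor_y):
    """Mean score of tutor_x minus tutor_y for each dimension both were rated on."""
    means = mean_scores(records)
    dims_x = {d for (t, d) in means if t == tutor_x}
    dims_y = {d for (t, d) in means if t == tutor_y}
    return {d: means[(tutor_x, d)] - means[(tutor_y, d)] for d in sorted(dims_x & dims_y)}


summary = compare(ratings, "tutor_a", "tutor_b")
for dimension, difference in summary.items():
    print(f"{dimension}: {difference:+.2f}")
```

A real rating experiment would also need many more raters per item, a check on inter-rater agreement, and the qualitative measures mentioned above, none of which this sketch covers.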
Access Barrier
Equitable access to quality education and AI tools remains a significant challenge.
Proposed Solutions: Working towards democratizing access to AI technology and ensuring it meets the needs of diverse learners.
Technical Barrier (Output Quality)
The AI tutor may produce biased or incorrect outputs, leading to misinformation.
Proposed Solutions: Implementing safety fine-tuning, regular evaluations against bias benchmarks, and curating high-quality training data.
Privacy and Ethical Barrier
Concerns over data privacy and potential surveillance of student interactions.
Proposed Solutions: Establishing strict data usage policies, ensuring anonymization of user data, and conducting transparency audits.
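One common building block for this kind of policy is replacing student identifiers with keyed hashes before interaction logs are stored or analyzed. The sketch below illustrates that idea only; the source does not describe how anonymization is implemented, and the field names (`student_id`, `name`, `email`), the function names, and the key handling are assumptions. Note that keyed hashing is pseudonymization rather than full anonymization, since free-text messages can still contain identifying details.

```python
import hashlib
import hmac

# Hypothetical set of direct identifiers to drop from each log record.
DIRECT_IDENTIFIERS = {"student_id", "name", "email"}


def pseudonymize(student_id: str, secret_key: bytes) -> str:
    """Map a student ID to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(secret_key, student_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def scrub_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers, plus a pseudonym."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["student_ref"] = pseudonymize(record["student_id"], secret_key)
    return cleaned


# Illustrative record; in practice the key would come from a secrets manager.
key = b"example-key-not-for-production"
record = {
    "student_id": "s-1024",
    "name": "Example Student",
    "email": "student@example.edu",
    "message": "Can you explain recursion?",
}
scrubbed = scrub_record(record, key)
```

Using a keyed hash rather than a plain hash means that someone without the key cannot recompute pseudonyms from a list of known student IDs.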
Pedagogical Barrier
Difficulty in maintaining effective pedagogical practices in AI responses, such as promoting critical thinking.
Proposed Solutions: Incorporating feedback from educators and continuous adjustments to the AI's pedagogical strategies.
Engagement Barrier
Learner engagement can be inconsistent, affecting the quality of interactions.
Proposed Solutions: Implementing strategies to promote active learning and adapt to the student's emotional state.
Project Team
Irina Jurenka
Researcher
Markus Kunesch
Researcher
Kevin R. McKee
Researcher
Daniel Gillick
Researcher
Shaojian Zhu
Researcher
Sara Wiltberger
Researcher
Shubham Milind Phal
Researcher
Katherine Hermann
Researcher
Daniel Kasenberg
Researcher
Avishkar Bhoopchand
Researcher
Ankit Anand
Researcher
Miruna Pîslar
Researcher
Stephanie Chan
Researcher
Lisa Wang
Researcher
Jennifer She
Researcher
Parsa Mahmoudieh
Researcher
Aliya Rysbek
Researcher
Wei-Jen Ko
Researcher
Andrea Huber
Researcher
Brett Wiltshire
Researcher
Gal Elidan
Researcher
Roni Rabin
Researcher
Jasmin Rubinovitz
Researcher
Amit Pitaru
Researcher
Mac McAllister
Researcher
Julia Wilkowski
Researcher
David Choi
Researcher
Roee Engelberg
Researcher
Lidan Hackmon
Researcher
Adva Levin
Researcher
Rachel Griffin
Researcher
Michael Sears
Researcher
Filip Bar
Researcher
Mia Mesar
Researcher
Mana Jabbour
Researcher
Arslan Chaudhry
Researcher
James Cohan
Researcher
Sridhar Thiagarajan
Researcher
Nir Levine
Researcher
Ben Brown
Researcher
Dilan Gorur
Researcher
Svetlana Grant
Researcher
Rachel Hashimshoni
Researcher
Laura Weidinger
Researcher
Jieru Hu
Researcher
Dawn Chen
Researcher
Kuba Dolecki
Researcher
Canfer Akbulut
Researcher
Maxwell Bileschi
Researcher
Laura Culp
Researcher
Wen-Xin Dong
Researcher
Nahema Marchal
Researcher
Kelsie Van Deman
Researcher
Hema Bajaj Misra
Researcher
Michael Duah
Researcher
Moran Ambar
Researcher
Avi Caciularu
Researcher
Sandra Lefdal
Researcher
Chris Summerfield
Researcher
James An
Researcher
Pierre-Alexandre Kamienny
Researcher
Abhinit Mohdi
Researcher
Theofilos Strinopoulous
Researcher
Annie Hale
Researcher
Wayne Anderson
Researcher
Luis C. Cobo
Researcher
Niv Efron
Researcher
Muktha Ananda
Researcher
Shakir Mohamed
Researcher
Maureen Heymans
Researcher
Zoubin Ghahramani
Researcher
Yossi Matias
Researcher
Ben Gomes
Researcher
Lila Ibrahim
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Irina Jurenka, Markus Kunesch, Kevin R. McKee, Daniel Gillick, Shaojian Zhu, Sara Wiltberger, Shubham Milind Phal, Katherine Hermann, Daniel Kasenberg, Avishkar Bhoopchand, Ankit Anand, Miruna Pîslar, Stephanie Chan, Lisa Wang, Jennifer She, Parsa Mahmoudieh, Aliya Rysbek, Wei-Jen Ko, Andrea Huber, Brett Wiltshire, Gal Elidan, Roni Rabin, Jasmin Rubinovitz, Amit Pitaru, Mac McAllister, Julia Wilkowski, David Choi, Roee Engelberg, Lidan Hackmon, Adva Levin, Rachel Griffin, Michael Sears, Filip Bar, Mia Mesar, Mana Jabbour, Arslan Chaudhry, James Cohan, Sridhar Thiagarajan, Nir Levine, Ben Brown, Dilan Gorur, Svetlana Grant, Rachel Hashimshoni, Laura Weidinger, Jieru Hu, Dawn Chen, Kuba Dolecki, Canfer Akbulut, Maxwell Bileschi, Laura Culp, Wen-Xin Dong, Nahema Marchal, Kelsie Van Deman, Hema Bajaj Misra, Michael Duah, Moran Ambar, Avi Caciularu, Sandra Lefdal, Chris Summerfield, James An, Pierre-Alexandre Kamienny, Abhinit Mohdi, Theofilos Strinopoulous, Annie Hale, Wayne Anderson, Luis C. Cobo, Niv Efron, Muktha Ananda, Shakir Mohamed, Maureen Heymans, Zoubin Ghahramani, Yossi Matias, Ben Gomes, Lila Ibrahim
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI