Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach

Project Overview

The document examines the role of generative AI in education through the development of LearnLM-Tutor, a conversational AI tutor designed to provide personalized educational support. It highlights two needs: translating pedagogical intuitions into effective AI prompts, and developing robust evaluation practices for educational contexts. It also acknowledges the mixed reception of generative AI in the field. AI tutors can offer learners a safe space for inquiry and a source of motivation, but biased outputs, privacy concerns, and the potential erosion of learner autonomy are significant risks that must be managed. The document outlines policies for the responsible development of AI tutors, emphasizing safety evaluations and continuous refinement to keep AI-generated content pedagogically sound. It also details an evaluation-driven approach that assesses tutor quality through rating experiments, focusing on clear feedback, engagement, and adaptability to diverse student needs. The findings suggest that generative AI holds promise for enhancing educational experiences, provided it is implemented responsibly, developed in collaboration with educators, and evaluated on an ongoing basis to maximize its positive impact and minimize potential harms.

Key Applications

LearnLM-Tutor

Context: AI tutoring for learners in various educational settings, including programming courses at Arizona State University, academic support, and interactions with educational videos. The AI engages learners through conversational methods, providing support tailored to their specific needs and learning contexts.

Implementation: LearnLM-Tutor was developed through participatory methods, incorporating supervised fine-tuning with pedagogically informed data. The development process involved user feedback and safety evaluations to enhance its effectiveness and adaptability in educational scenarios.

Outcomes: The implementation has led to improved learner engagement, satisfaction, and pedagogical effectiveness, with higher ratings for clarity, engagement, and adaptability in AI tutor interactions. The tutor is also reported to foster a supportive learning environment.

Challenges: Key challenges include mitigating the risk of biased outputs, preventing the tutor from giving direct answers prematurely, managing potential emotional dependency on the AI, and accurately assessing complex, human-like pedagogical qualities.

Implementation Barriers

Technical Barrier

Difficulty translating pedagogical intuitions into effective AI prompts and accurately simulating human-like pedagogical interactions.

Proposed Solutions: Developing a comprehensive set of pedagogical benchmarks, collaborating with educators to inform AI development, applying broad evaluation criteria, and making iterative improvements based on learner feedback.

Evaluation Barrier

Lack of established evaluation practices for assessing AI in educational contexts.

Proposed Solutions: Creating a suite of pedagogical benchmarks that include both quantitative and qualitative measures to assess AI tutors.
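
As an illustration of the quantitative side of such a benchmark, the following minimal sketch averages rater scores per tutor and per rubric dimension. The tutor names, the dimension names, the 1 to 5 scale, and the `mean_scores` helper are all hypothetical and are not taken from the paper; the sketch only shows one simple way rating data might be summarized.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rating records: (tutor, rubric dimension, score on a 1-5 scale).
# None of these values come from the paper.
ratings = [
    ("tutor_a", "clarity", 4),
    ("tutor_a", "clarity", 5),
    ("tutor_a", "engagement", 3),
    ("tutor_b", "clarity", 3),
    ("tutor_b", "clarity", 2),
    ("tutor_b", "engagement", 4),
]


def mean_scores(records):
    """Return {tutor: {dimension: mean score}} for the given rating records."""
    buckets = defaultdict(lambda: defaultdict(list))
    for tutor, dimension, score in records:
        buckets[tutor][dimension].append(score)
    return {
        tutor: {dim: mean(scores) for dim, scores in dims.items()}
        for tutor, dims in buckets.items()
    }


summary = mean_scores(ratings)
for tutor, dims in sorted(summary.items()):
    print(tutor, dims)
```

Per-dimension means are only a starting point. A real rating study would also need to account for rater disagreement and sample size, and qualitative feedback would be reviewed alongside the numbers.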

Access Barrier

Equitable access to quality education and AI tools remains a significant challenge.

Proposed Solutions: Working towards democratizing access to AI technology and ensuring it meets the needs of diverse learners.

Technical

The AI tutor may produce biased or incorrect outputs, leading to misinformation.

Proposed Solutions: Implementing safety fine-tuning, regular evaluations against bias benchmarks, and curating high-quality training data.

Privacy and Ethical

Concerns over data privacy and potential surveillance of student interactions.

Proposed Solutions: Establishing strict data usage policies, ensuring anonymization of user data, and conducting transparency audits.
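
One common building block for protecting identities in interaction logs is keyed hashing of user identifiers, sketched below. This is an assumption-laden illustration rather than the method described in the paper: the record fields, the `pseudonymize` and `scrub_record` helpers, and the placeholder key are hypothetical. Strictly speaking, keyed hashing is pseudonymization rather than full anonymization, because whoever holds the key can re-link records to known identifiers.

```python
import hashlib
import hmac

# Hypothetical set of directly identifying fields to drop from each record.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Map a user ID to a stable pseudonym using HMAC-SHA256."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_record(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and replace the user ID with its pseudonym."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["user_id"] = pseudonymize(record["user_id"], secret_key)
    return cleaned


# Placeholder key for illustration only; a real key would be generated
# randomly and stored separately from the data.
key = b"example-key-not-for-production"
record = {
    "user_id": "student-42",
    "name": "Example Student",
    "email": "student@example.edu",
    "message": "Why does my loop never end?",
}
print(scrub_record(record, key))
```

The same user always maps to the same pseudonym, so conversations can still be grouped per learner for evaluation. Free-text fields such as the message itself may still contain personal details and would need separate handling.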

Pedagogical

Difficulty in maintaining effective pedagogical practices in AI responses, such as promoting critical thinking.

Proposed Solutions: Incorporating feedback from educators and continuous adjustments to the AI's pedagogical strategies.

Engagement Barrier

Learner engagement can be inconsistent, affecting the quality of interactions.

Proposed Solutions: Implementing strategies to promote active learning and adapt to the student's emotional state.

Project Team

Irina Jurenka

Researcher

Markus Kunesch

Researcher

Kevin R. McKee

Researcher

Daniel Gillick

Researcher

Shaojian Zhu

Researcher

Sara Wiltberger

Researcher

Shubham Milind Phal

Researcher

Katherine Hermann

Researcher

Daniel Kasenberg

Researcher

Avishkar Bhoopchand

Researcher

Ankit Anand

Researcher

Miruna Pîslar

Researcher

Stephanie Chan

Researcher

Lisa Wang

Researcher

Jennifer She

Researcher

Parsa Mahmoudieh

Researcher

Aliya Rysbek

Researcher

Wei-Jen Ko

Researcher

Andrea Huber

Researcher

Brett Wiltshire

Researcher

Gal Elidan

Researcher

Roni Rabin

Researcher

Jasmin Rubinovitz

Researcher

Amit Pitaru

Researcher

Mac McAllister

Researcher

Julia Wilkowski

Researcher

David Choi

Researcher

Roee Engelberg

Researcher

Lidan Hackmon

Researcher

Adva Levin

Researcher

Rachel Griffin

Researcher

Michael Sears

Researcher

Filip Bar

Researcher

Mia Mesar

Researcher

Mana Jabbour

Researcher

Arslan Chaudhry

Researcher

James Cohan

Researcher

Sridhar Thiagarajan

Researcher

Nir Levine

Researcher

Ben Brown

Researcher

Dilan Gorur

Researcher

Svetlana Grant

Researcher

Rachel Hashimshoni

Researcher

Laura Weidinger

Researcher

Jieru Hu

Researcher

Dawn Chen

Researcher

Kuba Dolecki

Researcher

Canfer Akbulut

Researcher

Maxwell Bileschi

Researcher

Laura Culp

Researcher

Wen-Xin Dong

Researcher

Nahema Marchal

Researcher

Kelsie Van Deman

Researcher

Hema Bajaj Misra

Researcher

Michael Duah

Researcher

Moran Ambar

Researcher

Avi Caciularu

Researcher

Sandra Lefdal

Researcher

Chris Summerfield

Researcher

James An

Researcher

Pierre-Alexandre Kamienny

Researcher

Abhinit Mohdi

Researcher

Theofilos Strinopoulous

Researcher

Annie Hale

Researcher

Wayne Anderson

Researcher

Luis C. Cobo

Researcher

Niv Efron

Researcher

Muktha Ananda

Researcher

Shakir Mohamed

Researcher

Maureen Heymans

Researcher

Zoubin Ghahramani

Researcher

Yossi Matias

Researcher

Ben Gomes

Researcher

Lila Ibrahim

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Irina Jurenka, Markus Kunesch, Kevin R. McKee, Daniel Gillick, Shaojian Zhu, Sara Wiltberger, Shubham Milind Phal, Katherine Hermann, Daniel Kasenberg, Avishkar Bhoopchand, Ankit Anand, Miruna Pîslar, Stephanie Chan, Lisa Wang, Jennifer She, Parsa Mahmoudieh, Aliya Rysbek, Wei-Jen Ko, Andrea Huber, Brett Wiltshire, Gal Elidan, Roni Rabin, Jasmin Rubinovitz, Amit Pitaru, Mac McAllister, Julia Wilkowski, David Choi, Roee Engelberg, Lidan Hackmon, Adva Levin, Rachel Griffin, Michael Sears, Filip Bar, Mia Mesar, Mana Jabbour, Arslan Chaudhry, James Cohan, Sridhar Thiagarajan, Nir Levine, Ben Brown, Dilan Gorur, Svetlana Grant, Rachel Hashimshoni, Laura Weidinger, Jieru Hu, Dawn Chen, Kuba Dolecki, Canfer Akbulut, Maxwell Bileschi, Laura Culp, Wen-Xin Dong, Nahema Marchal, Kelsie Van Deman, Hema Bajaj Misra, Michael Duah, Moran Ambar, Avi Caciularu, Sandra Lefdal, Chris Summerfield, James An, Pierre-Alexandre Kamienny, Abhinit Mohdi, Theofilos Strinopoulous, Annie Hale, Wayne Anderson, Luis C. Cobo, Niv Efron, Muktha Ananda, Shakir Mohamed, Maureen Heymans, Zoubin Ghahramani, Yossi Matias, Ben Gomes, Lila Ibrahim

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
