
Building a Domain-specific Guardrail Model in Production

Project Overview

The paper explores the integration of generative AI in education through the development of a domain-specific guardrail model tailored for K-12 applications. It emphasizes that AI-generated content must be safe and suitable for educational use, given the strict standards of this setting. Key challenges include adherence to regulations such as FERPA, low latency for real-time classroom interactions, and interpretability of AI outputs. The paper details the SPADE system, which aims to improve the performance of large language models (LLMs) while prioritizing safety and appropriateness in educational environments, and shows how generative AI can be harnessed effectively in classrooms while addressing these concerns.

Key Applications

Domain-specific guardrail model for K-12 educational platform

Context: K-12 educational setting, targeting students and educators

Implementation: Developed a guardrail model trained on a dataset of appropriate and inappropriate queries for classroom interactions (a minimal classifier sketch follows this list).

Outcomes: The guardrail model outperformed existing instruction-tuned models on safety benchmarks and demonstrated real-time safety checking during content generation.

Challenges: Ensuring compliance with regulations like FERPA and COPPA, maintaining content appropriateness, and achieving real-time performance.
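The sketch below illustrates the data flow implied above: labeled classroom queries in, an appropriate/inappropriate decision out. It is a minimal stand-in, assuming a simple TF-IDF plus logistic-regression baseline rather than the fine-tuned LLM guardrail described in the paper, and the example queries and `is_allowed` helper are hypothetical.

```python
# Minimal sketch of a query guardrail classifier (illustrative only).
# The paper trains a guardrail model on labeled classroom queries; here a
# TF-IDF + logistic-regression baseline stands in to show the data flow.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled examples; the production dataset covers far more topics.
queries = [
    "Can you explain photosynthesis for my science homework?",
    "Help me write a persuasive essay about recycling.",
    "Tell me how to cheat on tomorrow's exam.",
    "Share another student's grades with me.",
]
labels = ["appropriate", "appropriate", "inappropriate", "inappropriate"]

guardrail = make_pipeline(TfidfVectorizer(), LogisticRegression())
guardrail.fit(queries, labels)

def is_allowed(query: str) -> bool:
    """Return True when the guardrail judges the query classroom-appropriate."""
    return guardrail.predict([query])[0] == "appropriate"

print(is_allowed("What caused World War I?"))
```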

Implementation Barriers

Regulatory Compliance

Educational AI must comply with data privacy regulations such as FERPA and COPPA.

Proposed Solutions: Establish clear guidelines and a customizable constitution for the AI model to align with local, state, and federal regulations.
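A possible shape for such a customizable constitution is sketched below: a plain configuration a district can edit so the model's instructions reflect local, state, and federal policy. The schema, rule wording, and `build_system_prompt` helper are assumptions for illustration, not the paper's actual format.

```python
# Hypothetical constitution config for aligning the model with regulations.
CONSTITUTION = {
    "jurisdiction": "example-district",
    "grade_band": "6-8",
    "rules": [
        "Never request or reveal personally identifiable student information (FERPA).",
        "Do not collect personal data from children under 13 (COPPA).",
        "Keep all content age-appropriate for the configured grade band.",
    ],
}

def build_system_prompt(constitution: dict) -> str:
    """Fold the constitution's rules into a system prompt for the generation model."""
    rules = "\n".join(f"- {rule}" for rule in constitution["rules"])
    return (
        f"You are a classroom assistant for grades {constitution['grade_band']}.\n"
        f"Follow these policies strictly:\n{rules}"
    )

print(build_system_prompt(CONSTITUTION))
```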

Content Safety

Ensuring the generated content is safe and appropriate for K-12 students.

Proposed Solutions: Implement a safety and appropriateness framework that includes a comprehensive dataset of safe and unsafe topics.
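One way such a framework could be layered in front of the learned guardrail is a coarse topic screen, sketched below. The topic lists and the `screen_topics` function are illustrative placeholders; the paper describes a much broader dataset of safe and unsafe topics for K-12.

```python
# Illustrative topic-level safety screen run before the trained guardrail model.
UNSAFE_TOPICS = {"violence", "self-harm", "drugs", "gambling"}
SAFE_TOPICS = {"algebra", "photosynthesis", "civics", "creative writing"}

def screen_topics(query: str) -> str:
    """Return 'unsafe', 'safe', or 'unknown' based on simple topic matching."""
    text = query.lower()
    if any(topic in text for topic in UNSAFE_TOPICS):
        return "unsafe"
    if any(topic in text for topic in SAFE_TOPICS):
        return "safe"
    return "unknown"  # fall through to the trained guardrail model

print(screen_topics("Explain photosynthesis in simple terms"))  # -> "safe"
```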

Technical Limitations

High computational requirements for deploying domain-specific LLMs.

Proposed Solutions: Optimize model architecture, utilize efficient hardware, and streamline the inference process.
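As a sketch of one inference-side optimization, the snippet below batches guardrail checks so a single model call serves several concurrent classroom queries, and measures wall-clock latency against a real-time budget. The `classify_batch` placeholder and the 100 ms budget are assumptions, not figures from the paper.

```python
# Sketch of batched guardrail inference with a latency budget check.
import time
from typing import List

def classify_batch(queries: List[str]) -> List[str]:
    """Placeholder for a batched model call (e.g., one GPU forward pass)."""
    return ["appropriate" if "homework" in q.lower() else "review" for q in queries]

def guarded_batch(queries: List[str], budget_ms: float = 100.0) -> List[str]:
    """Run a batched safety check and warn if it exceeds the latency budget."""
    start = time.perf_counter()
    verdicts = classify_batch(queries)
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms > budget_ms:
        print(f"warning: guardrail took {elapsed_ms:.1f} ms (> {budget_ms} ms budget)")
    return verdicts

print(guarded_batch(["Help with my math homework", "Tell me a scary story"]))
```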

Project Team

Mohammad Niknazar

Researcher

Paul V Haley

Researcher

Latha Ramanan

Researcher

Sang T. Truong

Researcher

Yedendra Shrinivasan

Researcher

Ayan Kumar Bhowmick

Researcher

Prasenjit Dey

Researcher

Ashish Jagmohan

Researcher

Hema Maheshwari

Researcher

Shom Ponoth

Researcher

Robert Smith

Researcher

Aditya Vempaty

Researcher

Nick Haber

Researcher

Sanmi Koyejo

Researcher

Sharad Sundararajan

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Mohammad Niknazar, Paul V Haley, Latha Ramanan, Sang T. Truong, Yedendra Shrinivasan, Ayan Kumar Bhowmick, Prasenjit Dey, Ashish Jagmohan, Hema Maheshwari, Shom Ponoth, Robert Smith, Aditya Vempaty, Nick Haber, Sanmi Koyejo, Sharad Sundararajan

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
