
RENOVI: A Benchmark Towards Remediating Norm Violations in Socio-Cultural Conversations

Project Overview

This document summarizes RENOVI, a benchmark dataset for addressing norm violations in socio-cultural conversations, and its relevance to the use of generative AI in education. It underscores the role of social norms in effective communication and shows how large language models (LLMs) such as ChatGPT can generate synthetic dialogue data to enrich training sets. The findings indicate that combining synthetic data with human-authored dialogues substantially improves models' ability to detect and correct norm violations, aligning AI systems more closely with human interpretations of social norms. Beyond strengthening AI-powered educational tools, this work promotes a deeper understanding of socio-cultural dynamics, supporting more effective communication strategies in learning environments.

Key Applications

RENOVI dataset

Context: The dataset is used for training and evaluating AI systems on the remediation of norm violations in socio-cultural conversations, particularly within the context of Chinese culture.

Implementation: The dataset was created by collecting 512 human-authored dialogues and generating 8,746 synthetic dialogues using ChatGPT, following strict quality control protocols.
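The ChatGPT-based generation step might look roughly like the sketch below. The prompt wording, the helper name `build_generation_prompt`, and the `[VIOLATION]` labeling convention are illustrative assumptions, not the authors' actual pipeline; the API call is shown commented out so the sketch runs offline.

```python
# Hypothetical sketch of generating a synthetic norm-violation dialogue
# with a chat LLM. Prompt text and helper names are assumptions.
def build_generation_prompt(norm: str, culture: str = "Chinese") -> str:
    """Compose an instruction asking the model to write a short dialogue
    that violates the given social norm, so it can later be remediated."""
    return (
        f"Write a short multi-turn dialogue set in a {culture} cultural "
        f"context in which one speaker violates this social norm: {norm}. "
        "Mark the violating turn with [VIOLATION]."
    )

# With the OpenAI Python client, the prompt could be sent roughly like this
# (commented out so the example has no network dependency):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": build_generation_prompt(
#         "Do not address elders by their first name")}],
# )
# dialogue = resp.choices[0].message.content

prompt = build_generation_prompt("Do not address elders by their first name")
```

In a real pipeline, each generated dialogue would then pass through the quality-control checks described below before being added to the corpus.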

Outcomes: The dataset enables AI models to better understand and remediate norm violations, improving their performance in socio-cultural dialogue scenarios.

Challenges: Challenges include ensuring the quality of synthetic data and the alignment between AI-generated responses and human understanding of social norms.
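One way to picture a single annotated dialogue is a small record carrying the conversation, the norm involved, and a candidate remediation. The field names below are illustrative assumptions and not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class NormDialogue:
    """One annotated conversation; field names here are illustrative
    assumptions, not RENOVI's actual schema."""
    turns: List[Tuple[str, str]]   # (speaker, utterance) pairs
    norm_category: str             # broad type of norm being tested
    violation_turn: Optional[int]  # index of the violating turn, if any
    remediation: Optional[str]     # a corrective follow-up utterance
    source: str = "synthetic"      # "human" or "synthetic"

example = NormDialogue(
    turns=[("A", "Hey Wang, got a minute?"),
           ("B", "Please address me as Professor Wang.")],
    norm_category="forms of address",
    violation_turn=0,
    remediation="My apologies, Professor Wang. Do you have a minute?",
)
```

A structure like this makes it easy to mix the 512 human-authored and 8,746 synthetic dialogues in one training set while keeping their provenance distinguishable via `source`.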

Implementation Barriers

Data Quality

Ensuring the quality and cultural relevance of synthetic data generated by AI models can be challenging.

Proposed Solutions: Implementing strict quality control measures during the data collection process and validating AI-generated outputs against human responses.
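Validating AI-generated outputs against human responses could be approximated by a simple automatic gate before (or alongside) human review. The thresholds and the similarity measure below are toy assumptions for illustration, not the project's actual quality-control protocol.

```python
import difflib

def passes_quality_check(generated: str, human_reference: str,
                         min_similarity: float = 0.3,
                         min_length: int = 20) -> bool:
    """Toy quality gate: keep a synthetic remediation only if it is long
    enough and loosely similar to a human-written reference. Thresholds
    are illustrative assumptions."""
    if len(generated) < min_length:
        return False
    sim = difflib.SequenceMatcher(
        None, generated.lower(), human_reference.lower()).ratio()
    return sim >= min_similarity
```

In practice such heuristics would only pre-filter candidates; borderline cases would still go to human annotators, which is where alignment with human understanding of social norms is actually enforced.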

Cultural Sensitivity

AI models may struggle to appropriately handle cultural nuances and social norms in dialogues.

Proposed Solutions: Training AI systems on diverse datasets that include various cultural contexts and continuously updating models based on feedback.

Project Team

Haolan Zhan, Researcher
Zhuang Li, Researcher
Xiaoxi Kang, Researcher
Tao Feng, Researcher
Yuncheng Hua, Researcher
Lizhen Qu, Researcher
Yi Ying, Researcher
Mei Rianto Chandra, Researcher
Kelly Rosalin, Researcher
Jureynolds Jureynolds, Researcher
Suraj Sharma, Researcher
Shilin Qu, Researcher
Linhao Luo, Researcher
Lay-Ki Soon, Researcher
Zhaleh Semnani Azad, Researcher
Ingrid Zukerman, Researcher
Gholamreza Haffari, Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Haolan Zhan, Zhuang Li, Xiaoxi Kang, Tao Feng, Yuncheng Hua, Lizhen Qu, Yi Ying, Mei Rianto Chandra, Kelly Rosalin, Jureynolds Jureynolds, Suraj Sharma, Shilin Qu, Linhao Luo, Lay-Ki Soon, Zhaleh Semnani Azad, Ingrid Zukerman, Gholamreza Haffari

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
