RENOVI: A Benchmark Towards Remediating Norm Violations in Socio-Cultural Conversations
Project Overview
This summary covers RENOVI, a benchmark dataset for remediating norm violations in socio-cultural conversations, and its relevance to generative AI in education. It stresses the role of social norms in effective communication and shows how large language models (LLMs) such as ChatGPT can generate synthetic dialogue data to augment training datasets. The findings indicate that combining synthetic data with human-authored content improves the detection and remediation of norm violations, bringing AI systems closer to human interpretations of social norms. The approach could improve AI-based educational tools and support a deeper understanding of socio-cultural dynamics, and with it more effective communication in learning environments.
Key Applications
RENOVI dataset
Context: The dataset is used for training and evaluating AI systems on the remediation of norm violations in socio-cultural conversations, particularly within the context of Chinese culture.
Implementation: The dataset was created by collecting 512 human-authored dialogues and generating 8,746 synthetic dialogues using ChatGPT, following strict quality control protocols.
Outcomes: The dataset enables AI models to better understand and remediate norm violations, improving their performance in socio-cultural dialogue scenarios.
Challenges: Ensuring the quality of synthetic data and aligning AI-generated responses with human understanding of social norms.
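To make the dataset composition concrete, the following is a minimal Python sketch of how human-authored and synthetic dialogues might be pooled for training. Only the two counts (512 and 8,746) come from the summary above. The `Dialogue` record, the `build_training_pool` helper, and the `synthetic_ratio` parameter are hypothetical names introduced for illustration and are not part of the RENOVI release.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Dialogue:
    """Hypothetical record type; the real dataset schema may differ."""
    dialogue_id: str
    source: str  # "human" or "synthetic"


def build_training_pool(human, synthetic, synthetic_ratio=1.0, seed=0):
    """Keep every human dialogue and add a sampled fraction of synthetic ones."""
    if not 0.0 <= synthetic_ratio <= 1.0:
        raise ValueError("synthetic_ratio must be between 0 and 1")
    rng = random.Random(seed)
    sample_size = round(len(synthetic) * synthetic_ratio)
    pool = list(human) + rng.sample(synthetic, sample_size)
    rng.shuffle(pool)  # mix the two sources before training
    return pool


# Placeholder records sized to match the counts reported in the summary.
human = [Dialogue(f"h{i}", "human") for i in range(512)]
synthetic = [Dialogue(f"s{i}", "synthetic") for i in range(8746)]

pool = build_training_pool(human, synthetic)
human_share = sum(d.source == "human" for d in pool) / len(pool)
print(len(pool), f"{human_share:.1%}")
```

With all synthetic dialogues included, the pool holds 9,258 dialogues, of which roughly 5.5% are human-authored. Lowering `synthetic_ratio` is one simple way to study how much synthetic data helps, although the summary does not say which mixing strategy the authors used.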
Implementation Barriers
Data Quality
Ensuring the quality and cultural relevance of synthetic data generated by AI models can be challenging.
Proposed Solutions: Implementing strict quality control measures during the data collection process and validating AI-generated outputs against human responses.
Cultural Sensitivity
AI models may struggle to appropriately handle cultural nuances and social norms in dialogues.
Proposed Solutions: Training AI systems on diverse datasets that include various cultural contexts and continuously updating models based on feedback.
Project Team
Haolan Zhan
Researcher
Zhuang Li
Researcher
Xiaoxi Kang
Researcher
Tao Feng
Researcher
Yuncheng Hua
Researcher
Lizhen Qu
Researcher
Yi Ying
Researcher
Mei Rianto Chandra
Researcher
Kelly Rosalin
Researcher
Jureynolds Jureynolds
Researcher
Suraj Sharma
Researcher
Shilin Qu
Researcher
Linhao Luo
Researcher
Lay-Ki Soon
Researcher
Zhaleh Semnani Azad
Researcher
Ingrid Zukerman
Researcher
Gholamreza Haffari
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Haolan Zhan, Zhuang Li, Xiaoxi Kang, Tao Feng, Yuncheng Hua, Lizhen Qu, Yi Ying, Mei Rianto Chandra, Kelly Rosalin, Jureynolds Jureynolds, Suraj Sharma, Shilin Qu, Linhao Luo, Lay-Ki Soon, Zhaleh Semnani Azad, Ingrid Zukerman, Gholamreza Haffari
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI