Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books

Project Overview

The document explores the role of generative AI in education: its potential to personalize learning and streamline content generation, and the implementation challenges it raises, such as bias and ethical concerns. A significant part of the research focuses on a novel methodology that uses fine-tuned Large Language Models (LLMs) to analyze societal biases within a decade-stratified corpus of fictional literature. The analysis shows that these models capture and reflect historical biases related to gender, sexual orientation, race, and religion, and it traces how those biases evolve across seven decades, from 1950 to 2019. The findings underscore the need for diverse and representative training data in AI development, and the work draws on AI, literary studies, and social science to support a more equitable approach in educational settings. Overall, the document advocates careful attention to ethical dimensions while harnessing generative AI to improve learning experiences.
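As an illustration of what a decade-stratified corpus might look like in practice, the following is a minimal Python sketch, not the authors' pipeline. It groups book records into the seven decade buckets spanning 1950 to 2019. The record layout, the function names `decade_label` and `stratify_by_decade`, and the sample titles are all hypothetical; the summary does not describe how the corpus was actually assembled or how each bucket was used for fine-tuning.

```python
from collections import defaultdict

# Hypothetical records: (title, publication_year, text).
# These are placeholders, not books from the study's corpus.
SAMPLE_BOOKS = [
    ("Placeholder Novel A", 1953, "text of novel A"),
    ("Placeholder Novel B", 1958, "text of novel B"),
    ("Placeholder Novel C", 1987, "text of novel C"),
    ("Placeholder Novel D", 2019, "text of novel D"),
    ("Placeholder Novel E", 2021, "text of novel E"),  # outside the range
]


def decade_label(year, start=1950, end=2019):
    """Return a label such as '1950s', or None if the year is out of range."""
    if not start <= year <= end:
        return None
    return f"{year // 10 * 10}s"


def stratify_by_decade(records):
    """Group (title, year, text) records into a dict keyed by decade label."""
    buckets = defaultdict(list)
    for title, year, text in records:
        label = decade_label(year)
        if label is not None:  # skip books outside 1950-2019
            buckets[label].append((title, text))
    return dict(buckets)


if __name__ == "__main__":
    corpus = stratify_by_decade(SAMPLE_BOOKS)
    for label in sorted(corpus):
        print(label, [title for title, _ in corpus[label]])
```

Under this assumed layout, each bucket could then serve as a separate fine-tuning set, so that a model trained on one bucket reflects only the books of that decade.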

Key Applications

AI-driven Content Generation and Personalized Learning Platforms

Context: Applicable in K-12 and higher education settings, targeting students and educators across diverse learning environments. The audience also includes researchers and instructional designers who use AI to develop educational materials such as quizzes, lesson plans, and personalized learning resources.

Implementation: Integration of AI tools and fine-tuning of language models to assist in creating educational content and tailoring learning experiences to individual student needs. This includes analyzing biases in literature and generating instructional materials through structured prompts.

Outcomes: Improved engagement and learning outcomes, increased efficiency in content development, enhanced resource availability, and a deeper understanding of societal biases through literature analysis.

Challenges: Concerns regarding data privacy, quality control of AI-generated content, potential reliability issues, and the need for teacher training to effectively implement these AI tools.

Implementation Barriers

Data Limitations and Generalizability Issues

The corpus is limited to fictional bestsellers, which may not represent broader societal attitudes. Findings are primarily based on English-language books from Western contexts, which limits their applicability to other languages and cultures.

Proposed Solutions: Future work could include a more diverse range of literary sources beyond bestsellers, extending research to include non-fiction literature and cross-cultural comparisons.

Methodological Challenges and Ethical Barriers

Fine-tuning may introduce biases beyond those present in the original texts, raising concerns regarding bias in AI algorithms affecting fairness in education.

Proposed Solutions: Careful crafting of prompts, consideration of diverse perspectives in the training data, inclusive dataset curation, and regular audits of AI systems.

Technical Barrier

Integration of AI tools with existing educational infrastructure.

Proposed Solutions: Invest in robust IT support and staff training programs.

Project Team

Sangmitra Madhusudan

Researcher

Robert Morabito

Researcher

Skye Reid

Researcher

Nikta Gohari Sadr

Researcher

Ali Emami

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Sangmitra Madhusudan, Robert Morabito, Skye Reid, Nikta Gohari Sadr, Ali Emami

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
