
Quality-Diversity through AI Feedback

Project Overview

This document examines the application of generative AI in education through Quality-Diversity through AI Feedback (QDAIF), a method that uses advanced language models to enhance creative writing. QDAIF automates ideation by generating diverse, high-quality outputs and providing evaluative feedback, outperforming baseline methods at producing creative texts such as stories and poems. The summary highlights the effectiveness of generative AI at improving diversity in writing, alongside the challenges of defining diversity and the impact of model selection on performance. It also explores narrative generation across themes and target audiences, showing that the AI can write from different perspectives and adapt its style to appeal to both adults and children. The iterative process of narrative development is outlined: early iterations often contain errors and inconsistencies, while later iterations improve in fidelity and character inclusion, though challenges such as repetitive phrasing and basic errors remain. Overall, the findings point to the potential of generative AI to enrich educational practice in creative writing, despite ongoing limitations in narrative depth and complexity.

Key Applications

Quality-Diversity AI Framework for Creative Text Generation

Context: Educational settings focusing on creative writing, including narrative generation, poetry, and storytelling. Target audiences include students and educators in literature and creative arts, as well as those exploring AI's role in creative writing.

Implementation: The AI utilizes a Quality-Diversity approach to generate a wide variety of creative texts, including narratives, poetry, and stories. It employs evolutionary algorithms and language models to create diverse and high-quality solutions through guided rewriting and thematic exploration, iterating through various examples and seed texts to enhance narrative quality and complexity.
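The guided-rewriting loop described above can be sketched as a MAP-Elites-style search over a diversity axis. The sketch below is a minimal toy version: `mutate` and `evaluate` are hypothetical stand-ins for the generator and evaluator language models the paper actually uses, and the scoring heuristics are invented purely for illustration.

```python
import random

random.seed(0)

# Hypothetical stand-ins for QDAIF's LM calls: in the real method, `mutate`
# would be an LM performing guided rewriting, and `evaluate` an LM returning
# a quality score plus a position on a diversity axis (e.g. tone).
def mutate(text):
    """Toy 'rewrite': append a random word to the text."""
    return text + " " + random.choice(["bright", "dark", "quiet", "loud"])

def evaluate(text):
    """Toy evaluator: quality = length; diversity measure mapped into [0, 1)."""
    return float(len(text)), (hash(text) % 100) / 100.0

def qdaif_loop(seed, iterations=200, n_bins=5):
    """MAP-Elites-style archive: keep the highest-quality text per diversity bin."""
    archive = {}  # bin index -> (quality, text)
    quality, measure = evaluate(seed)
    archive[min(int(measure * n_bins), n_bins - 1)] = (quality, seed)
    for _ in range(iterations):
        parent = random.choice(list(archive.values()))[1]
        child = mutate(parent)
        quality, measure = evaluate(child)
        b = min(int(measure * n_bins), n_bins - 1)
        if b not in archive or quality > archive[b][0]:
            archive[b] = (quality, child)  # elite replacement within the niche
    return archive

archive = qdaif_loop("Once upon a time")
```

The archive ends up holding at most one elite text per diversity bin, which is the structure behind the "diverse and high-quality solutions" described above.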

Outcomes: Significant improvements in the quality and diversity of generated texts, with clear thematic differences and adaptability to various styles. The AI exhibits higher narrative complexity and character inclusion, achieving high QD-scores in many categories, while providing insights into the creative writing process.

Challenges: Challenges include defining precise diversity metrics, managing the balance between quality and diversity, capturing the essence of historical themes, and avoiding convergence on repetitive or low-quality patterns. Additionally, some iterations may produce outputs too similar to seed examples or contain errors in character development and storyline.

Quality-Diversity AI Framework for Code Generation

Context: Programming tasks across educational settings targeting software developers and educators in computer science, focusing on algorithm implementation and coding challenges.

Implementation: The QDAIF method is employed to evolve code solutions for programming challenges by leveraging AI feedback. It explores diverse implementations of algorithms, demonstrating higher diversity in generated code compared to baseline methods while maintaining readability and efficiency.
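As a toy illustration of what a diversity axis for code might look like (my own example, not the paper's actual setup), candidate solutions to the same task can be binned by implementation style, such as recursive versus iterative:

```python
def style_bin(source, func_name):
    """Crude diversity axis for code: does the function call itself?

    Splits the source at the first ':' (end of the def header) and checks
    whether the body contains a self-call. Illustrative only.
    """
    _header, _, body = source.partition(":")
    return "recursive" if func_name + "(" in body else "iterative"

recursive_src = "def fact(n):\n    return 1 if n <= 1 else n * fact(n - 1)"
iterative_src = (
    "def fact(n):\n"
    "    out = 1\n"
    "    for i in range(2, n + 1):\n"
    "        out *= i\n"
    "    return out"
)
```

In a QDAIF-style run over code, each style bin would retain its own best-scoring solution, so recursive and iterative variants both survive rather than converging on one pattern.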

Outcomes: Enhanced diversity in the types of coding solutions generated, particularly in algorithmic contexts, indicating the effectiveness of generative AI in fostering innovation and creativity in coding.

Challenges: Key challenges include ensuring the generated code maintains quality standards of readability and efficiency while avoiding over-reliance on predefined functions.

Implementation Barriers

Technological

The implementation of QDAIF relies on advanced language models and the ability to define diversity metrics accurately. The complexity of defining appropriate diversity measures and axes for creative outputs can lead to issues in generating desired results.

Proposed Solutions: Utilize reinforcement learning from human feedback (RLHF) to improve model evaluations, explore diverse metrics through LMs, and automate the generation of diversity axes through AI prompts. Experiment with different binning strategies to capture a wider range of outputs.
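The binning experiments mentioned above could be compared with a sketch like the following (my own illustration, not code from the paper): uniform bins split the diversity axis into equal widths, while quantile bins adapt their edges to the observed scores, which helps when outputs clump in one region of the axis.

```python
import statistics

def uniform_bin(score, n_bins):
    """Map a diversity score in [0, 1] to one of n_bins equal-width bins."""
    return min(int(score * n_bins), n_bins - 1)

def quantile_edges(scores, n_bins):
    """Bin edges that give roughly equal-population bins over observed scores."""
    return statistics.quantiles(scores, n=n_bins)

def quantile_bin(score, edges):
    """Bin index = number of edges the score exceeds."""
    return sum(score > e for e in edges)

observed = [0.05, 0.06, 0.07, 0.08, 0.9]  # scores clumped near zero
edges = quantile_edges(observed, n_bins=4)
```

With the clumped scores above, uniform binning would dump four of the five outputs into bin 0, whereas quantile binning spreads them across bins, capturing a wider effective range.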

Subjective Evaluation

The evaluation of creativity and quality can be highly subjective, leading to potential misalignments between AI and human assessments.

Proposed Solutions: Incorporate multiple AI models for evaluations to mitigate biases and ensure robustness in quality assessments.
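One way to realize that multi-model evaluation is to average the scores of several independent judges, so that no single model's bias dominates. In this sketch the judge functions are hypothetical stand-ins for separate evaluator LMs:

```python
def ensemble_score(text, judges):
    """Average the scores of several independent judges to damp individual bias."""
    scores = [judge(text) for judge in judges]
    return sum(scores) / len(scores)

# Toy judges standing in for different evaluator LMs, each with its own bias.
length_judge = lambda t: min(len(t) / 50.0, 1.0)               # favors longer texts
vocab_judge = lambda t: min(len(set(t.split())) / 10.0, 1.0)   # favors varied vocabulary

score = ensemble_score("a short tale of a quiet town", [length_judge, vocab_judge])
```

A median or trimmed mean across judges would be an even more outlier-robust aggregation, at the cost of needing more judges to be informative.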

Quality Control Barrier

Maintaining a balance between diversity and the quality of generated texts is difficult, often leading to suboptimal outputs. Inconsistency in narrative quality and relevance to the intended themes can also hinder effectiveness.

Proposed Solutions: Implement quality filters in the AI feedback process to ensure only high-quality texts are retained in the generation pool, and utilize more stringent evaluation metrics and human feedback loops to refine AI outputs.
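A minimal sketch of such a quality filter, with a toy scoring function standing in for the AI-feedback evaluator:

```python
def quality_filter(candidates, score, threshold=0.5):
    """Keep only candidates the evaluator scores at or above the threshold."""
    return [c for c in candidates if score(c) >= threshold]

# Toy evaluator: treats texts with more distinct words as higher quality.
toy_score = lambda t: min(len(set(t.split())) / 5.0, 1.0)

pool = quality_filter(
    ["word word word", "a vivid tale of two cities", "ok"],
    toy_score,
)
```

Only texts passing the threshold re-enter the generation pool, which prevents low-quality outputs from seeding the next round of rewriting.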

Technical

Difficulty in generating narratives that effectively capture nuanced themes, especially historical contexts. Generated texts sometimes contain errors and are repetitive, which can hinder the quality and originality of the narratives.

Proposed Solutions: Increased training on diverse historical narratives to improve the AI's understanding and generation capabilities, and refining the AI training process to better understand narrative structures and improve diversity in generated outputs.

Project Team

Herbie Bradley

Researcher

Andrew Dai

Researcher

Hannah Teufel

Researcher

Jenny Zhang

Researcher

Koen Oostermeijer

Researcher

Marco Bellagente

Researcher

Jeff Clune

Researcher

Kenneth Stanley

Researcher

Grégory Schott

Researcher

Joel Lehman

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Herbie Bradley, Andrew Dai, Hannah Teufel, Jenny Zhang, Koen Oostermeijer, Marco Bellagente, Jeff Clune, Kenneth Stanley, Grégory Schott, Joel Lehman

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
