
Re²: A Consistency-ensured Dataset for Full-stage Peer Review and Multi-turn Rebuttal Discussions

Project Overview

This project examines the role of generative AI in academia, focusing on how it can enhance the peer review process, particularly in AI and computer science venues. It identifies the shortcomings of current peer review datasets and introduces the Re2 dataset as a more robust alternative: one that covers initial submissions and also structures rebuttals as multi-turn conversations. This design aims to improve the quality of feedback generated by Large Language Models (LLMs), enabling them to provide more constructive insights for authors and reviewers. The project suggests that, trained on such datasets, generative AI can refine academic peer review practices, foster better communication between authors and reviewers, and improve the quality of scholarly work. The findings indicate that integrating these tools can lead to a more efficient and effective review process, contributing to the advancement of knowledge in the academic community.

Key Applications

Re2 dataset for peer review and rebuttal discussions

Context: Academic peer review process, targeting researchers and authors in AI and computer science fields.

Implementation: The Re2 dataset was created by crawling publicly accessible papers and their review records from OpenReview, ensuring data consistency by using only initial submissions.
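To make the collection step concrete, here is a minimal sketch of pulling review records from OpenReview's public Notes API. The endpoint and the `tcdate`/`tmdate` timestamp fields exist in the public API, but the filtering logic is an illustrative assumption, not the authors' actual crawling pipeline.

```python
# Hypothetical sketch: collecting review records from OpenReview's public
# Notes API. The "initial submission" check below is illustrative; the
# paper's actual consistency filter may differ.
import json
import urllib.request

API = "https://api.openreview.net/notes"

def build_query(invitation: str, offset: int = 0, limit: int = 1000) -> str:
    """Build a Notes API query URL for one venue/review invitation."""
    return f"{API}?invitation={invitation}&offset={offset}&limit={limit}"

def fetch_notes(invitation: str) -> list:
    """Fetch one page of notes (network call; invitation ID is venue-specific)."""
    with urllib.request.urlopen(build_query(invitation)) as resp:
        return json.load(resp)["notes"]

def is_initial_submission(note: dict) -> bool:
    """Keep only unrevised notes: creation time equals last-modified time.
    (Both timestamps are in milliseconds in the OpenReview API.)"""
    return note.get("tcdate") == note.get("tmdate")
```

In practice a crawler would page through `offset` until the API returns fewer than `limit` notes, and apply the consistency filter before storing anything.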

Outcomes: Improved ability of LLMs to assist in peer review, enhancing feedback quality and reducing author resubmission rates.

Challenges: Limited data diversity and quality from existing datasets; challenges in standardizing review formats across different conferences.

Implementation Barriers

Data Quality

Existing peer review datasets are often based on revised submissions rather than initial ones, leading to inconsistencies.

Proposed Solutions: The Re2 dataset ensures that all data consists of initial submissions, improving data reliability.

Data Diversity

Current datasets often lack diversity in data sources, limiting their usefulness for training models.

Proposed Solutions: Re2 includes data from 24 conferences and 21 workshops to enhance diversity.

Complexity of Rebuttal Processes

Many existing datasets do not effectively capture the rebuttal and discussion stages of peer review.

Proposed Solutions: Re2 treats rebuttals as multi-turn conversations, aiming to provide a more realistic training environment for LLMs.
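One way to picture "rebuttals as multi-turn conversations" is as a thread that alternates reviewer and author turns and can be flattened into an LLM-style chat message list. The schema below is a hypothetical illustration of that idea; the field and role names are assumptions, not the released Re2 format.

```python
# Hypothetical schema for a rebuttal stored as a multi-turn conversation.
# Roles and fields are illustrative, not the actual Re2 data format.
from dataclasses import dataclass, field

@dataclass
class Turn:
    role: str   # e.g. "reviewer" or "author"
    text: str

@dataclass
class RebuttalThread:
    paper_id: str
    initial_review: str
    turns: list = field(default_factory=list)

    def add_turn(self, role: str, text: str) -> None:
        """Append one reviewer or author message to the discussion."""
        self.turns.append(Turn(role, text))

    def to_messages(self) -> list:
        """Flatten the thread into a chat-style message list for an LLM."""
        msgs = [{"role": "reviewer", "content": self.initial_review}]
        msgs += [{"role": t.role, "content": t.text} for t in self.turns]
        return msgs
```

Representing the discussion this way lets a model be trained on each turn in context, rather than on a single review-response pair.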

Project Team

Daoze Zhang

Researcher

Zhijian Bao

Researcher

Sihang Du

Researcher

Zhiyi Zhao

Researcher

Kuangling Zhang

Researcher

Dezheng Bao

Researcher

Yang Yang

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Daoze Zhang, Zhijian Bao, Sihang Du, Zhiyi Zhao, Kuangling Zhang, Dezheng Bao, Yang Yang

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
