
Auto-survey Challenge

Project Overview

This competition assesses the capabilities of Large Language Models (LLMs) in education, specifically their ability to autonomously generate and critique academic survey papers across diverse disciplines. It comprises two primary tasks: AI-Author, in which LLMs write survey papers from provided prompts, and AI-Reviewer, in which they evaluate and critique the generated texts. A structured evaluation framework scores submissions on relevance, contribution, soundness, clarity, and responsibility. The findings show that models such as ChatGPT can generate coherent academic content and assist in educational contexts, but they also pinpoint specific areas needing improvement, underscoring the ongoing challenges in ensuring the reliability and quality of AI-generated academic text. Overall, the competition highlights promising applications of generative AI in education while emphasizing the need for continuous refinement and evaluation of these technologies to maximize their effectiveness in academic settings.

Key Applications

AI-Author and AI-Reviewer tasks for generating and evaluating academic survey papers

Context: Academic research and conference settings, targeting participants and researchers in AI and education

Implementation: Participants create models to generate survey papers and evaluate them using predefined criteria with human oversight.

Outcomes: LLMs demonstrated the ability to produce coherent, relevant academic content, with evaluation scores differing significantly from a dummy baseline.

Challenges: Need for improvement in the accuracy of citations and the quality of contributions in generated papers.
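The evaluation workflow above can be sketched in Python. This is a minimal, hypothetical illustration, not the competition's actual scoring code: the five criterion names come from the text, but the 0–10 scale, the equal weighting, and all numeric scores are assumptions for demonstration.

```python
# Hypothetical sketch of the evaluation framework described above.
# Criterion names are from the competition description; the 0-10 scale,
# equal weighting, and example scores are illustrative assumptions.

from statistics import mean

CRITERIA = ["relevance", "contribution", "soundness", "clarity", "responsibility"]

def overall_score(scores: dict) -> float:
    """Average the per-criterion scores (each assumed on a 0-10 scale)."""
    missing = set(CRITERIA) - scores.keys()
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return mean(scores[c] for c in CRITERIA)

# Illustrative scores: an LLM-generated survey vs. a dummy baseline.
llm_paper = {"relevance": 8, "contribution": 6, "soundness": 7,
             "clarity": 8, "responsibility": 7}
dummy_baseline = {c: 2 for c in CRITERIA}

print(overall_score(llm_paper))       # 7.2
print(overall_score(dummy_baseline))  # 2.0
```

In the real challenge these scores would come from the AI-Reviewer models (with human oversight) rather than hand-assigned numbers; the aggregation step here simply shows how per-criterion judgments could be combined for comparison against a baseline.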

Implementation Barriers

Technical barrier

Challenges in ensuring the factual accuracy and relevance of AI-generated citations and contributions.

Proposed Solutions: Further development and refinement of evaluation criteria and methodologies to enhance the quality of generated content.

Project Team

Thanh Gia Hieu Khuong

Researcher

Benedictus Kent Rachmat

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Thanh Gia Hieu Khuong, Benedictus Kent Rachmat

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
