
TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation

Project Overview

This summary describes TutorialBank, a manually collected dataset intended to support research and learning in Natural Language Processing (NLP) and related fields. It contains over 6,300 curated resources, including tutorials and surveys, and is aimed at students and educators who need to keep up with the rapidly evolving landscape of AI and Machine Learning. Key applications include a search engine and command-line tools that recommend resources matched to a user's needs, and the generation of reading lists ordered by the prerequisite knowledge required for a given NLP topic. Together, these features are meant to make suitable learning material easier to find and to help learners approach complex subjects in a sensible order. In this sense, TutorialBank illustrates how AI-oriented tooling built on a curated corpus can support education in advanced technical domains.
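The idea of a reading list built from prerequisite knowledge can be sketched as a depth-first ordering over a prerequisite graph. The sketch below is illustrative only: the topic names, the PREREQUISITES mapping, and the reading_list function are assumptions made for this example and are not taken from TutorialBank or its tools.

```python
# Hypothetical prerequisite relations (not from TutorialBank):
# each topic maps to the topics that should be learned first.
PREREQUISITES = {
    "probability": [],
    "linear algebra": [],
    "neural networks": ["linear algebra", "probability"],
    "word embeddings": ["neural networks"],
    "machine translation": ["word embeddings", "neural networks"],
}


def reading_list(target, prerequisites):
    """Return topics ordered so that every prerequisite precedes the topic needing it."""
    ordered = []
    state = {}  # topic -> "visiting" or "done"

    def visit(topic):
        if state.get(topic) == "done":
            return
        if state.get(topic) == "visiting":
            # A topic that (indirectly) requires itself cannot be ordered.
            raise ValueError(f"cyclic prerequisite involving {topic!r}")
        state[topic] = "visiting"
        for prereq in prerequisites.get(topic, []):
            visit(prereq)
        state[topic] = "done"
        ordered.append(topic)

    visit(target)
    return ordered


if __name__ == "__main__":
    for step, topic in enumerate(reading_list("machine translation", PREREQUISITES), 1):
        print(step, topic)
```

Only the topics that the target actually depends on appear in the result, so the list stays short for introductory topics and grows for advanced ones.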

Key Applications

TutorialBank dataset and search engine

Context: NLP and AI education, for both students and educators

Implementation: A manually-collected corpus categorized and annotated for educational purposes, complemented by a search engine and resource recommendation tools.

Outcomes: Facilitates learning and research by providing access to relevant educational resources, improving understanding of NLP concepts, and assisting in curriculum planning.

Challenges: Quality control in resource selection, updating the dataset with the latest research, and ensuring the relevance of the resources.
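To make the resource recommendation idea concrete, the following is a minimal sketch of keyword-overlap ranking over a small annotated collection. It is not the TutorialBank search engine: the Resource fields, the sample entries, and the recommend function are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    title: str
    topic: str
    kind: str  # for example "tutorial" or "survey"


# Hypothetical sample entries (not taken from the dataset).
RESOURCES = [
    Resource("Introduction to Word Embeddings", "word embeddings", "tutorial"),
    Resource("A Survey of Neural Machine Translation", "machine translation", "survey"),
    Resource("Hidden Markov Models Explained", "sequence labeling", "tutorial"),
]


def tokens(text):
    """Lowercase a string and split it into a set of whitespace-separated words."""
    return set(text.lower().split())


def recommend(query, resources, limit=5):
    """Rank resources by how many query words appear in their title or topic."""
    query_words = tokens(query)
    scored = []
    for resource in resources:
        score = len(query_words & (tokens(resource.title) | tokens(resource.topic)))
        if score > 0:
            scored.append((score, resource))
    # Stable sort: resources with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [resource for _, resource in scored[:limit]]


if __name__ == "__main__":
    for hit in recommend("machine translation survey", RESOURCES):
        print(hit.kind, "-", hit.title)
```

A real system would likely use a proper retrieval model rather than raw word overlap; the sketch only shows how manual topic annotations can feed directly into ranking.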

Implementation Barriers

Quality Control

Ensuring that the collected resources meet high educational standards can be subjective and challenging.

Proposed Solutions: Manual curation and selection by experts to ensure quality and relevance.

Resource Updating

The rapid evolution of NLP and related fields can make existing resources outdated.

Proposed Solutions: Continuous collection and updating of resources to keep the dataset current.

Project Team

Alexander R. Fabbri, Researcher

Irene Li, Researcher

Prawat Trairatvorakul, Researcher

Yijiao He, Researcher

Wei Tai Ting, Researcher

Robert Tung, Researcher

Caitlin Westerfield, Researcher

Dragomir R. Radev, Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield, Dragomir R. Radev

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
