
Pre-Training With Scientific Text Improves Educational Question Generation

Project Overview

This document describes EduQG, an educational question generation model that further pre-trains a large language model on scientific text to automate the creation of educational questions. Responding to the growing need for personalized learning and scalable self-assessment in digital education, EduQG aims to generate higher-quality questions than existing models. Experiments suggest that further pre-training on domain-specific data, particularly scientific literature, substantially improves the model's effectiveness for educational applications, although issues with linguistic quality persist. Overall, the findings highlight the potential of generative AI to transform educational practice by providing assessment tools tailored to learners' needs.

Key Applications

EduQG

Context: Educational question generation for self-assessment in personalized learning environments.

Implementation: Further pre-training a large language model (T5) on scientific text (the S2ORC dataset) and evaluating it against an existing model (Leaf) on the SciQ dataset.
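As a concrete illustration of this recipe, the sketch below shows a single supervised fine-tuning step for question generation, assuming the Hugging Face transformers library. The continued pre-training stage on S2ORC (T5's span-corruption objective) is omitted, and the example pair, prompt format, and learning rate are hypothetical rather than the paper's configuration.

```python
# Minimal sketch of one fine-tuning step for question generation with T5.
# The (context, question) pair, "generate question:" prefix, and learning
# rate are illustrative assumptions, not the paper's exact setup.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One SciQ-style (context, question) training pair (hypothetical example).
context = ("generate question: Mitochondria produce most of the cell's "
           "supply of ATP, which is used as a source of chemical energy.")
question = "Which organelle produces most of the cell's supply of ATP?"

inputs = tokenizer(context, return_tensors="pt", truncation=True)
labels = tokenizer(question, return_tensors="pt", truncation=True).input_ids

loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```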

Outcomes: EduQG shows improved predictive performance for generating scientific questions compared to the Leaf baseline model.
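A minimal sketch of how such a comparison might be scored, assuming the Hugging Face evaluate library and standard n-gram overlap metrics (BLEU/ROUGE). The question strings are made up, and the paper's own metric suite may differ.

```python
# Illustrative scoring of a generated question against a SciQ-style
# reference question; strings are hypothetical, and the paper's exact
# metric suite may differ.
import evaluate  # Hugging Face evaluate library

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

predictions = ["Which organelle produces most of the cell's supply of ATP?"]
references = ["What organelle makes most of a cell's ATP supply?"]

# BLEU expects one list of reference strings per prediction.
print(bleu.compute(predictions=predictions, references=[references]))
print(rouge.compute(predictions=predictions, references=references))
```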

Challenges: Linguistic quality of the generated questions, and aligning the style of scientific language with the question generation model.

Implementation Barriers

Technical Barrier

The linguistic quality metrics of the generated questions did not meet expectations, indicating a potential mismatch between scientific language style and the reference models used for assessment.
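One common way to diagnose this kind of fluency issue is to score generated questions by their perplexity under an external language model. The sketch below uses GPT-2 as that scoring model, which is an assumption for illustration, not the paper's assessment protocol.

```python
# Fluency diagnostic: perplexity of a generated question under GPT-2.
# Using GPT-2 as the scoring model is an illustrative assumption, not
# the paper's assessment protocol.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token-level cross-entropy
    return float(torch.exp(loss))

# Lower perplexity suggests more fluent, natural-sounding text.
print(perplexity("Which organelle produces most of the cell's supply of ATP?"))
```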

Proposed Solutions: Future work will involve deeper analyses and possibly human studies to address the mismatch between language models and the generated content.

Project Team

Hamze Muse

Researcher

Sahan Bulathwela

Researcher

Emine Yilmaz

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Hamze Muse, Sahan Bulathwela, Emine Yilmaz

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18