Pre-Training With Scientific Text Improves Educational Question Generation
Project Overview
This document describes EduQG, an educational question generation model that further pre-trains a large language model on scientific text to automatically create educational questions. In response to the growing need for personalized learning and scalable self-assessment tools in digital education, EduQG aims to generate higher-quality questions than existing models. Preliminary experiments suggest that further pre-training on domain-specific data, particularly scientific literature, can significantly improve the model's effectiveness in educational applications, although issues with linguistic quality persist. Overall, the findings highlight the potential of generative AI to transform educational practice by providing tailored assessment tools that better meet learners' needs.
Key Applications
EduQG
Context: Educational question generation for self-assessment in personalized learning environments.
Implementation: Further pre-training a large language model (T5) on scientific text (the S2ORC dataset) and evaluating it against an existing model (Leaf) on the SciQ dataset; see the sketch after this list.
Outcomes: EduQG shows improved predictive performance for generating scientific questions compared to the Leaf baseline model.
Challenges: Linguistic quality of the generated questions and alignment of scientific language style with question generation models.
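To make the pipeline concrete, here is a minimal sketch of the SciQ fine-tuning step, assuming the Hugging Face transformers and datasets libraries. The continued pre-training on S2ORC (span corruption) is omitted, and the model size, prompt format, and hyperparameters are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of an EduQG-style fine-tuning step with Hugging Face
# Transformers. The paper first continues pre-training T5 on S2ORC
# scientific text; that step is omitted here, and the model size,
# prompt format, and hyperparameters below are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
)

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

sciq = load_dataset("sciq")  # fields include question, correct_answer, support

def preprocess(batch):
    # Input: support passage plus target answer; target: the question.
    inputs = [
        f"generate question: answer: {a} context: {s}"
        for a, s in zip(batch["correct_answer"], batch["support"])
    ]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["question"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = sciq.map(
    preprocess, batched=True, remove_columns=sciq["train"].column_names
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="eduqg-sketch",
        per_device_train_batch_size=8,
        learning_rate=3e-4,
        num_train_epochs=3,
    ),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

After training, calling `model.generate` on a tokenized "answer plus context" prompt yields a candidate question for the passage.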
Implementation Barriers
Technical barrier
The linguistic quality metrics of the generated questions fell short of expectations, suggesting a mismatch between scientific language style and the reference language models used for assessment.
Proposed Solutions: Future work will involve deeper analyses, and possibly human studies, to address the mismatch between the reference language models and the generated content.
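To illustrate how such a mismatch can surface, here is a minimal sketch of scoring generated questions, assuming the Hugging Face evaluate library. The metric choices (ROUGE overlap, GPT-2 perplexity) and the example strings are assumptions for illustration, not the paper's exact evaluation protocol.

```python
# Sketch of reference-based and fluency checks for generated questions,
# using the Hugging Face `evaluate` library. Metric choices here are
# illustrative assumptions, not the paper's evaluation protocol.
import evaluate

generated = ["What force pulls objects toward the center of Earth?"]
references = ["What force causes objects to fall toward Earth's surface?"]

# Overlap with human-written reference questions.
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=generated, references=references))

# Fluency proxy: perplexity of the generated text under a general-domain
# language model. A style mismatch between scientific language and the
# reference model can inflate these scores.
perplexity = evaluate.load("perplexity", module_type="metric")
print(perplexity.compute(predictions=generated, model_id="gpt2"))
```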
Project Team
Hamze Muse
Researcher
Sahan Bulathwela
Researcher
Emine Yilmaz
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Hamze Muse, Sahan Bulathwela, Emine Yilmaz
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18