Context Matters: A Strategy to Pre-train Language Model for Science Education
Project Overview
The document examines the application of generative AI, specifically the domain-specific pre-training of language models such as BERT and SciEdBERT, to improve the automatic scoring of student responses in science education. By pre-training on in-domain data, including students' written work and relevant journal articles, the study reports higher accuracy in assessing scientific argumentation and writing. These findings highlight the importance of tailoring language models to the specific language and context used by students, which leads to more effective assessment mechanisms in educational settings. The research indicates that contextualizing AI tools can substantially strengthen educational assessment practices and ultimately support better learning outcomes in science disciplines.
Key Applications
SciEdBERT for automatic scoring of student written responses
Context: Science education, targeting middle school students
Implementation: BERT models further pre-trained on science education data, including student responses and journal articles, to improve scoring accuracy; a fine-tuning sketch follows this list.
Outcomes: Improved accuracy in scoring scientific argumentation tasks, with reported performance exceeding that of models without in-domain pre-training.
Challenges: Differences in language style between student writing and academic publications can introduce confusion in model training.
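To make the implementation concrete, below is a minimal sketch of how a domain-adapted BERT checkpoint could be fine-tuned to score student written responses using the Hugging Face Trainer API. The checkpoint name, CSV layout, number of rubric levels, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed checkpoint from the in-domain pre-training step described above;
# "bert-base-uncased" is used here only as a runnable stand-in.
checkpoint = "bert-base-uncased"  # replace with the in-domain pre-trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=3,  # assumed rubric levels, e.g. beginning / developing / proficient
)

# Assumed CSV layout: a "response" column (student text) and a "label" column (0-2).
dataset = load_dataset("csv", data_files={"train": "scored_responses.csv"})

def tokenize(batch):
    return tokenizer(
        batch["response"], truncation=True, padding="max_length", max_length=256
    )

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="scied-bert-scorer",
        num_train_epochs=4,
        per_device_train_batch_size=16,
    ),
    train_dataset=tokenized["train"],
)
trainer.train()
```

Once trained, the classifier assigns a rubric level to each new response; in practice the label set and evaluation metric would follow the scoring rubric of the specific argumentation task.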
Implementation Barriers
Technical Barrier
The performance of NLP models can be negatively impacted if they are trained on a broad corpus that dilutes domain-specific data.
Proposed Solutions: Use a pyramid-shaped training scheme that prioritizes in-domain data so it has the greatest effect on model performance (see the sketch below).
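One way such a pyramid-shaped scheme could be realized is sequential masked-language-model pre-training that moves from broad corpora toward the most in-domain data (student responses) last, so the final parameter updates are dominated by the target domain. The corpus files, stage ordering, and epoch counts below are illustrative assumptions rather than the paper's exact recipe.

```python
from datasets import load_dataset
from transformers import (
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

# Pyramid ordering (assumed): broad base first, most in-domain tip last,
# so the final updates are driven by student-facing language.
stages = [
    ("general_science_text.txt", 1),       # broad, least specific corpus
    ("science_journal_articles.txt", 2),   # narrower, discipline-specific
    ("student_written_responses.txt", 3),  # smallest, most in-domain corpus
]

for corpus_file, epochs in stages:
    dataset = load_dataset("text", data_files={"train": corpus_file})
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
        batched=True,
        remove_columns=["text"],
    )
    trainer = Trainer(
        model=model,  # the same model is carried across stages
        args=TrainingArguments(
            output_dir=f"pyramid-stage-{corpus_file.rsplit('.', 1)[0]}",
            num_train_epochs=epochs,
            per_device_train_batch_size=16,
        ),
        train_dataset=tokenized["train"],
        data_collator=collator,
    )
    trainer.train()

model.save_pretrained("scied-bert-pyramid")
```

The design intent is that each successive stage is smaller but closer to the target language, finishing on student writing so the model's final representations match how students actually phrase their answers.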
Project Team
Zhengliang Liu
Researcher
Xinyu He
Researcher
Lei Liu
Researcher
Tianming Liu
Researcher
Xiaoming Zhai
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Zhengliang Liu, Xinyu He, Lei Liu, Tianming Liu, Xiaoming Zhai
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI