G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German
Project Overview
This summary describes G-SciEdBERT, a domain-specific language model built to score science responses written in German. The model is pre-trained on German student responses and then fine-tuned for scoring, which the authors report markedly improves scoring accuracy over the general-purpose G-BERT model. The findings suggest that a contextualized, domain-specific model can interpret and evaluate students' scientific writing more reliably than a general-purpose one, with potential benefits for the accuracy and reliability of educational assessment.
Key Applications
G-SciEdBERT: A contextualized large language model for scoring German-written science responses
Context: Used in science education for assessing written responses of secondary students in Germany, particularly those participating in the PISA assessments.
Implementation: Pre-trained on a corpus of 30,000 German-written science responses, and fine-tuned on an additional 20,000 responses to enhance scoring accuracy.
Outcomes: Achieved a 10.2% increase in scoring accuracy compared to G-BERT, demonstrating improved performance in understanding scientific language and context.
Challenges: The complexity of scientific language and the need for domain-specific knowledge posed challenges for general-purpose language models like G-BERT.
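The outcome above compares model-assigned scores with human-assigned scores. This page does not say how scoring accuracy was measured, so the following is only a minimal sketch of two measures commonly used for automated scoring: exact-match accuracy and quadratic weighted kappa. The function names and the sample scores are hypothetical and are not taken from the paper.

```python
from collections import Counter


def accuracy(human, model):
    """Fraction of responses where the model score equals the human score."""
    if len(human) != len(model) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, model)) / len(human)


def quadratic_weighted_kappa(human, model, num_labels):
    """Chance-corrected agreement for ordinal scores in 0..num_labels-1.

    Disagreements are penalized by the squared distance between the scores.
    """
    if len(human) != len(model) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    if num_labels < 2:
        raise ValueError("num_labels must be at least 2")

    n = len(human)
    observed = Counter(zip(human, model))  # joint counts of (human, model) pairs
    human_hist = Counter(human)            # marginal counts per human score
    model_hist = Counter(model)            # marginal counts per model score
    max_dist = (num_labels - 1) ** 2

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(num_labels):
        for j in range(num_labels):
            weight = (i - j) ** 2 / max_dist
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * human_hist[i] * model_hist[j] / (n * n)

    if expected_disagreement == 0:
        # Both raters gave every response the same single score.
        return 1.0
    return 1.0 - observed_disagreement / expected_disagreement


if __name__ == "__main__":
    # Hypothetical scores on a 0-2 rubric, invented for illustration only.
    human_scores = [0, 1, 2, 1, 0, 2, 1, 1]
    model_scores = [0, 1, 2, 2, 0, 2, 1, 0]
    print("accuracy:", accuracy(human_scores, model_scores))
    print("QWK:", quadratic_weighted_kappa(human_scores, model_scores, num_labels=3))
```

Running the same two functions on the scores of a general-purpose model and a domain-specific model, against the same human scores, would give the kind of side-by-side comparison the outcome describes.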
Implementation Barriers
Technical barrier
General-purpose language models are trained on broad datasets and therefore struggle to assess domain-specific responses accurately.
Proposed Solutions: Developing domain-specific large language models such as G-SciEdBERT that are pre-trained on relevant educational and scientific data.
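Pre-training a BERT-style model on domain text is typically done with masked language modeling, where the model learns to recover tokens hidden from the input. This page does not describe the authors' training setup, so the sketch below only illustrates the standard BERT masking recipe (select about 15% of tokens; replace 80% of those with a mask token, 10% with a random token, and leave 10% unchanged). The function name, the sample sentence, and the whitespace tokenization are assumptions made for illustration.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) using the standard BERT masking recipe.

    labels[i] holds the original token at positions chosen for prediction
    and None everywhere else.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() >= mask_prob:
            # Position not selected: keep the token, nothing to predict.
            masked.append(token)
            labels.append(None)
            continue

        labels.append(token)  # the model must recover this original token
        roll = rng.random()
        if roll < 0.8:
            masked.append(MASK_TOKEN)          # 80%: replace with [MASK]
        elif roll < 0.9:
            masked.append(rng.choice(vocab))   # 10%: replace with a random token
        else:
            masked.append(token)               # 10%: leave unchanged
    return masked, labels


if __name__ == "__main__":
    # Hypothetical student response, split on whitespace for simplicity.
    # A real BERT pipeline would use a subword (WordPiece) tokenizer instead.
    response = "Die Pflanze braucht Licht für die Photosynthese".split()
    masked, labels = mask_tokens(response, vocab=response, mask_prob=0.3, seed=1)
    print(masked)
    print(labels)
```

Because the masked inputs are drawn from student science responses rather than general web text, the model sees the vocabulary and phrasing it will later be asked to score, which is the motivation for the domain-specific approach proposed above.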
Project Team
Ehsan Latif
Researcher
Gyeong-Geon Lee
Researcher
Knut Neumann
Researcher
Tamara Kastorff
Researcher
Xiaoming Zhai
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Ehsan Latif, Gyeong-Geon Lee, Knut Neumann, Tamara Kastorff, Xiaoming Zhai
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI