From Model to Classroom: Evaluating Generated MCQs for Portuguese with Narrative and Difficulty Concerns
Project Overview
The document explores the application of generative AI in education, specifically the creation of multiple-choice questions (MCQs) for reading comprehension among Portuguese elementary students. It underscores the technology's potential to automate MCQ generation, alleviating educators' workloads and improving assessment scalability. The study compares the quality of AI-generated MCQs with human-authored ones, using expert reviews and psychometric analyses. Findings indicate that AI can produce questions of comparable quality, although challenges persist in semantic clarity and the creation of effective distractors. The document also emphasizes automated question-generation tools tailored to educational contexts, particularly in less-resourced languages, highlighting the need to align questions with narrative elements to boost engagement and educational efficacy. Overall, the results point to a promising role for generative AI in educational assessment, while identifying areas where generated content still needs improvement.
Key Applications
Automated educational question generation using large language models
Context: Educational settings aimed at children, specifically focusing on narrative comprehension and reading comprehension in Portuguese language. The implementation includes generating multiple-choice questions based on narrative elements and reading passages.
Implementation: Utilizing large language models such as GPT-4o and Gemma-2 to generate multiple-choice questions (MCQs) using both one-step and two-step methods. These methods involve creating questions that assess narrative comprehension and reading comprehension capabilities.
Outcomes: Generated MCQs showed comparable quality to human-authored ones, enhancing engagement and understanding of narrative comprehension among young learners. The questions demonstrated a good level of well-formedness and narrative alignment.
Challenges: Issues related to semantic clarity, answerability, generating engaging distractors, and ensuring the questions are relevant and appropriately challenging for the target age group.
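The two-step method mentioned above can be sketched as a small pipeline: first draft a question/answer pair anchored to a narrative element, then generate distractors for it. This is a minimal illustration, not the study's actual implementation; `call_llm`, the prompts, and the sample passage are all hypothetical, and the model call is stubbed so the example is self-contained (a real version would call GPT-4o or Gemma-2 here).

```python
# Sketch of a two-step MCQ generation pipeline (illustrative only).
from dataclasses import dataclass, field

@dataclass
class MCQ:
    question: str
    answer: str
    distractors: list = field(default_factory=list)

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would query a hosted or local LLM.
    if "distractors" in prompt:
        return "the wolf; the hunter; the grandmother"
    return "Who is the main character?\tLittle Red Riding Hood"

def generate_mcq_two_step(passage: str, narrative_element: str) -> MCQ:
    # Step 1: ask for a question and its correct answer, tied to a
    # narrative element (character, setting, plot, ...).
    qa_prompt = (
        f"Passage: {passage}\n"
        f"Write one reading-comprehension question about the "
        f"{narrative_element} and its correct answer, separated by a tab."
    )
    question, answer = call_llm(qa_prompt).split("\t")

    # Step 2: ask for plausible but incorrect distractors for that question.
    d_prompt = (
        f"Passage: {passage}\nQuestion: {question}\nAnswer: {answer}\n"
        "Write three plausible distractors, separated by semicolons."
    )
    distractors = [d.strip() for d in call_llm(d_prompt).split(";")]
    return MCQ(question, answer, distractors)

mcq = generate_mcq_two_step(
    "Little Red Riding Hood walked into the forest...", "character"
)
print(mcq.question)
print(len(mcq.distractors))  # 3
```

Separating answer drafting from distractor generation lets each prompt be checked independently, which is one motivation the literature gives for two-step over one-step generation.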
Implementation Barriers
Technical Barrier
The reliability of AI-generated MCQs remains a concern, especially in high-stakes settings where teachers may hesitate to rely on imperfect models. Ensuring that generated questions are of adequate quality and appropriately difficult for different skill levels is a further challenge.
Proposed Solutions: Improving the validation processes for generated questions, enhancing models to minimize errors, and implementing evaluation metrics and expert reviews to assess the quality of generated questions.
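The evaluation metrics proposed above can include classical psychometric checks of the kind the study applies. A minimal sketch: item difficulty (proportion of students answering correctly) and a discrimination index (accuracy gap between top- and bottom-scoring groups). The response data below are invented for illustration, not taken from the study.

```python
# Classical item analysis for one generated MCQ (illustrative data).

def item_difficulty(responses):
    # responses: 0/1 scores for one item across all students.
    return sum(responses) / len(responses)

def item_discrimination(item_scores, total_scores, frac=0.27):
    # Rank students by total test score, then compare the item's
    # accuracy in the top group versus the bottom group.
    paired = sorted(zip(total_scores, item_scores), key=lambda p: p[0])
    k = max(1, int(len(paired) * frac))
    lower = [score for _, score in paired[:k]]
    upper = [score for _, score in paired[-k:]]
    return sum(upper) / k - sum(lower) / k

# Ten students: per-item 0/1 scores for one MCQ, and their test totals.
item = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
totals = [9, 3, 8, 7, 2, 6, 8, 4, 9, 7]

print(round(item_difficulty(item), 2))        # 0.7
print(item_discrimination(item, totals))      # 1.0
```

Items with very low discrimination (or negative values, where weaker students outperform stronger ones) are the kind of flawed questions an automated validation pass could flag for expert review.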
Language Barrier
Most studies on MCQ generation focus on English, leaving other languages, like Portuguese, less explored. Generating content in less-resourced languages can lead to issues with fluency and cultural relevance.
Proposed Solutions: Developing specific AI models and datasets for underrepresented languages to enhance the diversity of generated content, and pretraining models on local language datasets to improve fluency and contextual understanding.
Educational Barrier
Difficulty in aligning generated questions with educational standards and curriculum needs.
Proposed Solutions: Collaborating with educators during the design process to ensure alignment with pedagogical goals.
Project Team
Bernardo Leite
Researcher
Henrique Lopes Cardoso
Researcher
Pedro Pinto
Researcher
Abel Ferreira
Researcher
Luís Abreu
Researcher
Isabel Rangel
Researcher
Sandra Monteiro
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Bernardo Leite, Henrique Lopes Cardoso, Pedro Pinto, Abel Ferreira, Luís Abreu, Isabel Rangel, Sandra Monteiro
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI