
Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications

Project Overview

The document examines the role of generative AI, particularly prompted large language models (LLMs), in enhancing educational practices and assessment strategies. It emphasizes the innovative applications of LLMs, such as generating open-ended questions from textbooks, evaluating human resource interview transcripts, and correcting grammatical errors in underrepresented languages like Bengali. The research assesses the performance of these AI models against human experts, revealing their effectiveness in specific tasks while also highlighting the challenges and limitations encountered during their integration into educational contexts. Overall, the findings suggest that while generative AI has significant potential to transform educational practices, careful consideration of its implementation challenges is essential for maximizing its benefits.

Key Applications

Generating open-ended questions and multiple-choice questions (MCQs) from textbooks

Context: Educational context for both school-level and undergraduate students across various technical and non-technical subjects, including multiple languages like Bengali, Hindi, German, and English.

Implementation: Utilizing prompt-based techniques and multi-stage prompting strategies to generate open-ended questions and MCQs from educational texts (such as NCERT and technical textbooks). This includes curating specialized datasets for effective question generation, leveraging models like T5 LARGE and GPT-based architectures.

Outcomes: Improved quality and diversity of generated questions, with T5 LARGE outperforming other models on automated evaluation metrics, although all models still fall short of human baseline performance. Enhanced distractor generation and question quality have been noted in multiple languages.

Challenges: Inadequate existing QA datasets for educational settings, difficulty in matching human expertise, and the need for further refinement and fine-tuning for low-resource languages.
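The multi-stage prompting strategy described above can be sketched in a minimal form: one stage prompts the model for an open-ended question from a passage, and a second stage prompts for MCQ distractors. The prompt wording and function names here are illustrative assumptions, not the paper's actual templates.

```python
def build_question_prompt(passage: str, subject: str) -> str:
    """Stage 1: ask the LLM to write one open-ended question from a passage."""
    return (
        f"You are a {subject} teacher. Read the passage below and write one "
        f"open-ended question that tests understanding of its main idea.\n\n"
        f"Passage:\n{passage}\n\nQuestion:"
    )

def build_distractor_prompt(question: str, answer: str) -> str:
    """Stage 2: ask the LLM for plausible but incorrect MCQ options."""
    return (
        f"Question: {question}\n"
        f"Correct answer: {answer}\n"
        f"Write three plausible but incorrect answer options, one per line."
    )

# Each prompt would be sent to an LLM (e.g. a GPT-based model) in turn,
# with stage 2 consuming the question and answer produced by stage 1.
q_prompt = build_question_prompt("Photosynthesis converts light energy...", "biology")
d_prompt = build_distractor_prompt("What does photosynthesis convert?", "Light energy")
```

Separating question generation from distractor generation lets each stage be evaluated and refined independently, which is one motivation for multi-stage prompting.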

Assessing and providing feedback on interview transcripts and grammatical errors

Context: Language learning and HR interview preparation contexts for candidates, particularly L2 English speakers and students learning languages like Bengali.

Implementation: Creating specialized datasets (such as HURIT) for evaluating LLMs' performance in assessing HR interview transcripts and providing grammatical error explanations. This includes investigating LLM capabilities in providing detailed feedback and error identification.

Outcomes: LLMs demonstrated competence in scoring interview transcripts but struggled with error identification and feedback provision. Identified shortcomings in providing detailed grammatical explanations.

Challenges: Limited real-world datasets for specific contexts, need for human oversight to ensure quality feedback, and lack of comprehensive feedback mechanisms.
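A transcript-scoring pipeline like the one described can be sketched as a prompt builder plus a parser that extracts the model's numeric score. The rubric wording, 1-5 scale, and function names are assumptions for illustration, not the HURIT evaluation protocol itself.

```python
import re
from typing import Optional

def build_scoring_prompt(transcript: str) -> str:
    """Ask the model to score an HR interview answer on a 1-5 rubric."""
    return (
        "Score the following HR interview response on a scale of 1 (poor) "
        "to 5 (excellent), considering fluency, relevance, and grammar. "
        "Reply with 'Score: N' followed by brief feedback.\n\n"
        f"Transcript:\n{transcript}"
    )

def parse_score(llm_output: str) -> Optional[int]:
    """Extract the first 1-5 integer from model output; None if absent."""
    match = re.search(r"\b([1-5])\b", llm_output)
    return int(match.group(1)) if match else None

# Example: parsing a typical model reply.
score = parse_score("Score: 4. The candidate answered clearly but made minor errors.")
```

Parsing the score separately from the free-text feedback reflects the finding above: the numeric scoring was relatively reliable, while the explanatory feedback still needed scrutiny.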

Implementation Barriers

Data availability and language resource disparity

Inadequate existing QA datasets for educational contexts, especially for prompt-based question generation. Low-resource languages like Bengali lack sufficient datasets for training effective LLMs.

Proposed Solutions: Curating new datasets tailored for educational purposes, such as EduProbe for school-level subjects, and developing authentic datasets for low-resource languages while integrating manual checks for grammatical error correction.
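A curated record in such a dataset might pair a passage with a human-verified question and answer. The field names below are assumptions for illustration, not the actual EduProbe schema.

```python
# Illustrative record layout for a curated question-generation dataset.
record = {
    "passage": "...",            # source text from a textbook
    "subject": "history",
    "grade_level": "school",
    "language": "bn",            # e.g. Bengali, a low-resource language
    "question": "...",
    "answer": "...",
    "verified_by_human": True,   # the manual check proposed above
}
```

Tagging each record with language and grade level makes it possible to measure model performance separately on low-resource languages, where the gap to human experts is largest.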

Model limitations

LLMs often fall short of human expertise in generating high-quality, contextually relevant questions.

Proposed Solutions: Further research and refinement of LLMs to improve their performance over time.

Human oversight requirement

LLMs struggle with error identification and providing actionable feedback in assessments.

Proposed Solutions: Adopting a human-in-the-loop approach to supplement LLM evaluations with human expertise.
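One common way to realize a human-in-the-loop design is confidence-based routing: accept high-confidence LLM assessments automatically and send the rest to a human reviewer. The threshold and function name below are illustrative assumptions, not a mechanism specified in the paper.

```python
from typing import Optional, Tuple

def route_assessment(llm_score: float, llm_confidence: float,
                     threshold: float = 0.8) -> Tuple[str, Optional[float]]:
    """Accept high-confidence LLM scores; route the rest for human review.

    Returns ("auto", score) when confidence meets the threshold,
    otherwise ("human_review", None) so an expert supplies the score.
    """
    if llm_confidence >= threshold:
        return ("auto", llm_score)
    return ("human_review", None)

# Example: a confident score is accepted, an uncertain one is escalated.
accepted = route_assessment(4.0, 0.92)
escalated = route_assessment(3.0, 0.45)
```

This keeps human expertise focused on the cases where the findings above show LLMs are weakest, such as error identification and detailed feedback.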

Project Team

Subhankar Maity

Researcher

Aniket Deroy

Researcher

Sudeshna Sarkar

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Subhankar Maity, Aniket Deroy, Sudeshna Sarkar

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
