Evaluating Vision-Language and Large Language Models for Automated Student Assessment in Indonesian Classrooms
Project Overview
This document examines the use of generative AI, specifically vision-language models (VLMs) and large language models (LLMs), for automated student assessment in Indonesian primary schools, focusing on Mathematics and English. It highlights the challenges of low-tech, under-resourced classrooms and stresses that AI tools must be adapted to local socio-cultural contexts. The findings indicate that LLM-generated feedback can support student learning, but handwriting recognition and the personalization of feedback remain significant obstacles. The study also underscores the need for AI applications that account for local languages and address the educational disparities in these settings. Overall, the research points to the potential of generative AI to improve educational outcomes, provided these considerations are addressed when deploying it in diverse, resource-limited environments.
Key Applications
Automated grading and feedback generation using VLMs and LLMs
Context: Primary school classrooms in Indonesia, particularly for grade 4 students in Mathematics and English subjects.
Implementation: The study collected handwritten responses from students, used a VLM to perform OCR on the handwriting, and then used LLMs to grade the transcribed answers and generate feedback.
Outcomes: The best performance for grading was achieved with GPT-4o, which produced accurate scores and relatively useful feedback. However, challenges in handwriting recognition and personalization of feedback were noted.
Challenges: Handwriting recognition errors, limited personalization of feedback, and context relevance issues due to reliance on English-centric AI models.
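The two-stage flow described under Implementation (VLM transcription followed by LLM grading and feedback) can be sketched roughly as below. This is a minimal illustration, not the study's actual code: the names Assessment, assess_answer, fake_transcribe, and fake_grade are hypothetical, and the two stand-in functions only mimic where real VLM and LLM calls would go.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Assessment:
    transcript: str  # text recognized from the handwritten answer
    score: float     # grade assigned by the grading step
    feedback: str    # feedback text intended for the student


def assess_answer(
    image_bytes: bytes,
    question: str,
    answer_key: str,
    transcribe: Callable[[bytes], str],
    grade: Callable[[str, str, str], Tuple[float, str]],
) -> Assessment:
    """Run one handwritten answer through OCR, then grading and feedback."""
    # Stage 1: handwriting recognition (a VLM in the study; pluggable here).
    transcript = transcribe(image_bytes).strip()
    # Stage 2: grading and feedback generation (an LLM in the study).
    score, feedback = grade(question, answer_key, transcript)
    return Assessment(transcript=transcript, score=score, feedback=feedback)


# Hypothetical stand-ins so the sketch runs without any model or network access.
def fake_transcribe(image_bytes: bytes) -> str:
    # Pretends the "image" is already text; a real system would call a VLM.
    return image_bytes.decode("utf-8")


def fake_grade(question: str, answer_key: str, transcript: str) -> Tuple[float, str]:
    # Exact-match grading; a real system would prompt an LLM with a rubric.
    if transcript == answer_key:
        return 1.0, "Correct."
    return 0.0, "Check your working again."


result = assess_answer(b" 12 ", "What is 7 + 5?", "12", fake_transcribe, fake_grade)
print(result)
```

Keeping the two stages as separate callables makes it visible that a transcription error in stage 1 propagates directly into the grade, which is the dependency the Challenges line points to.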
Implementation Barriers
Technical Barrier
VLMs struggle with accurately recognizing student handwriting, affecting grading accuracy. Improving OCR accuracy through better model training and integration with localized educational content is essential.
Proposed Solutions: Enhancing OCR technology and ensuring that it is tailored to recognize diverse handwriting styles.
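One common way to quantify handwriting recognition errors is the character error rate (CER): the edit distance between the recognized text and a human transcription, divided by the length of the human transcription. The source does not say which metric the study used, so the sketch below is an assumption offered for illustration, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting i characters of a to reach the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER of the OCR output (hypothesis) against a human transcription."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One wrong character out of four gives a CER of 0.25.
print(character_error_rate("abcd", "abed"))
```

Tracking a metric like this per student or per school would show whether changes to the OCR step actually help across diverse handwriting styles.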
Cultural Barrier
Existing models are primarily developed for English-speaking contexts, limiting their effectiveness in non-English-speaking regions. Adapting AI models to local languages and curricula is necessary to ensure socio-cultural relevance.
Proposed Solutions: Customizing AI systems to reflect local educational needs and facilitating inclusivity in educational resources.
Resource Barrier
Many schools, especially in rural areas, lack access to digital devices necessary for implementing AI solutions. Designing AI systems that function effectively in low-tech environments is crucial.
Proposed Solutions: Utilizing handwritten assessments and ensuring that AI tools can operate with minimal technological infrastructure.
Project Team
Nurul Aisyah
Researcher
Muhammad Dehan Al Kautsar
Researcher
Arif Hidayat
Researcher
Raqib Chowdhury
Researcher
Fajri Koto
Researcher
Contact Information
For information about the paper, please contact the authors.
Authors: Nurul Aisyah, Muhammad Dehan Al Kautsar, Arif Hidayat, Raqib Chowdhury, Fajri Koto
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI