
Can ChatGPT and Bard Generate Aligned Assessment Items? A Reliability Analysis against Human Performance

Project Overview

This document examines the use of generative AI tools such as ChatGPT and Google Bard in educational settings, focusing on automated assessment and item generation. These tools could streamline the creation of assessment items and make assessment workflows more efficient. However, their reliability falls significantly short of that of human raters: AI can assist with assessment tasks, but it does not yet match human evaluation quality. Overall, the findings indicate that generative AI holds considerable promise for assessment practice in education, but that further improvement is needed before these tools can be considered effective and reliable.

Key Applications

Automated Item Generation (AIG)

Context: Educational assessment for language arts, mathematics, and sciences

Implementation: ChatGPT and Google Bard were compared with human raters on their ability to generate writing prompts and to assess the complexity of those prompts.

Outcomes: AI tools show potential in creating assessment items but require further training to match human performance.

Challenges: Low reliability compared to human raters; AI tools need to be fine-tuned for better accuracy in understanding complexity.
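
To make the idea of AI-versus-human reliability concrete, the sketch below computes Cohen's kappa, a common chance-corrected agreement statistic for two raters. This summary does not state which reliability measure the paper used, so the choice of kappa is an assumption, and the complexity ratings shown are hypothetical values invented purely for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical complexity ratings (1 = low, 3 = high) for six writing prompts.
human_ratings = [1, 2, 3, 2, 1, 3]
ai_ratings = [1, 3, 3, 2, 2, 3]

kappa = cohen_kappa(human_ratings, ai_ratings)
print(f"Cohen's kappa: {kappa:.2f}")
```

A kappa near 1 indicates strong agreement beyond chance, while values near 0 indicate agreement no better than chance. For ordinal complexity scales, a weighted kappa or an intraclass correlation may be a better fit; the unweighted form is used here only to keep the sketch short.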

Implementation Barriers

Technical Barrier

Generative AI tools are currently less reliable than human scorers, especially in judging the complexity of writing prompts.

Proposed Solutions: Further training and fine-tuning of AI models are needed to enhance their performance in educational contexts.

Project Team

Abdolvahab Khademi

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Abdolvahab Khademi

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
