
Enhancing Security and Strengthening Defenses in Automated Short-Answer Grading Systems

Project Overview

The paper examines the use of generative AI in education, focusing on transformer-based automated short-answer grading systems in medical education. It highlights vulnerabilities in these grading systems that can be exploited through adversarial gaming strategies, raising concerns about the integrity of assessments. Several manipulation techniques are identified, and adversarial training methods are applied to improve the robustness of the AI systems; these methods are shown to significantly reduce susceptibility to manipulative inputs. The findings underscore the need for ongoing improvements to AI-powered educational tools so that they remain reliable and fair, particularly in high-stakes testing environments, and so that the integration of generative AI in education supports equitable assessment practices.

Key Applications

Automated Short Answer Grading (ASAG) system using transformer models

Context: Medical education, specifically for assessing student responses in exams

Implementation: The ACTA system uses transformer-based methods and incorporates adversarial training to improve grading accuracy and reduce vulnerability to gaming strategies (see the training sketch after this list).

Outcomes: Increased robustness of grading systems, reduced false positive rates, and improved trustworthiness of automated grading tools.

Challenges: Vulnerability to a range of gaming strategies that exploit weaknesses in the scoring model, and potential overfitting to the specific strategies seen during adversarial training.
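The summary does not include implementation details for ACTA, so the adversarial-training idea described above can only be illustrated with a generic sketch. The example below assumes a Hugging Face encoder fine-tuned as a binary correct/incorrect grader, with a hypothetical keyword-stuffing perturbation standing in for a gaming strategy; the model name, toy data, and perturbation are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of adversarial training for a transformer-based short-answer grader.
# Assumptions: the model name, the keyword-stuffing perturbation, and the toy data
# are illustrative only; they are not taken from the ACTA paper.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumption: any encoder fine-tuned for grading

class GradingDataset(Dataset):
    """Pairs of (student answer, label), where label 1 = correct and 0 = incorrect."""
    def __init__(self, examples, tokenizer):
        self.examples = examples
        self.tokenizer = tokenizer

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        text, label = self.examples[idx]
        enc = self.tokenizer(text, truncation=True, padding="max_length",
                             max_length=64, return_tensors="pt")
        return {k: v.squeeze(0) for k, v in enc.items()}, torch.tensor(label)

def keyword_stuffing(answer, keywords):
    """Hypothetical gaming strategy: append rubric keywords without any reasoning."""
    return answer + " " + " ".join(keywords)

# Original graded responses plus adversarial negatives labelled as incorrect,
# so the model learns that keyword presence alone should not earn credit.
originals = [("The patient should receive insulin because glucose is elevated", 1),
             ("Give antibiotics", 0)]
adversarial = [(keyword_stuffing("irrelevant text", ["insulin", "glucose"]), 0)]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
loader = DataLoader(GradingDataset(originals + adversarial, tokenizer),
                    batch_size=2, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(1):
    for batch, labels in loader:
        outputs = model(**batch, labels=labels)  # cross-entropy loss over 2 classes
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

In this data-augmentation form of adversarial training, gamed responses are added to the training set with an "incorrect" label, so that surface cues such as rubric keywords alone stop earning credit.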

Implementation Barriers

Technical

Automated grading systems are susceptible to adversarial attacks that can manipulate scoring.

Proposed Solutions: Implement adversarial training methods and ensemble techniques to enhance robustness.
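Adversarial training is sketched under Key Applications above; the ensemble side of the proposed solution could look like the following. This is a minimal sketch, assuming several independently fine-tuned grader checkpoints whose probability estimates are averaged, with disagreement between members used to flag responses for human review. The checkpoint paths and the agreement threshold are hypothetical and not taken from the paper.

```python
# Minimal sketch of ensemble scoring: average probabilities from several
# independently fine-tuned graders and flag low-agreement responses for
# human review. Checkpoint paths and the threshold are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_PATHS = ["grader-seed-0", "grader-seed-1", "grader-seed-2"]  # hypothetical checkpoints
AGREEMENT_THRESHOLD = 0.15  # assumption: max allowed std-dev across members

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATHS[0])
models = [AutoModelForSequenceClassification.from_pretrained(p).eval() for p in MODEL_PATHS]

def ensemble_score(answer: str):
    enc = tokenizer(answer, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # Probability of the "correct" class from each ensemble member.
        probs = torch.stack([m(**enc).logits.softmax(-1)[0, 1] for m in models])
    mean, spread = probs.mean().item(), probs.std().item()
    # High spread suggests the members disagree, e.g. on a gamed response.
    return {"score": mean, "spread": spread,
            "needs_human_review": spread > AGREEMENT_THRESHOLD}
```

Routing high-disagreement responses to human reviewers is one way to keep false positives low without a large increase in manual grading workload.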

Ethical

Concerns about fairness, transparency, and trust in AI-based scoring, especially regarding potential penalties for legitimate test-taking strategies.

Proposed Solutions: Ensure that AI systems do not unfairly penalize diverse linguistic styles and adopt ethical principles in development.

Project Team

Sahar Yarmohammadtoosky

Researcher

Yiyun Zhou

Researcher

Victoria Yaneva

Researcher

Peter Baldwin

Researcher

Saed Rezayi

Researcher

Brian Clauser

Researcher

Polina Harikeo

Researcher

Contact Information

For information about the paper, please contact the authors.

Authors: Sahar Yarmohammadtoosky, Yiyun Zhou, Victoria Yaneva, Peter Baldwin, Saed Rezayi, Brian Clauser, Polina Harikeo

Source Publication: View Original Paper

Project Contact: Dr. Jianhua Yang

LLM Model Version: gpt-4o-mini-2024-07-18

Analysis Provider: OpenAI
