Automated assessment of non-native learner essays: Investigating the role of linguistic features
Project Overview
The document examines the application of AI in education, focusing on Automated Essay Scoring (AES) systems that evaluate written responses from learners, particularly non-native speakers of English. It describes the development of predictive scoring models on two publicly available datasets, TOEFL11 and FCE, and identifies the linguistic features that most affect scoring accuracy. The study highlights the influence of the learner's native language on model performance and assesses how well individual features generalize across datasets. The findings show that although some linguistic features are predictive in varied contexts, no single feature set applies universally. The research underscores the potential of AI tools for writing assessment while acknowledging the complexity of evaluating language across diverse learner backgrounds: AES offers promising support for educational assessment, with the caveat that its performance is context-specific.
Key Applications
Automated Essay Scoring (AES)
Context: Language assessment for non-native English speakers in high-stakes exams like TOEFL and FCE.
Implementation: Developed predictive models using linguistic features from essays written by learners, trained on publicly available datasets.
Outcomes: Achieved 73% accuracy in classifying TOEFL11 essays into their three score levels (low/medium/high) and a Pearson correlation of 0.64 between predicted and actual scores on FCE essays.
Challenges: The influence of native language on prediction accuracy varies across settings, and features generalize only partially from one dataset to another.
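The models above are feature-based rather than end-to-end neural. As a rough illustration of this kind of setup, and not the paper's exact features or learners, the following Python sketch extracts a few simple linguistic features and evaluates a classifier on TOEFL11-style score levels (accuracy) and a regressor on FCE-style numeric scores (Pearson correlation); the feature set, model choices, and data handling are assumptions.

```python
# Hypothetical sketch of feature-based essay scoring (not the paper's exact setup).
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def extract_features(essay: str) -> list[float]:
    """A few simple, illustrative linguistic features from raw text."""
    tokens = essay.split()
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    n = max(len(tokens), 1)
    return [
        len(tokens),                             # essay length
        len(tokens) / max(len(sentences), 1),    # mean sentence length
        len({t.lower() for t in tokens}) / n,    # type-token ratio
        sum(len(t) for t in tokens) / n,         # mean word length
    ]

def evaluate_toefl11_style(essays, levels):
    """Classify essays into low/medium/high levels; return held-out accuracy."""
    X = np.array([extract_features(e) for e in essays])
    X_tr, X_te, y_tr, y_te = train_test_split(X, levels, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

def evaluate_fce_style(essays, scores):
    """Predict numeric scores; return Pearson's r on a held-out split."""
    X = np.array([extract_features(e) for e in essays])
    X_tr, X_te, y_tr, y_te = train_test_split(X, scores, test_size=0.2, random_state=0)
    reg = LinearRegression().fit(X_tr, y_tr)
    return pearsonr(y_te, reg.predict(X_te))[0]
```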
Implementation Barriers
Technical
Limited understanding of which linguistic features are most predictive of writing proficiency.
Proposed Solutions: Exploring multiple datasets and linguistic feature groups to improve model accuracy and to test which features generalize across corpora (sketched below).
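One hypothetical way to probe which features matter, assuming a precomputed feature matrix and scores, is a feature-group ablation: score each group of features in isolation with cross-validation, then repeat on a second corpus to see which groups transfer. The groups and column indices below are illustrative placeholders.

```python
# Hypothetical feature-group ablation over a precomputed feature matrix X
# (rows = essays, columns = features) with scores y; groups and column
# indices are illustrative placeholders, not the paper's feature inventory.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

FEATURE_GROUPS = {
    "lexical":   [0, 1, 2],  # e.g. type-token ratio, word-length statistics
    "syntactic": [3, 4, 5],  # e.g. parse-tree depth, clauses per sentence
    "discourse": [6, 7],     # e.g. connective and reference-chain counts
}

def ablate(X: np.ndarray, y: np.ndarray) -> dict[str, float]:
    """Cross-validated R^2 for each feature group used in isolation.
    Running this on two corpora shows which groups generalize."""
    return {
        name: cross_val_score(LinearRegression(), X[:, cols], y,
                              cv=5, scoring="r2").mean()
        for name, cols in FEATURE_GROUPS.items()
    }
```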
Data Availability
Dependence on publicly available datasets, which may not represent all learner backgrounds.
Proposed Solutions: Encouraging the development of more diverse and representative datasets.
Cultural
Variability in writing styles and proficiency levels among learners from different native language backgrounds.
Proposed Solutions: Incorporating native language as a predictive feature and conducting further research on its impact.
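A minimal sketch of this idea, assuming essays live in a pandas DataFrame with numeric feature columns plus an "l1" column holding the native-language label (column names and pipeline are hypothetical): encode L1 as a one-hot categorical feature alongside the scaled numeric features.

```python
# Hypothetical sketch: include the writer's native language (L1) as a
# one-hot feature alongside numeric linguistic features.
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def build_model(numeric_cols: list[str]) -> Pipeline:
    pre = ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),                  # scaled linguistic features
        ("l1", OneHotEncoder(handle_unknown="ignore"), ["l1"]),   # native-language label
    ])
    return Pipeline([("pre", pre), ("clf", LogisticRegression(max_iter=1000))])

# Usage (hypothetical column names):
#   model = build_model(["essay_len", "ttr"]).fit(df, df["level"])
```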
Project Team
Sowmya Vajjala
Researcher
Contact Information
For information about the paper, please contact the author.
Author: Sowmya Vajjala
Source Publication: View Original Paper
Project Contact: Dr. Jianhua Yang
LLM Model Version: gpt-4o-mini-2024-07-18
Analysis Provider: OpenAI