Article

Hybrid Approach to Automated Essay Scoring: Integrating Deep Learning Embeddings with Handcrafted Linguistic Features for Improved Accuracy

Muhammad Faseeh, Department of Electronic Engineering, Jeju National University, Jeju-si 63243, Republic of Korea
Abdul Jaleel, Department of Information Technology, Asia Pacific International College, Parramatta, Sydney 2150, Australia
Naeem Iqbal, Centre for Secure Information Technologies (CSIT), Momentum One Zero (M1.0), School of Electronics Electrical Engineering and Computer Science, Queen’s University Belfast, Belfast BT3 9DT, UK
Anwar Ghani, Big Data Research Center, Department of Computer Engineering, Jeju National University, Jeju-si 63243, Republic of Korea
Akmalbek Abdusalomov, Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
Asif Mehmood, Department of Biomedical Engineering, College of IT Convergence, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea
Young Im Cho, Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea

Mathematics · journal · 2024 · en

ABI

Abstract

Automated Essay Scoring (AES) systems face persistent challenges in delivering accurate and efficient evaluations. This study introduces an approach that combines embeddings generated using RoBERTa with handcrafted linguistic features, leveraging Lightweight XGBoost (LwXGBoost) for enhanced scoring precision. The embeddings capture the contextual and semantic aspects of essay content, while the handcrafted features incorporate domain-specific attributes such as grammar errors, readability, and sentence length. This hybrid feature set allows LwXGBoost to handle high-dimensional data and model intricate feature interactions effectively. Our experiments on a diverse AES dataset, consisting of essays from students across various educational levels, yielded a quadratic weighted kappa (QWK) score of 0.941. This result indicates high scoring accuracy and robustness of the model against noisy and sparse data. The research underscores the potential of integrating embeddings with traditional handcrafted features to improve automated assessment systems.
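The headline result in the abstract is reported as QWK (quadratic weighted kappa), an agreement measure between human and predicted scores that penalizes a disagreement by the squared distance between the two ratings. As a minimal sketch of how such a score can be computed, assuming integer ratings on a known scale, the following pure-Python function may help; the function name, its parameters, and the sample ratings are illustrative assumptions and are not taken from the paper.

```python
def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Quadratic weighted kappa between two lists of integer ratings.

    Illustrative helper (not from the paper): ratings are assumed to be
    integers in the closed range [min_rating, max_rating].
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    n_ratings = max_rating - min_rating + 1
    if n_ratings < 2:
        raise ValueError("the rating scale needs at least two levels")

    n = len(y_true)

    # Observed confusion matrix: rows are true ratings, columns are predictions.
    observed = [[0] * n_ratings for _ in range(n_ratings)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1

    # Marginal histograms of the true and the predicted ratings.
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_ratings))
                 for j in range(n_ratings)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_ratings):
        for j in range(n_ratings):
            # Quadratic penalty, scaled to lie in [0, 1].
            weight = (i - j) ** 2 / (n_ratings - 1) ** 2
            # Count expected under chance agreement (independent marginals).
            expected = hist_true[i] * hist_pred[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters used one and the same rating for every essay.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings on a 1-4 scale, for illustration only.
human = [1, 2, 3, 4]
model = [1, 2, 3, 4]
kappa = quadratic_weighted_kappa(human, model, 1, 4)
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.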

Topics

Topic Modeling · Natural Language Processing Techniques · Software Engineering Research

Identifiers

DOI: 10.3390/math12213416

Citations and references

Citations: 2 · References: 48
