Article

Grade-Level Classification of Uzbek Literary Texts Using Text Complexity Features (Grades 5-9)

Sapura Sattarova, Al-Beruni Urgench State University, Department of Computer Science, Urgench, Uzbekistan

2025

ABI

Abstract

This study introduces a supervised machine-learning framework for determining the grade-level suitability of Uzbek literary texts for students in Grades 5–9. Given the growing demand for personalized, competency-based education, aligning reading materials with students’ cognitive development remains a critical challenge, particularly in low-resource languages such as Uzbek. We construct a grade-specific School Textbook Corpus from 66 official textbooks as the empirical basis for linguistic profiling. Each document is processed to extract key complexity indicators: average sentence length, type–token ratio (TTR), conjunction density (CD), and term frequency–inverse document frequency (TF–IDF) features computed over word- and character-level n-grams. These features are normalized and used to train a multinomial logistic-regression classifier that predicts the most suitable grade level for unseen texts. The model is evaluated on internal textbook segments and on external literary works, including titles recommended for the national certification exam in literature. The results show high grade-classification accuracy, supporting the model’s applicability for educators, curriculum designers, and ed-tech platforms. The approach offers a scalable, interpretable way to select literary content and supports the broader goal of fostering age-appropriate reading comprehension in Uzbek secondary schools.
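To make the three hand-crafted indicators named in the abstract concrete, here is a minimal sketch of how they might be computed for a single text. It is an illustration under stated assumptions, not the authors' implementation: the function name `complexity_features`, the regex-based sentence and word splitting, and the short `CONJUNCTIONS` list are all hypothetical choices made for this example, and the abstract does not specify how the paper defines or tokenizes any of them.

```python
import re

# Illustrative, incomplete list of Uzbek conjunctions.
# This is an assumption for the sketch, not the list used in the paper.
CONJUNCTIONS = {"va", "hamda", "ammo", "lekin", "biroq", "yoki", "chunki", "agar"}


def complexity_features(text):
    """Return average sentence length, type-token ratio, and conjunction density."""
    # Naive sentence split on runs of ., ! or ?; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased word tokens; an apostrophe inside a word (o'qidi) is kept.
    tokens = re.findall(r"[^\W\d_]+(?:['ʻʼ’][^\W\d_]+)*", text.lower())
    if not sentences or not tokens:
        return {"avg_sentence_length": 0.0, "ttr": 0.0, "conjunction_density": 0.0}
    return {
        # Mean number of word tokens per sentence.
        "avg_sentence_length": len(tokens) / len(sentences),
        # Distinct word forms divided by total word tokens.
        "ttr": len(set(tokens)) / len(tokens),
        # Share of tokens that appear in the conjunction list.
        "conjunction_density": sum(t in CONJUNCTIONS for t in tokens) / len(tokens),
    }


features = complexity_features("Bola kitob o'qidi va uxladi. Bola kuldi.")
```

In the pipeline the abstract describes, values like these would be normalized and combined with the TF–IDF n-gram features before being passed to the multinomial logistic-regression classifier; that step is not shown here.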

Topics

Text Readability and Simplification; Authorship Attribution and Profiling; Reading and Literacy Development

Identifiers

DOI: 10.1109/apeie66761.2025.11289265

Citations and references

Citations: 0
References used: 12
