Article

Student Dropout Risk Classification Using CatBoost Algorithm in Higher Education Retention Systems

Nargis Kurbanazarova, Termez State University, Uzbekistan
Priya Sethuraman, St. Joseph's Institute of Technology, Department of Management Studies, Chennai, Tamil Nadu, India, 600 119
Deepak P, St. Joseph's Institute of Technology, Department of Management Studies, Chennai, Tamil Nadu, India, 600 119
Najmitdinov Akhadkhon Khamitdkhanovich
Yasir Mahmood Younus, Imam Al-Kadhum College (IKC), Department of Computer Techniques Engineering, Baghdad, Iraq
Atul Dattatraya Ghate, Kalinga University, Department of Management, Raipur, India

2025

ABI

Abstract

Many college students do not finish their degrees, which harms both the students and their institutions. Identifying at-risk students early is essential so that they can receive the help and resources they need. This study proposes a dropout risk classification model based on the CatBoost algorithm, which handles imbalanced datasets and categorical features well. The dataset contains demographic, engagement, and academic performance information for college students. After feature selection and data cleaning, the CatBoost model was trained and evaluated against more standard classifiers. CatBoost performed best among the methods tested, with an F1-score of 93.6%, compared to 91.2% for the baseline classifiers (Random Forest, SVM, and Logistic Regression). Using SHAP (SHapley Additive exPlanations) values for feature interpretation, we found that attendance and academic performance were among the most important predictors of dropout. The proposed model supports data-driven decision-making by institutions seeking to retain students. This study demonstrates the value of gradient boosting methods in developing student retention systems.
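The abstract reports F1-score rather than plain accuracy, which matters when dropouts are a small minority of the dataset. The following is a minimal sketch of that metric in pure Python, under stated assumptions: the labels, the 10% dropout rate, and the helper names `f1_score` and `accuracy` are hypothetical illustrations, not the study's data or code, and the numbers do not reproduce the reported results.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: 1 = dropped out, 0 = retained (10 of 100 drop out).
y_true = [1] * 10 + [0] * 90

# A trivial classifier that predicts "retained" for every student.
y_majority = [0] * 100

# A hypothetical model: catches 8 of 10 dropouts, raises 4 false alarms.
y_model = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 86

for name, y_pred in [("majority", y_majority), ("model", y_model)]:
    p, r, f1 = f1_score(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy setting the majority-class predictor reaches 90% accuracy while identifying no at-risk students, so its F1-score is zero. This is why F1 is the more informative figure for comparing dropout classifiers on imbalanced data.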

Topics

Online Learning and Analytics; Educational Technology and Assessment; Intelligent Tutoring Systems and Adaptive Learning

Identifiers

DOI: 10.1109/aistemedu67077.2025.11403945

Citations and references

Cited by 0
16 references
