Evaluation of Classical and Ensemble Machine Learning Algorithms in KNN-Based Missing Data Imputation
Abstract
This study examined how to handle data with complex structure, extreme values, and missing (NaN) values when training machine learning models. Two strategies for missing data were compared: removing NaN values and filling them in with k-nearest-neighbors (KNN) imputation. Each resulting dataset was then evaluated with conventional machine learning techniques and with the ensemble learning-based XGB model. KNN imputation improved the accuracy of almost all models. The XGB model obtained 85% accuracy after KNN imputation, compared with 82% after NaN values were removed. The findings indicate that imputing missing values, rather than discarding them, improves the generalizability of models.
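The two preprocessing strategies compared above can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation: the abstract does not state the number of neighbors, the distance metric, or whether deletion was row-wise, so `k=2`, the NaN-aware Euclidean distance (the same rescaling idea used by scikit-learn's `KNNImputer`), row-wise deletion, the toy data, and the function names `nan_euclidean`, `knn_impute`, and `drop_nan_rows` are all assumptions made for illustration.

```python
import math


def nan_euclidean(a, b):
    """Euclidean distance over coordinates observed in both rows, rescaled
    to compensate for skipped coordinates; inf if no coordinate is shared."""
    shared = [(x, y) for x, y in zip(a, b)
              if not (math.isnan(x) or math.isnan(y))]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(len(a) / len(shared) * squared)


def knn_impute(rows, k=2):
    """Return a copy of rows in which each NaN is replaced by the mean of
    that column over the k nearest rows that have the column observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isnan(value):
                continue
            # Candidate donors: other rows where column j is observed,
            # ordered by their distance to the incomplete row.
            donors = sorted(
                (nan_euclidean(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and not math.isnan(other[j])
            )
            nearest = [v for d, v in donors[:k] if d != math.inf]
            if nearest:  # leave the NaN in place if no usable donor exists
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def drop_nan_rows(rows):
    """Baseline (assumed row-wise): discard rows containing any NaN."""
    return [list(r) for r in rows if not any(math.isnan(v) for v in r)]


# Toy data, purely illustrative: the second row is missing its second feature.
nan = math.nan
data = [
    [1.0, 2.0],
    [1.0, nan],
    [1.2, 2.2],
    [8.0, 9.0],
]

imputed = knn_impute(data, k=2)   # keeps all four rows
dropped = drop_nan_rows(data)     # keeps only the three complete rows
```

In this toy example the incomplete row is filled from its two closest neighbors (the rows with second-feature values 2.0 and 2.2), so imputation retains all four records, while deletion leaves three. Either output could then be passed to a classifier to compare accuracy, as the study does with conventional models and XGB.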