Article

The Impact of Normalization on Regression-Based Crop Yield Prediction: Accuracy and Efficiency Analysis

Dilmurod Khasanov, Jizzakh Branch of the National University of Uzbekistan, Department of Information Systems and Technologies, Jizzakh, Uzbekistan

Barno Daminova, Karshi State University, Department of Algorithms and Programming Technologies, Karshi, Uzbekistan

Ma'ruf Tojiyev, Samarkand State University, Department of Management Theory and Information Security, Samarkand, Uzbekistan

2026

Abstract

Normalization is a fundamental step in machine learning data preprocessing and often has a strong influence on model performance. This research examines the effects of different normalization methods on regression problems, considering both predictive accuracy and computational efficiency. As case studies, it uses two publicly available crop yield prediction datasets from Kaggle. Four widely used regression algorithms were trained and tested in two experimental settings: with and without data normalization. Their performance was measured using common regression metrics (RMSE, MAE, and R²). Computational efficiency was assessed through training time, memory usage, and convergence behavior. The experimental results indicated that for gradient-based algorithms, namely Linear Regression and MLP, normalization played a critical role in most cases, improving both accuracy and convergence speed on datasets with heterogeneous feature scales. In contrast, the effect on tree-based models (RF and GBRT) was insignificant because of their intrinsic scale invariance. In addition, this study highlights that poor preprocessing choices may lead to longer training or unstable learning dynamics in neural networks. These findings emphasize the importance of aligning normalization strategies with the nature of the dataset and the chosen algorithm. The results provide practical guidelines for optimizing regression pipelines for agricultural yield prediction and are expected to generalize to other data-driven decision-making problems in science and industry.
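The claim about gradient-based models and heterogeneous feature scales can be illustrated with a minimal sketch. It is not the authors' experimental code: the synthetic rainfall and temperature features, the coefficients, the learning rates, and the step count are all assumptions chosen for illustration, and the metrics are computed on the training data rather than on a held-out split. It fits a linear regression by plain batch gradient descent, once on raw features and once on z-score standardized features, and reports RMSE, MAE, and R² for both.

```python
import math
import random


def standardize(X):
    """Z-score each column: subtract its mean, divide by its standard deviation."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def fit_gd(X, y, lr, steps):
    """Fit y ~ w.x + b by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        errs = [sum(wj * xj for wj, xj in zip(w, row)) + b - t
                for row, t in zip(X, y)]
        grad_w = [2.0 / n * sum(e * row[j] for e, row in zip(errs, X))
                  for j in range(d)]
        grad_b = 2.0 / n * sum(errs)
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    return [sum(wj * xj for wj, xj in zip(w, row)) + b for row in X]


def metrics(y, pred):
    """Return RMSE, MAE, and R^2 for targets y and predictions pred."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    mae = sum(abs(t - p) for t, p in zip(y, pred)) / n
    return {"rmse": math.sqrt(ss_res / n), "mae": mae, "r2": 1.0 - ss_res / ss_tot}


# Hypothetical data: rainfall in mm (0 to 2000) and temperature in Celsius (10 to 35).
rng = random.Random(0)
X = [[rng.uniform(0, 2000), rng.uniform(10, 35)] for _ in range(200)]
y = [0.002 * rain + 0.1 * temp + rng.gauss(0, 0.1) for rain, temp in X]

# Raw features: the large rainfall scale forces a very small learning rate
# to keep the updates stable, so the other parameters barely move.
w_raw, b_raw = fit_gd(X, y, lr=1e-7, steps=200)
raw = metrics(y, predict(X, w_raw, b_raw))

# Standardized features: all columns share one scale, so a much larger
# learning rate is stable and the same step budget is enough to converge.
X_std = standardize(X)
w_std, b_std = fit_gd(X_std, y, lr=0.1, steps=200)
scaled = metrics(y, predict(X_std, w_std, b_std))

print("raw features:         ", raw)
print("standardized features:", scaled)
```

With the same number of gradient steps, the standardized run reaches a much lower error than the raw run. Scale-invariant learners such as decision trees split on feature thresholds, so this kind of rescaling would not be expected to change their fit in the same way.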

Topics

Smart Agriculture and AI; Advanced Statistical Methods and Models; Agricultural Economics and Practices

Identifiers

DOI: 10.1109/smartindustrycon68821.2026.11493015

Citations and references

Cited by 0

18 references