Article

Data preprocessing on input

Shavkat Madrakhimov, National University of Uzbekistan named after Mirzo Ulugbek, University street 4, Tashkent 100174, Uzbekistan
Kodirbek Makharov, National University of Uzbekistan named after Mirzo Ulugbek, University street 4, Tashkent 100174, Uzbekistan
Musulmon Lolaev, National University of Uzbekistan named after Mirzo Ulugbek, University street 4, Tashkent 100174, Uzbekistan

AIP Conference Proceedings, journal, 2021, English

ABI

Abstract

Many factors affect the success of machine learning on a given task. First of all, quality data is needed. Data preprocessing directly affects the quality of intelligent data analysis, because solving problems on the initial unprocessed sample does not give the expected result and can lead to erroneous conclusions [1]. The causes include a number of errors, such as duplicated data, impossible attribute values, missing values, and so on. Such data can arise for a variety of reasons, such as data entry, the use of different formats or units of measurement, incorrect removal of duplicate records, and so on [2]. Algorithms give more reliable results when errors in the initial sample, such as logically impossible feature values in the descriptions of objects, are detected and the affected objects are removed from the sample during preprocessing. This article proposes defining a range of possible values for each pair of quantitative features. An incorrect object can then be identified at input time by the pairs of feature values that do not fall into the corresponding ranges.
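The abstract does not say how the range for a pair of features is constructed, so the following is only a minimal sketch under a stated assumption: the range for a pair (a, b) is taken to be the interval between the smallest and largest ratio a / b observed in a trusted reference sample. The function names `learn_pair_ranges` and `violated_pairs`, the feature names, and the sample values are all hypothetical and are not taken from the article.

```python
from itertools import combinations


def learn_pair_ranges(reference_rows, features):
    """Return {(a, b): (low, high)} for the ratio a / b over trusted rows.

    Assumption: the "range of possible values" for a pair is the
    min..max interval of the ratio; the article may define it differently.
    """
    ranges = {}
    for a, b in combinations(features, 2):
        ratios = [row[a] / row[b] for row in reference_rows if row[b] != 0]
        if ratios:  # skip pairs that have no usable reference rows
            ranges[(a, b)] = (min(ratios), max(ratios))
    return ranges


def violated_pairs(row, ranges):
    """List the feature pairs whose ratio in `row` falls outside its range."""
    bad = []
    for (a, b), (low, high) in ranges.items():
        if row[b] == 0:
            bad.append((a, b))  # ratio undefined: treat the pair as suspicious
            continue
        ratio = row[a] / row[b]
        if not (low <= ratio <= high):
            bad.append((a, b))
    return bad


# Hypothetical trusted sample used only to derive the pairwise ranges.
reference = [
    {"height_cm": 160.0, "weight_kg": 55.0, "age": 30.0},
    {"height_cm": 175.0, "weight_kg": 80.0, "age": 45.0},
    {"height_cm": 190.0, "weight_kg": 95.0, "age": 25.0},
]
ranges = learn_pair_ranges(reference, ["height_cm", "weight_kg", "age"])

ok_row = {"height_cm": 170.0, "weight_kg": 70.0, "age": 35.0}
bad_row = {"height_cm": 17.0, "weight_kg": 70.0, "age": 35.0}  # mistyped height

print(violated_pairs(ok_row, ranges))
print(violated_pairs(bad_row, ranges))
```

In this sketch a new object is checked at input time: an empty result means every pair lies inside its range, while a non-empty result names the pairs that disagree. A feature that appears in several violated pairs (here the mistyped height) is the likely source of the error.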


Topics

Data Quality and Management; Data Management and Algorithms; Advanced Database Systems and Queries

Identifiers

DOI: 10.1063/5.0058132

Citations and sources

Citations: 5; References used: 13
