Article

Algorithms for Selecting the Most Efficient Method for Solving Classification Problems

Otabek Khujaev, Urgench Branch of Tashkent University of Information Technologies Named After Muhammad al-Khwarizmi, Urgench, Uzbekistan
Bonura B. Nurmetova, Urgench Branch of Tashkent University of Information Technologies Named After Muhammad al-Khwarizmi, Urgench, Uzbekistan
Tohir K. Urazmatov, Urgench Branch of Tashkent University of Information Technologies Named After Muhammad al-Khwarizmi, Urgench, Uzbekistan

2023 (en)

ABI

Abstract

This research paper provides an exhaustive analysis of classification algorithms, a central concern in the field of intellectual data analytics. The paper aims to serve as a comprehensive resource for both academic researchers and industry practitioners by offering insights into the selection of the most appropriate classification algorithm tailored to specific analytical and computational needs. The study categorically divides the algorithms into four primary approaches: Probability-Based Methods, Decision Tree Methodologies, Nearest Neighbors Approaches, and Mathematical Function Techniques. Each category is further dissected to include a variety of algorithms, such as Naive Bayes, 1R, ID3, CART, KNN, and Support Vector Machines, among others, along with their numerous adaptations like selective Naive Bayes, GSVM, and TWSVMs.

The paper not only lists these algorithms but also delves into their practical applications. For instance, Bayesian classifiers are commonly used in text classification tasks like spam filtering, while decision tree algorithms like CART and CHAID find frequent applications in healthcare for medical diagnosis. Nearest neighbor algorithms, particularly advanced versions like WKPDS, are effective in high-dimensional tasks such as image recognition. Mathematical function-based techniques like SVM and its adaptations are applied in complex scenarios ranging from financial forecasting to natural language processing.

Moreover, the paper addresses the critical issue of model reliability by discussing the limitations of conventional data partitioning into training and test sets. It advocates for the use of the K-Fold Cross-Validation method as a more reliable alternative for assessing model performance. This method is particularly useful in scenarios where the test objects are closely related, or there is a lack of diverse learning instances, which could otherwise lead to significant classification errors and ambiguous outcomes.

In summary, this paper offers a detailed overview of the landscape of classification algorithms, emphasizing their strengths, limitations, and areas of application. It also provides methodological recommendations for model testing, thereby serving as an invaluable guide for selecting the most effective classification algorithm based on both accuracy and computational efficiency.
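The K-Fold Cross-Validation procedure recommended in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`k_fold_indices`, `nearest_neighbor_predict`, `cross_val_accuracy`), the choice of a 1-nearest-neighbor classifier as the model under test, and the synthetic two-cluster data are all assumptions made purely for illustration. Each sample is used for testing exactly once and for training in the remaining k-1 rounds, and the per-fold accuracies are averaged.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def nearest_neighbor_predict(train_X, train_y, x):
    """1-NN stand-in classifier: label of the closest training point to x."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def cross_val_accuracy(X, y, k=5, seed=0):
    """Return (mean accuracy, per-fold accuracies) over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        # Train on every sample that is not in the held-out fold.
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            nearest_neighbor_predict(train_X, train_y, X[i]) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores), scores


# Synthetic toy data (an assumption for this sketch): two Gaussian clusters.
rng = random.Random(42)
X = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
X += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(20)]
y = [0] * 20 + [1] * 20

mean_acc, fold_accs = cross_val_accuracy(X, y, k=5)
print(f"per-fold accuracy: {fold_accs}")
print(f"mean accuracy: {mean_acc:.3f}")
```

To compare several candidate classifiers in the spirit of the paper, the same folds (same `seed`) would be reused for each model so that differences in mean accuracy are not an artifact of the partition.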

Topics

Imbalanced Data Classification Techniques; Machine Learning and Data Classification; Advanced Statistical Methods and Models

Identifiers

DOI: 10.1109/apeie59731.2023.10347690

Citations and references

1 citation, 20 references
