Article

Customized K-nearest neighbors’ algorithm for malware detection

Mosleh M. Abualhaj, Department of Networks and Cybersecurity, Al-Ahliyya Amman University, Amman, 19328, Jordan
Ahmad Adel Abu-Shareha, Department of Data Science and Artificial Intelligence, Al-Ahliyya Amman University, Amman, 19328, Jordan
Qusai Y. Shambour, Department of Software Engineering, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, Jordan
Adeeb Alsaaidah, Department of Networks and Cybersecurity, Al-Ahliyya Amman University, Amman, 19328, Jordan
Sumaya N. Al-Khatib, Department of Computer Science, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, Jordan
Mohammed Anbar, National Advanced IPv6 Centre (NAv6), Universiti Sains Malaysia, Penang, Malaysia

2023, en
ABI

Abstract

The security and integrity of computer systems and networks depend heavily on malware detection. The K-Nearest Neighbors (KNN) algorithm is a popular and effective machine learning algorithm for this task. However, its performance depends strongly on the choice of distance metric. This study aims to improve malware detection by tuning the KNN algorithm's distance metric parameter. The distance metric determines how similar or dissimilar instances are judged to be in the feature space, so choosing or modifying it carefully can make KNN-based malware detection more accurate and effective. This paper analyzes multiple distance metrics, including Minkowski distance, Manhattan distance, and Euclidean distance. These metrics capture different aspects of similarity while accounting for the traits of malware samples. The KNN algorithm is evaluated on the MalMem-2022 malware dataset, and the results are broken down by these three distance metrics. The experimental findings show that, among the three, the Euclidean and Minkowski distance metrics produced the best outcomes for binary classification, whereas for multiclass classification the KNN algorithm achieved its best outcomes with the Manhattan distance.
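To illustrate why the distance metric matters, here is a minimal sketch, not the authors' implementation, of a KNN classifier whose metric is the Minkowski distance of order p. With p = 1 it reduces to the Manhattan distance and with p = 2 to the Euclidean distance. The function names, the toy feature vectors, and the labels are all hypothetical and are not taken from MalMem-2022.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def knn_predict(train_x, train_y, query, k, p):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: minkowski(train_x[i], query, p),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up two-dimensional feature vectors, for illustration only.
train_x = [(0.0, 0.0), (3.0, 2.0)]
train_y = ["benign", "malware"]
query = (0.0, 4.0)

for p in (1, 2, 3):
    print(p, knn_predict(train_x, train_y, query, k=1, p=p))
```

On this toy data the nearest neighbor of the query changes with the metric: under the Manhattan distance (p = 1) the first point is closer, while under the Euclidean distance (p = 2) the second one is, so the predicted label flips. This is the kind of sensitivity that evaluating several metrics on a real dataset is meant to measure.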

Citations and references

Citations: 2. References used: 0.