Article

FSRF: An Improved Random Forest for Classification

Wenxian Feng, Department of Software Engineering, Jilin University, Changchun, China
Chenkai Ma, Department of Software Engineering, Jilin University, Changchun, China
Guozhang Zhao, Department of Software Engineering, Jilin University, Changchun, China
Rui Zhang, Department of Computer Science, Jilin University, Changchun, China
2020, en
ABI

Abstract

The random forest algorithm is a flexible, easy-to-use machine learning algorithm that is widely used for classification. However, the traditional random forest has limitations. The randomness it adds to decision trees enters almost exclusively through feature selection during tree generation, so the fixed tree-generation rules can lead to relatively serious overfitting. In addition, on data whose features are high-dimensional and unbalanced, performance degrades substantially, because high-dimensional data usually contain many irrelevant and redundant features. To address these problems, we propose FSRF, an improved random forest algorithm. Building on the traditional random forest, we use feature selection methods to preprocess the data and obtain the feature subset with the best classification performance, from which the random forest is constructed. We also introduce sparse matrix projection to improve the generation of the random forest. Experiments show that our method reduces the influence of redundant features on classification and improves accuracy.
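The two preprocessing ideas named in the abstract, feature selection followed by a sparse matrix projection, can be illustrated with a minimal sketch. The abstract does not say which feature selection methods or which projection scheme FSRF uses, so everything here is an assumption made for illustration: the Fisher-style score, the matrix with entries in {-1, 0, +1}, the toy data, and all function names are hypothetical. The tree-building stage is omitted; the projected columns would serve as candidate split features for the trees.

```python
import random
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Score each feature for binary labels 0/1 (an assumed filter, not FSRF's)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = pvariance(a) + pvariance(b)
        # Squared gap between class means over the within-class spread.
        scores.append((mean(a) - mean(b)) ** 2 / (spread + 1e-12))
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features, sorted ascending."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def sparse_projection_matrix(n_features, n_outputs, density, rng):
    """Random matrix with entries in {-1, 0, +1}; nonzero with probability `density`."""
    return [
        [rng.choice((-1.0, 1.0)) if rng.random() < density else 0.0
         for _ in range(n_outputs)]
        for _ in range(n_features)
    ]


def project(X, A):
    """Multiply each sample (a row of X) by the projection matrix A."""
    return [
        [sum(row[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]
        for row in X
    ]


# Hypothetical toy data: feature 0 tracks the label, features 1-3 do not.
y = [i % 2 for i in range(20)]
X = [
    [10.0 * (i % 2) + 0.1 * (i % 3)] + [((i * (j + 2)) % 7) / 7.0 for j in range(1, 4)]
    for i in range(20)
]

selected = select_top_k(X, y, k=2)                       # step 1: keep a feature subset
X_sel = [[row[j] for j in selected] for row in X]
rng = random.Random(0)
A = sparse_projection_matrix(len(selected), 3, 0.5, rng)  # step 2: sparse projection
Z = project(X_sel, A)                                     # candidate split features
```

Selecting features before projecting keeps redundant columns from being mixed into the projected features, which is one plausible reading of the order described in the abstract.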

Citations and sources

Citations: 4
Sources used: 0