Abstract
Random Forest is a widely used machine learning algorithm that offers strong classification and regression performance and is fast and efficient. The traditional Random Forest algorithm, however, does not examine in depth how its decision trees are generated and selected. This paper proposes a descending-order redundancy-removal search strategy to improve the supervised learning algorithm Random Forest, reducing the redundancy of the forest while preserving accuracy, and applies the improved algorithm to malware detection on the Android system. Five-fold cross-validation shows that the improved Random Forest algorithm maintains high accuracy at relatively low redundancy: its accuracy differs by less than 1% both from that of the original algorithm under the same conditions and from the accuracy obtained under the out-of-bag (OOB) model. In comparison experiments against the single-model classification algorithm KNN and the ensemble learning algorithm Adaboost M1, the improved Random Forest algorithm outperforms both.
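The abstract does not spell out the descending-order redundancy-removal procedure, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each tree's predictions on a held-out validation set are already available, ranks the trees by accuracy in descending order, and drops any tree that nearly duplicates one already kept. The function names, the agreement-based definition of "redundant", and the `max_agreement` threshold are all hypothetical.

```python
from collections import Counter


def accuracy(preds, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def agreement(a, b):
    """Fraction of samples on which two trees predict the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def prune_forest(tree_preds, labels, max_agreement=0.95):
    """Return indices of the trees kept after descending-order pruning.

    tree_preds[i] holds tree i's predictions on a held-out validation set.
    Assumption (not from the paper): a tree is "redundant" if it agrees
    with an already kept tree on more than max_agreement of the samples.
    """
    # Rank trees from most to least accurate (the "descending" step).
    order = sorted(range(len(tree_preds)),
                   key=lambda i: accuracy(tree_preds[i], labels),
                   reverse=True)
    kept = []
    for i in order:
        # Keep the tree only if it does not nearly duplicate a kept one.
        if all(agreement(tree_preds[i], tree_preds[j]) <= max_agreement
               for j in kept):
            kept.append(i)
    return kept


def majority_vote(tree_preds, kept):
    """Combine the kept trees' predictions by per-sample majority vote."""
    return [Counter(tree_preds[i][s] for i in kept).most_common(1)[0][0]
            for s in range(len(tree_preds[0]))]
```

In a malware-detection setting, the labels could encode 1 for malicious and 0 for benign applications, and the pruned forest's accuracy would then be compared with that of the full forest on the same cross-validation folds.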
Source
Journal of Xinjiang University (Natural Science Edition)
CAS
Peking University Core Journal
2017, Issue 3, pp. 322-327 (6 pages)
Funding
National Natural Science Foundation of China (61303231)