Abstract
This paper proposes an outlier detection method based on random forests. Simulation experiments show that, compared with two other distance-based outlier detection techniques, the proposed method yields a greater improvement in model accuracy, is highly robust, and significantly reduces computation time on large-scale data sets.
Source
Journal of Fujian University of Technology (《福建工程学院学报》)
CAS
2007, No. 4, pp. 392-396 (5 pages)
Keywords
outlier detection
random forest
Mahalanobis distance