Abstract
Support vector machines (SVMs) do not classify imbalanced data well. For some new types of equipment, fault data are scarce and difficult to collect, whereas normal data are easy to obtain in sufficient quantity. This imbalance significantly degrades SVM-based fault diagnosis: diagnostic accuracy falls, and the probabilities of missed detections and false alarms rise. Drawing on the principle of distance maximum entropy undersampling and introducing the concept of conditional entropy, this paper proposes a distance conditional maximum entropy undersampling strategy to improve the diagnostic performance of SVMs on imbalanced samples. Experiments show that the method is feasible and effective.
Source
《计算机测量与控制》
CSCD
Peking University Core Journal
2012, Issue 5, pp. 1203-1204, 1235 (3 pages)
Computer Measurement & Control
Keywords
support vector machine (SVM)
distance maximum entropy
conditional entropy
imbalanced data sets