Abstract
Support vector machines (SVMs) classify relatively poorly when the data contain outliers or are class-imbalanced. A membership-weighted fuzzy support vector machine for imbalanced classification has been proposed to address this, but its fuzzy membership does not adequately measure how much each sample contributes to determining the optimal separating hyperplane. To address this problem, a credibility-weighted fuzzy support vector machine based on density peaks (DP) clustering is proposed. First, outliers are identified by DP clustering and removed. Next, an initial degree of membership is computed from each sample's distance to the hyperplane determined by DEC (Different Error Costs), and the membership is then updated with an improved FSVM-CIL (Fuzzy Support Vector Machines for Class Imbalance Learning). Finally, some samples are removed, which reduces the sample set and lessens the effect of class imbalance. Experiments verify the effectiveness of the proposed algorithm.
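The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cutoff `d_c`, the thresholds `rho_max` and `delta_min`, the decay rate `beta`, and the exponential form of the membership function are all assumptions chosen for illustration, and the hyperplane `(w, b)` is assumed to come from an already trained DEC-style SVM.

```python
import math

def density_peaks_scores(points, d_c):
    """Return (rho, delta) per point, in the spirit of density-peaks clustering."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance d_c (assumed cutoff kernel)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        # delta: distance to the nearest point of strictly higher density
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # a point with no denser neighbour gets its largest distance instead
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

def flag_outliers(points, d_c, rho_max, delta_min):
    """Hypothetical rule: low density and far from any denser point."""
    rho, delta = density_peaks_scores(points, d_c)
    return [i for i in range(len(points))
            if rho[i] <= rho_max and delta[i] >= delta_min]

def initial_membership(points, w, b, beta=1.0):
    """Membership that decays with distance to the hyperplane w.x + b = 0.

    The exponential decay 2 / (1 + exp(beta * d)) is an assumed form,
    not necessarily the one used in the paper.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    memberships = []
    for p in points:
        d = abs(sum(wi * xi for wi, xi in zip(w, p)) + b) / norm
        memberships.append(2.0 / (1.0 + math.exp(beta * d)))
    return memberships

# Four clustered points and one isolated point
samples = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
outlier_idx = flag_outliers(samples, d_c=1.5, rho_max=0, delta_min=5.0)
kept = [p for i, p in enumerate(samples) if i not in outlier_idx]
weights = initial_membership(kept, w=(1.0, 0.0), b=-0.5)
```

In this sketch, samples nearer the hyperplane receive memberships closer to 1, so they carry more weight when a fuzzy SVM is retrained. The update with the improved FSVM-CIL and the final sample reduction are not shown, since the abstract does not give their details.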
Authors
SHENG Xiaoxia (盛晓遐)
YANG Zhimin (杨志民)
WANG Tiantian (王甜甜)
Affiliations: College of Science, Zhejiang University of Technology, Hangzhou 310023, China; Zhijiang College, Zhejiang University of Technology, Hangzhou 310024, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2019, Issue 10, pp. 169-178 (10 pages)
Funding
National Natural Science Foundation of China (No. 10926198)
Zhejiang Provincial Natural Science Foundation (No. LY16A010020)