Abstract
In software defect prediction, defect data are commonly class-imbalanced, which seriously degrades the performance of traditional prediction models. To alleviate the impact of class imbalance, this paper draws on fuzzy set theory and proposes a fuzzy weighted extreme learning machine based on relative density information (FWELM-RD). First, the proposed relative density method is used to construct a weight matrix suited to different data samples. Next, the weight matrix is combined with the traditional weighted extreme learning machine, and a fuzzy extreme learning machine is trained. Finally, the validity and feasibility of the proposed method are verified on class-imbalanced NASA software defect datasets. The experimental results indicate that, compared with a number of class-imbalance software defect prediction methods, FWELM-RD achieves better predictive performance and performs better on the G-mean, AUC and Balance measures.
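The three steps described in the abstract (density-based weights, weighted extreme learning machine, training) can be illustrated with a minimal sketch. The abstract does not give the relative density formula, so the weighting below is an assumption: each sample's density is taken as the inverse of the distance to its k-th nearest same-class neighbour, normalised by the class maximum, then divided by the class size so that minority samples weigh more. The function names, the parameters `k`, `n_hidden`, `C` and the sigmoid hidden layer are likewise illustrative choices, not the authors' implementation. The output weights use the standard weighted ELM closed form, beta = (I/C + H^T W H)^(-1) H^T W t.

```python
import numpy as np

def relative_density_weights(X, y, k=5):
    """Assumed weighting scheme (not taken from the paper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = np.zeros(len(y))
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        Xc = X[idx]
        # Pairwise Euclidean distances inside class c.
        d = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)
        kk = min(k, len(idx) - 1)
        # After sorting, column 0 is the zero self-distance,
        # so column kk holds the kk-th nearest neighbour distance.
        d_k = np.sort(d, axis=1)[:, kk]
        density = 1.0 / (d_k + 1e-12)
        membership = density / density.max()   # fuzzy membership in (0, 1]
        w[idx] = membership / len(idx)         # smaller class, larger weight
    return w

def _hidden(X, A, b):
    # Sigmoid hidden layer with fixed random parameters.
    return 1.0 / (1.0 + np.exp(-(np.asarray(X, dtype=float) @ A + b)))

def train_weighted_elm(X, y, w, n_hidden=20, C=1.0, seed=0):
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    A = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = _hidden(X, A, b)
    t = np.where(np.asarray(y) == 1, 1.0, -1.0)  # defective = +1
    HtW = H.T * w                                # H^T W with W = diag(w)
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ t)
    return A, b, beta

def predict(model, X):
    A, b, beta = model
    return (_hidden(X, A, b) @ beta > 0).astype(int)
```

A typical call would be `w = relative_density_weights(X, y)` followed by `model = train_weighted_elm(X, y, w)` and `predict(model, X_test)`, with label 1 marking defective modules. Dividing by class size is only one way to counter imbalance; the paper's own weight matrix may differ.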
Authors
ZHENG Shang; SUN Dan; YU Hualong (School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212003, China)
Source
Journal of Jiangsu University of Science and Technology: Natural Science Edition
CAS
2019, No. 4, pp. 67-73 (7 pages)
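Funding
National Natural Science Foundation of China (61305058)
Natural Science Foundation of Jiangsu Province (BK20130471)
China Postdoctoral Science Foundation Special Funding Program (2015T80481)
China Postdoctoral Science Foundation (2013M540404)
Jiangsu Province Postdoctoral Foundation (1401037B)
Jiangsu University of Science and Technology 2015 Talent Introduction Program
Natural Science Foundation of the Jiangsu Higher Education Institutions (18KJB520011)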
Keywords
software defect prediction
data imbalance
relative density
fuzzy weight