
Semi-supervised Ensemble Learning Based Software Defect Prediction (基于半监督集成学习的软件缺陷预测)
Cited by: 8
Abstract: Software defect prediction is often adversely affected by the scarcity of labeled modules and by class imbalance in software defect data. To address these problems, a software defect prediction approach based on semi-supervised ensemble learning is proposed. The approach learns from the large number of available unlabeled modules to build a better classifier, and it combines a series of weak classifiers to reduce the bias that majority-class data introduces into prediction. To account for the cost of prediction risk, a strategy for updating the weight vector of the training sample set is also employed, which lowers the risk of misclassifying defective modules as non-defective ones. Comparative experiments on the NASA MDP datasets show that the proposed approach achieves good prediction performance.
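The abstract names three ingredients: learning from unlabeled modules, an ensemble of weak classifiers, and a sample-weight update that penalizes missed defects. The record does not give the paper's actual algorithm, so the following is only a minimal sketch of how such ingredients are commonly combined, assuming self-training with an AdaBoost-style ensemble of decision stumps. All function names, the `miss_cost` factor, and the `margin` threshold are hypothetical choices made for illustration, not the authors' formulation.

```python
import math

def train_stump(X, y, w):
    """Pick the (error, feature, threshold, polarity) stump with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for pol in (1, -1):
                pred = [pol if x[j] >= t else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def stump_predict(stump, x):
    _, j, t, pol = stump
    return pol if x[j] >= t else -pol

def boost(X, y, rounds=5, miss_cost=2.0):
    """AdaBoost-style ensemble. Labels: +1 defective, -1 non-defective.
    Assumption: a missed defect (y=+1 predicted as -1) gets an extra weight factor."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        new_w = []
        for wi, xi, yi in zip(w, X, y):
            p = stump_predict(stump, xi)
            cost = miss_cost if (yi == 1 and p == -1) else 1.0
            new_w.append(wi * cost * math.exp(-alpha * yi * p))
        total = sum(new_w)
        w = [wi / total for wi in new_w]  # renormalize the weight vector
    return ensemble

def score(ensemble, x):
    return sum(a * stump_predict(s, x) for a, s in ensemble)

def predict(ensemble, x):
    return 1 if score(ensemble, x) >= 0 else -1

def self_train(X_lab, y_lab, X_unlab, iters=3, margin=0.5):
    """Self-training: pseudo-label unlabeled samples the ensemble scores confidently."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(iters):
        model = boost(X, y)
        confident = [x for x in pool if abs(score(model, x)) >= margin]
        if not confident:
            break
        for x in confident:
            X.append(x)
            y.append(predict(model, x))
        pool = [x for x in pool if x not in confident]
    return boost(X, y)

# Toy data: one hypothetical metric per module; larger values are defective.
X_lab = [[1.0], [2.0], [8.0], [9.0]]
y_lab = [-1, -1, 1, 1]
X_unlab = [[1.5], [8.5], [0.5], [9.5]]
model = self_train(X_lab, y_lab, X_unlab)
```

In this sketch, raising `miss_cost` shifts weight toward defective modules that the current weak classifier missed, which is one plausible way to reflect the asymmetric risk the abstract describes. A real study would use the NASA MDP metrics and a tuned base learner rather than toy one-feature stumps.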
Authors: WANG Tiejian (王铁建), WU Fei (吴飞), JING Xiaoyuan (荆晓远) (State Key Laboratory of Software Engineering, School of Computer, Wuhan University, Wuhan 430072; School of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2017, No. 7, pp. 646-652 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 61272273)
Keywords: Software Defect Prediction, Class-Imbalance, Semi-supervised Learning, Ensemble Learning
