
Software defect detection based on multiple random undersampling and the POSS method (cited by: 8)

Random undersampling and POSS method for software defect prediction
Abstract: Class imbalance in software defect data limits classifier performance. To address this, the paper introduces the POSS (Pareto optimization for subset selection) feature selection algorithm and random undersampling into software defect prediction, and uses a support vector machine (SVM) to build the prediction model. The experimental results show that multiple rounds of random undersampling effectively handle the imbalance in software defect data. POSS treats subset selection as a bi-objective optimization problem, which improves classification accuracy. The proposed method outperforms the Relief, Fisher, and MI (mutual information) feature selection algorithms.
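Authors: Fang Hao (方昊), Li Yun (李云)
Source: Journal of Shandong University (Engineering Science), 2017, No. 1, pp. 15-21 (7 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: Natural Science Foundation of Jiangsu Province (BK20131378, BK20140885); Guangxi Colleges and Universities Key Laboratory of Cloud Computing and Complex Systems (15206)
Keywords: software defect prediction; class imbalance; data sampling; feature selection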
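The two steps named in the abstract, repeated random undersampling and POSS-style subset selection, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. The function names `balanced_subsets` and `poss`, the `loss` callback, and all parameter names are assumptions made for this example. The abstract does not say how the undersampled rounds are combined or which evaluation criterion POSS optimizes, so the SVM training step is left out and the loss is supplied by the caller.

```python
import random

def balanced_subsets(labels, rounds, seed=0):
    """Yield `rounds` index lists; each keeps every minority-class sample
    plus an equally sized random draw (without replacement) from the majority."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]  # e.g. defective modules
    neg = [i for i, y in enumerate(labels) if y == 0]  # e.g. clean modules
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    for _ in range(rounds):
        yield sorted(minority + rng.sample(majority, len(minority)))

def poss(n_features, k, loss, iterations, seed=0):
    """Pareto-style search that minimises (loss, subset size) together and
    returns the lowest-loss feature mask with at most k selected features."""
    rng = random.Random(seed)

    def objectives(mask):
        size = sum(mask)
        # Assumption: empty or oversized subsets are treated as infeasible.
        f1 = float("inf") if size == 0 or size >= 2 * k else loss(mask)
        return f1, size

    empty = tuple([0] * n_features)
    population = {empty: objectives(empty)}
    for _ in range(iterations):
        parent = rng.choice(sorted(population))
        # Flip each bit independently with probability 1/n.
        child = tuple(b ^ (rng.random() < 1.0 / n_features) for b in parent)
        if child in population:
            continue
        c = objectives(child)
        # Skip the child if an archived solution strictly dominates it.
        if any(z[0] <= c[0] and z[1] <= c[1] and z != c
               for z in population.values()):
            continue
        # Drop archived solutions that the child weakly dominates.
        population = {m: z for m, z in population.items()
                      if not (c[0] <= z[0] and c[1] <= z[1])}
        population[child] = c
    feasible = [(z[0], m) for m, z in population.items() if 0 < z[1] <= k]
    return min(feasible)[1] if feasible else empty
```

In a full pipeline, each index list from `balanced_subsets` would select one balanced training set, and `loss` would typically be a validation error of the classifier trained on the masked features. How the paper aggregates results across rounds is not stated in the abstract, so that choice is left open here.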