Research on Ensemble Prediction Models for Software Defects (软件缺陷集成预测模型研究) Cited by: 6

Software defect prediction based on classifiers ensemble


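Abstract: Defect prediction models built on a single classifier have hit a performance bottleneck, whereas ensemble classifiers often show a significant performance advantage over single classifiers. Aiming to build efficient ensemble defect prediction models, this paper compares the algorithms and characteristics of seven different types of ensemble classifiers. Experiments on 14 benchmark datasets show that some of the ensemble prediction models outperform a single prediction model based on naive Bayes. Among them, the voting-based ensemble framework achieves the best prediction performance, with a statistically significant advantage, followed by the random forest algorithm. The Stacking ensemble framework also shows strong generalization ability.

Many researchers have advocated software defect prediction using classification algorithms. However, recent literature reports a performance bottleneck when a single classifier is applied. Classifier ensembles, on the other hand, can improve classification performance over a single classifier. This paper presents a comparative study of various ensemble methods from a taxonomic perspective. A series of benchmarking experiments on the public-domain MDP datasets shows that applying classifier ensemble methods to defect prediction can achieve better performance than using a single classifier. In particular, of the seven ensemble methods evaluated in these experiments, voting and random forest clearly outperform the others, and Stacking also shows good generalization ability.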
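The voting-based combination mentioned in the abstract can be illustrated with a minimal sketch. It assumes hard majority voting over binary labels (1 = defective, 0 = clean); the abstract does not state which voting rule the authors used, and the function name `majority_vote` and the sample predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several base classifiers by hard majority vote.

    `predictions` holds one list of labels per base classifier; all lists
    must cover the same modules in the same order.
    """
    combined = []
    for labels in zip(*predictions):
        # Pick the label that most base classifiers assigned to this module.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three base classifiers on five software modules.
base_predictions = [
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1],
]
ensemble = majority_vote(base_predictions)
print(ensemble)
```

Using an odd number of base classifiers avoids ties for binary labels. Averaging predicted probabilities instead of counting labels is a common alternative.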

Authors: LIU Xiaohua (刘小花), WANG Tao (王涛), WU Zhenqiang (吴振强)

Affiliation: School of Computer Science, Shaanxi Normal University

Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2013, No. 6, pp. 1734-1738 (5 pages)

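Funding: General Program of the National Natural Science Foundation of China (61173190); Natural Science Basic Research Plan of Shaanxi Province (2009JM8002); Fundamental Research Funds for the Central Universities (GK201302055)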

Keywords: software defect prediction; classifier ensemble; voting; random forest

Classification number: TP311.5 [Automation and Computer Technology: Computer Software and Theory]


References (17)

1. MENZIES T, MILTON Z, TURHAN B, et al. Defect prediction from static code features: current results, limitations, new approaches[J]. Automated Software Engineering, 2010, 17(5): 375-407.
2. JIANG Y, CUKIC B, MENZIES T, et al. Comparing design and code metrics for software quality prediction[C]//Proc of the 4th International Workshop on Predictor Models in Software Engineering. New York: ACM Press, 2008: 11-18.
3. MENZIES T, TURHAN B, BENER A, et al. Implications of ceiling effects in defect predictors[C]//Proc of ACM International Conference on Predictive Models in Software Engineering. 2008: 47-54.
4. ZHANG H, NELSON A, MENZIES T. On the value of learning from defect dense components for software defect prediction[C]//Proc of ACM International Conference on Predictive Models in Software Engineering. 2010: 1-9.
5. STEFANO C D, FONTANELLA F, FOLINO G, et al. A Bayesian approach for combining ensembles of GP classifiers[C]//Proc of the 10th International Workshop on Multiple Classifier Systems. 2011: 26-35.
6. TOSUN A, TURHAN B, BENER A B. Ensemble of software defect predictors: a case study[C]//Proc of the 2nd International Symposium on Empirical Software Engineering and Measurement. 2008: 318-320.
7. ZHENG J. Cost-sensitive boosting neural networks for software defect prediction[J]. Expert Systems with Applications, 2010, 37(6): 4537-4543.
8. BREIMAN L. Bagging predictors[J]. Machine Learning, 1996, 24(2): 123-140.
9. FREUND Y, SCHAPIRE R. Experiments with a new boosting algorithm[C]//Proc of the 13th International Conference on Machine Learning. 1996: 148-156.
10. DIETTERICH T. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization[J]. Machine Learning, 2000, 40(2): 139-157.




