Boosting-Based k-NN Learning for Software Defect Prediction (基于Boosting的集成k-NN软件缺陷预测方法)

Cited by: 7


Abstract: Software defect prediction is an important way to improve software development quality and testing efficiency. This paper proposes an ensemble k-NN method for software defect prediction based on software metrics. First, a set of base k-NN predictors is trained iteratively on different bootstrap samples of the data. Next, each base predictor evaluates a software module independently, and the individual outputs are fused into a final prediction. To decide whether a new software module is defective, an adaptive method for learning the classification threshold is designed: a module whose ensemble output is greater than the threshold is classified as defective, and otherwise as normal. Experiments on the NASA MDP and PROMISE AR benchmark defect datasets show that the ensemble k-NN predictor performs considerably better than a widely used baseline defect prediction approach, and they also confirm the effectiveness of software metrics in defect prediction.
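The pipeline described in the abstract (bootstrap samples, base k-NN predictors, fused output, learned threshold) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the boosting weights, the fusion rule, or the threshold-learning procedure, so this sketch assumes unweighted averaging of base outputs and a threshold chosen to maximize F1 on the training data. All function names, the toy data, and the two metric columns are hypothetical.

```python
import random
from math import dist

def knn_score(sample, x, k):
    """Fraction of defective labels (1) among the k nearest points in one sample."""
    nearest = sorted(sample, key=lambda row: dist(row[0], x))[:k]
    return sum(label for _, label in nearest) / len(nearest)

def bootstrap_samples(data, n_models, seed=0):
    """Draw one bootstrap sample (with replacement) per base k-NN predictor."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_models)]

def ensemble_score(samples, x, k=3):
    """Fuse base outputs by unweighted averaging (assumption, not from the paper)."""
    return sum(knn_score(s, x, k) for s in samples) / len(samples)

def learn_threshold(samples, data, k=3):
    """Pick the candidate threshold with the best training F1 (assumption)."""
    scored = [(ensemble_score(samples, x, k), y) for x, y in data]
    best_t, best_f1 = 0.5, -1.0
    for t in sorted({s for s, _ in scored}):
        tp = sum(1 for s, y in scored if s > t and y == 1)
        fp = sum(1 for s, y in scored if s > t and y == 0)
        fn = sum(1 for s, y in scored if s <= t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t

def predict(samples, threshold, x, k=3):
    """Return 1 (defective) if the fused score exceeds the threshold, else 0 (normal)."""
    return 1 if ensemble_score(samples, x, k) > threshold else 0

# Hypothetical toy modules: (metric_1, metric_2) -> label, 1 = defective.
data = [
    ((1.0, 2.0), 0), ((2.0, 1.0), 0), ((1.5, 1.5), 0), ((2.0, 2.0), 0), ((1.0, 1.0), 0),
    ((9.0, 10.0), 1), ((10.0, 9.0), 1), ((9.5, 9.5), 1), ((10.0, 10.0), 1), ((9.0, 9.0), 1),
]
samples = bootstrap_samples(data, n_models=15)
threshold = learn_threshold(samples, data)
```

Calling `predict(samples, threshold, (9.6, 9.8))` then classifies a new module with the learned threshold. Scoring the training data to pick the threshold is optimistic, since each point may appear in the bootstrap samples; a held-out split would be a more careful choice.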

Authors: He Liang (何亮), Song Qinbao (宋擒豹), Shen Junyi (沈钧毅)

Affiliation: School of Electronic and Information Engineering, Xi'an Jiaotong University

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 5, pp. 792-802 (11 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 90718024)

Keywords: Software Defect Prediction; k-Nearest Neighbor (k-NN); Software Metric; Ensemble Learning

Classification code: TP311.53 [Automation and Computer Technology: Computer Software and Theory]
