
Improved Semi-supervised Ensemble Learning Approach for Software Defect Prediction (cited by: 5)
Abstract: To address class imbalance and the excess of irrelevant and redundant features in semi-supervised software defect prediction, this paper proposes an improved semi-supervised ensemble method, FeSSTri (semi-supervised software prediction using Feature Selecting and Sample and Tri-training). First, the ADASYN adaptive synthetic oversampling algorithm is applied to the labeled portion of the samples to correct the class imbalance of the data set. Second, a classifier built on the resampled data pre-labels the unlabeled data; the labeled and pre-labeled samples are then combined, and the minimum-redundancy maximum-relevance (mRMR) algorithm performs feature selection on the combined data set, addressing both irrelevant and redundant features. Finally, the semi-supervised ensemble algorithm Tri-training builds the final semi-supervised defect prediction model. The proposed model is validated on the NASA and AEEEM data sets with the F1 value as the evaluation metric. The experimental results show that FeSSTri outperforms the original Tri-training algorithm and achieves better prediction results than classic machine learning methods.
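The abstract names the F1 value as the evaluation metric. The following is a minimal sketch of that metric, assuming binary labels in which 1 marks a defective module and 0 a clean one. The function name `f1_score` and the sample labels are illustrative assumptions and do not come from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (defective) class; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # Precision and recall are both zero (or undefined), so report 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical, imbalanced labels: 3 defective modules out of 10.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
score = f1_score(y_true, y_pred)

# A predictor that always answers "clean" is 70% accurate here but has F1 = 0.
all_clean_score = f1_score(y_true, [0] * len(y_true))
```

Because F1 ignores true negatives, a model that labels every module as clean scores zero even though its accuracy looks acceptable, which is presumably why the metric suits class-imbalanced defect data.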
Authors: ZHOU Jian-han; LI Ying-mei; LI Wen-hao (School of Computer Science and Information Engineering, Harbin Normal University, Harbin 150000, China)
Source: Journal of Chinese Computer Systems (小型微型计算机系统), indexed in CSCD and the Peking University Core Journals list, 2021, No. 10, pp. 2196-2202 (7 pages)
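Funding: Supported by the Natural Science Foundation of Heilongjiang Province (F2017021), the Research Project of the School of Computer Science, Harbin Normal University (JKYKYY202003), and the Graduate Innovation Research Project of Harbin Normal University (HSDSSCX2020-58).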
Keywords: software defect prediction; class imbalance; feature selection; semi-supervised prediction; machine learning
