Abstract
Software defect prediction is an effective way to improve software quality. To address the class imbalance and feature redundancy of software defect data, this paper proposes SSFSAdaBoost (Semi-supervised software defect prediction based on sampling, feature selection and AdaBoost), an improved defect prediction method based on semi-supervised ensemble learning. First, hybrid sampling is applied to the training set. Next, the SMA optimization algorithm performs feature selection on the sampled training set and the test set. Finally, the improved semi-supervised algorithm SUDAdaBoost is used to build the ensemble. Experiments on three public datasets show that the method outperforms the original AdaBoost algorithm and performs well in alleviating the class imbalance problem.
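The abstract does not say which sampling techniques make up the hybrid sampling step, so the following is only a minimal sketch of the general idea, assuming plain random undersampling of the majority class combined with random oversampling (with replacement) of the minority class. The function name `hybrid_sample`, the `target_per_class` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def hybrid_sample(X, y, target_per_class, seed=0):
    """Bring every class to target_per_class rows (illustrative assumption).

    Classes with more rows than the target are randomly undersampled
    without replacement; classes with fewer rows keep all originals and
    are topped up with randomly duplicated rows.
    """
    rng = random.Random(seed)

    # Group feature rows by class label.
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target_per_class:
            # Majority class: keep a random subset.
            chosen = rng.sample(rows, target_per_class)
        else:
            # Minority class: keep everything, then duplicate random rows.
            missing = target_per_class - len(rows)
            chosen = rows + [rng.choice(rows) for _ in range(missing)]
        X_out.extend(chosen)
        y_out.extend([label] * target_per_class)
    return X_out, y_out


# Toy data: 90 non-defective modules (label 0), 10 defective modules (label 1).
X = [[i, i % 7] for i in range(100)]
y = [0] * 90 + [1] * 10

X_bal, y_bal = hybrid_sample(X, y, target_per_class=50)
print(Counter(y_bal))
```

In a pipeline like the one the abstract outlines, a step of this kind would be applied to the training set only, before feature selection and ensemble training.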
Authors
张莹
朱丽娜
ZHANG Ying; ZHU Lina (School of Electronics and Information Engineering, Huaibei Institute of Technology, Huaibei 235000; School of Physics and Electronic Information, Huaibei Normal University, Huaibei 235000; School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003)
Source
Computer & Digital Engineering (《计算机与数字工程》)
2023, No. 10, pp. 2390-2394 (5 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 61562004, 71862003).
Keywords
software defect prediction
semi-supervised learning
ensemble learning
data sampling
feature selection