Abstract
Commonly used software defect prediction models lack interpretability and robustness. To infer and understand the relationships among variables in software defect prediction, this study investigates how Bayesian networks can be applied to the task and builds both a Bayesian network defect prediction model and an ensemble defect prediction model. First, the data are discretized. A Bayesian network structure learning algorithm then determines the network structure and parameters, and Bayesian network inference yields the probability distribution of software defects. Next, the Bayesian network is combined with K-nearest neighbors, decision tree, logistic regression, and other defect predictors through soft voting to form the ensemble model. Finally, experiments are run on six public software defect datasets. The results show that, compared with commonly used ensemble defect prediction models, the Bayesian-network-based ensemble achieves better predictive performance on the F1, Recall, and G-Mean metrics. The work explores a new research direction for software defect prediction from the perspective of causal analysis.
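The soft-voting step and the three reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the per-module defect probabilities, the 0.5 decision threshold, and the helper names `soft_vote` and `evaluate` are all assumptions made for illustration, and the four predictors are represented only by made-up probability outputs rather than trained models.

```python
from math import sqrt


def soft_vote(prob_rows):
    """Average P(defective) across predictors for each module (soft voting)."""
    n_models = len(prob_rows)
    n_samples = len(prob_rows[0])
    return [sum(model[i] for model in prob_rows) / n_models
            for i in range(n_samples)]


def evaluate(y_true, y_pred):
    """Compute Recall, F1, and G-Mean for binary labels (1 = defective)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # G-Mean: geometric mean of recall on the defective and clean classes
    g_mean = sqrt(recall * specificity)
    return {"recall": recall, "f1": f1, "g_mean": g_mean}


# Hypothetical P(defective) outputs of four predictors for five modules
probs = {
    "bayesian_network":    [0.9, 0.2, 0.6, 0.1, 0.7],
    "knn":                 [0.8, 0.4, 0.4, 0.2, 0.4],
    "decision_tree":       [1.0, 0.0, 1.0, 0.0, 0.0],
    "logistic_regression": [0.7, 0.3, 0.4, 0.3, 0.5],
}
y_true = [1, 0, 1, 0, 1]  # hypothetical ground-truth labels

avg = soft_vote(list(probs.values()))
y_pred = [1 if p >= 0.5 else 0 for p in avg]  # assumed 0.5 threshold
metrics = evaluate(y_true, y_pred)
print(y_pred, metrics)
```

Averaging probabilities rather than counting hard votes lets a confident predictor outweigh several uncertain ones, which is the usual motivation for soft voting over majority voting.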
Authors
QIN Yang-yang (秦阳阳)
ZHANG Si-peng (张思鹏)
ZHENG Yue (郑越)
HAN Yang (韩阳)
CHEN Li-fang (陈丽芳)
Affiliations: College of Science, North China University of Science and Technology, Tangshan, Hebei 063210, China; Hebei Provincial Key Laboratory of Data Science and Application, Tangshan, Hebei 063210, China; Discipline Construction Department of North China University of Technology, Tangshan, Hebei 063210, China
Source
Journal of North China University of Science and Technology: Natural Science Edition (《华北理工大学学报(自然科学版)》)
CAS
2024, Issue 3, pp. 96-103 (8 pages)
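Funding
National Natural Science Foundation of China, General Program: Research on an intelligent furnace temperature management and control model based on deep mining of big data from the blast furnace smelting process (Grant No. 52074126).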
Keywords
software defect prediction
Bayesian network
ensemble learning
causal analysis