Abstract
Software defect prediction helps maintain high software quality efficiently. Machine learning techniques for defect prediction have received extensive attention in recent years. In practice, defective software samples are far fewer than non-defective ones, so the data are highly imbalanced. Classifying imbalanced data is an active and difficult research topic in data mining and machine learning. To address the class imbalance problem in software defect prediction, this paper proposes a software defect classification algorithm based on multi-stage processing. Experimental results (F-measure and AUC) on ten National Aeronautics and Space Administration (NASA) software defect datasets verify the superiority and effectiveness of the proposed algorithm.
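The abstract reports F-measure and AUC rather than plain accuracy. The following minimal sketch suggests why that choice fits imbalanced defect data. It is not the paper's multi-stage algorithm, which the abstract does not describe. The toy data (100 modules, 10 of them defective) and all function names are hypothetical and chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the defective class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no defect found: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random defective sample outscores a random
    non-defective one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 10 defective modules followed by 90 clean ones.
y_true = [1] * 10 + [0] * 90

# Trivial classifier: always predicts "no defect".
trivial_pred = [0] * 100
trivial_scores = [0.0] * 100

# Second classifier: finds 6 of the 10 defects and raises 4 false alarms.
useful_pred = [1] * 6 + [0] * 4 + [1] * 4 + [0] * 86
useful_scores = [0.9 if p == 1 else 0.1 for p in useful_pred]

for name, pred, scores in [("trivial", trivial_pred, trivial_scores),
                           ("useful", useful_pred, useful_scores)]:
    print(name, accuracy(y_true, pred), f_measure(y_true, pred),
          auc(y_true, scores))
```

On this toy data the two classifiers have almost the same accuracy, but the trivial one finds no defects at all, which F-measure and AUC expose.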
Source
《工业控制计算机》
2021, Issue 8, pp. 118-119 (2 pages)
Industrial Control Computer