Abstract
To address the class imbalance problem in software defect prediction, a machine learning model, GA-FSVM, is proposed. Redundant features are first removed from the software data sets, and a fuzzy support vector machine (FSVM) is used as the classifier. A fuzzy membership function tailored to software defect prediction is proposed, which lets the classifier both adapt to class-imbalanced data sets and cope with outliers. A genetic algorithm is used to tune the parameters, and the classifier is then trained. Cross-validation results on NASA data sets show that, compared with several common algorithms, the proposed method improves the F-measure on defective samples.
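The abstract does not give the membership function itself, so the following is only a minimal sketch of the general idea, under stated assumptions: each sample's membership decays with its distance from its class centroid (so outliers receive small weights) and is scaled by an imbalance factor (so the minority, defective class keeps larger weights). The function name `fuzzy_memberships`, the parameter `delta`, and the specific formula are illustrative assumptions, not the authors' definition.

```python
import math


def fuzzy_memberships(X, y, delta=1e-6):
    """Return one fuzzy membership in (0, 1] per sample.

    Assumed scheme (NOT taken from the paper): centroid-distance decay
    multiplied by a class-imbalance factor.
    """
    classes = sorted(set(y))
    counts = {c: sum(1 for label in y if label == c) for c in classes}
    smallest = min(counts.values())
    dim = len(X[0])

    # Centroid of each class, computed feature by feature.
    centroids = {
        c: [
            sum(x[j] for x, label in zip(X, y) if label == c) / counts[c]
            for j in range(dim)
        ]
        for c in classes
    }

    # Distance of every sample to the centroid of its own class.
    dists = [math.dist(x, centroids[label]) for x, label in zip(X, y)]

    # Class radius: the largest such distance within each class.
    radius = {
        c: max(d for d, label in zip(dists, y) if label == c) for c in classes
    }

    memberships = []
    for d, label in zip(dists, y):
        # Close to 1 at the centroid, close to 0 for the farthest sample.
        decay = 1.0 - d / (radius[label] + delta)
        # Exactly 1.0 for the smallest class, below 1.0 for larger classes.
        imbalance = smallest / counts[label]
        memberships.append(decay * imbalance)
    return memberships


# Toy data: label 0 = non-defective (majority), label 1 = defective (minority).
X = [[0, 0], [1, 0], [0, 1], [10, 10], [5.0, 5.0], [5.2, 5.0], [5.1, 5.0]]
y = [0, 0, 0, 0, 1, 1, 1]
weights = fuzzy_memberships(X, y)
```

Weights of this kind could then be supplied as per-sample weights when fitting an SVM, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`. The kernel and penalty parameters would be the natural targets for the genetic-algorithm tuning the abstract mentions.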
Authors
程元启
姚淑珍
谭火彬
李丹丹
CHENG Yuan-qi; YAO Shu-zhen; TAN Huo-bin; LI Dan-dan (School of Computer Science and Engineering, Beihang University, Beijing 100191, China; School of Software, Beihang University, Beijing 100191, China)
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2018, Issue 9, pp. 2753-2757 (5 pages)
Keywords
software defect prediction
fuzzy support vector machine
class imbalance problem
genetic algorithm
machine learning