Abstract
To reduce the need for labeled samples in software defect prediction, the problem is modeled as a transfer learning problem in the positive-unlabeled (PU) learning scenario. Samples in the target domain, where defects are to be predicted, are left unlabeled; only some of the positive samples in the cross-project source-domain dataset are labeled. Instance-based transfer learning is performed with the data gravitation method, and probability parameters are estimated on the source defect dataset and the target dataset using Bayesian theory, yielding the software defect prediction algorithm TPAODE. Experimental results show that TPAODE achieves better defect prediction performance than the traditional PU learning methods PNB and PTAN, and that it predicts defects well when only a small number of positive samples in the cross-project defect data are labeled.
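The abstract does not spell out how data gravitation weights the source samples. As a rough illustration only, the sketch below uses one formulation known from cross-project defect prediction (the weighting used by Transfer Naive Bayes), which may differ from what TPAODE does: a source sample's weight is s / (k - s + 1)^2, where k is the number of attributes and s is how many of the sample's attribute values fall inside the range observed in the target data. The function name `gravity_weights` and the toy data are hypothetical.

```python
from typing import List, Sequence


def gravity_weights(
    source: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
) -> List[float]:
    """Weight each source sample by its similarity to the target data.

    Assumption: the weight is s / (k - s + 1)^2, the data-gravitation
    formulation from Transfer Naive Bayes. The abstract does not confirm
    that TPAODE uses exactly this formula.
    """
    k = len(target[0])  # number of attributes
    # Per-attribute value range observed in the (unlabeled) target data.
    lows = [min(row[j] for row in target) for j in range(k)]
    highs = [max(row[j] for row in target) for j in range(k)]

    weights = []
    for row in source:
        # s = number of attributes whose value lies inside the target range.
        s = sum(1 for j in range(k) if lows[j] <= row[j] <= highs[j])
        weights.append(s / (k - s + 1) ** 2)
    return weights


# Toy data: two attributes per sample (hypothetical values).
target_data = [[1.0, 10.0], [3.0, 20.0]]
source_data = [[2.0, 15.0], [0.0, 15.0], [5.0, 30.0]]
weights = gravity_weights(source_data, target_data)
```

Source samples that resemble the target data receive larger weights, so they contribute more when probability parameters are later estimated from weighted counts; a sample with no attribute inside the target range gets weight zero.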
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2018, No. 3, pp. 663-667 (5 pages)
Funding
National Natural Science Foundation of China (61602388)
Fundamental Research Funds for the Central Universities (2452015193, 2452015194, 2452016081)