
A Survey of Semi-supervised Software Defect Mining (半监督软件缺陷挖掘研究综述) Cited by: 6

Software Defect Mining Based on Semi-supervised Learning
Abstract Software quality underpins the safe and reliable operation of computer systems, and software defects are a major cause of poor software quality. Software defect mining, which analyzes and models source code and related data to uncover potential defects in a software system, has therefore drawn wide attention in software quality assurance. Identifying potentially defective modules accurately requires learning from a large number of modules labeled as defective or non-defective. However, such labels are usually obtained through extensive testing or manual code inspection, which consumes substantial testing effort and manpower. In practice only a small number of labels can be collected, which seriously constrains the performance of defect mining. To address this problem, semi-supervised learning has been introduced into software defect mining to improve performance by exploiting the large number of unlabeled modules. This paper reviews the state of research on semi-supervised software defect mining. It first reviews existing studies on software defect mining, then briefly introduces the four major paradigms of semi-supervised learning, and finally systematically summarizes the methods and techniques for semi-supervised software defect mining.
Authors Li Ming (黎铭), Huo Xuan (霍轩)
Source Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2016, No. 1, pp. 56-64 (9 pages)
Funding Supported by the National Natural Science Foundation of China (61422304, 61272217) and the Natural Science Foundation of Jiangsu Province (BK20131278)
Keywords software mining; machine learning; semi-supervised learning; software defect mining

