Clone code detection based on Levenshtein distance of token (cited by: 13)
Abstract: Few tools currently detect Type-3 clone code, and those that do are inefficient. To address this, a token-based method that effectively detects Type-3 clone code is proposed; the same method also detects Type-1 and Type-2 clone code effectively. First, the source code is tokenized into token sequences at a specified code granularity. Second, every fixed-length substring of each token sequence is mapped to an index. Third, based on queries against this mapping information, clone pairs are determined with the Levenshtein (edit) distance algorithm, and clone groups are then built quickly with the disjoint-set (union-find) algorithm. Finally, the clone code information is reported. A prototype tool named FClones was implemented, evaluated with a code mutation-based framework, and compared with two state-of-the-art tools, NiCad and SimCad. The experimental results show that, for all three clone types, FClones achieves a recall of at least 95% and a precision of at least 98%, and that it detects Type-3 clones better than the compared tools.
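The four steps in the abstract (tokenize, index fixed-length token substrings, confirm pairs by edit distance, merge pairs with a disjoint set) can be sketched as below. This is only an illustrative sketch, not the FClones implementation: the regex tokenizer, the tiny keyword set, the normalization of identifiers and numbers to `ID` and `NUM`, the substring length `k=5`, and the threshold `max_ratio=0.3` are all assumptions made for the example.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"if", "else", "for", "while", "return", "int"}  # hypothetical subset


def tokenize(code):
    """Tokenize; map identifiers to 'ID' and numbers to 'NUM' (assumed normalization)."""
    out = []
    for tok in TOKEN_RE.findall(code):
        if tok in KEYWORDS:
            out.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")
        elif tok.isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return out


def levenshtein(a, b):
    """Edit distance between two sequences, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def find(parent, x):
    """Disjoint-set find with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def detect_clones(fragments, k=5, max_ratio=0.3):
    """Return clone groups as sorted lists of fragment indices."""
    tokens = [tokenize(f) for f in fragments]

    # Map every length-k token substring to the fragments that contain it.
    index = defaultdict(set)
    for i, toks in enumerate(tokens):
        for s in range(len(toks) - k + 1):
            index[tuple(toks[s:s + k])].add(i)

    # Fragments sharing at least one substring become candidate pairs.
    candidates = set()
    for ids in index.values():
        ids = sorted(ids)
        candidates.update((a, b) for n, a in enumerate(ids) for b in ids[n + 1:])

    # Confirm candidates by token-level edit distance, then union them.
    parent = list(range(len(fragments)))
    for a, b in candidates:
        limit = max_ratio * max(len(tokens[a]), len(tokens[b]))
        if levenshtein(tokens[a], tokens[b]) <= limit:
            parent[find(parent, a)] = find(parent, b)

    groups = defaultdict(list)
    for i in range(len(fragments)):
        groups[find(parent, i)].append(i)
    return sorted(g for g in groups.values() if len(g) > 1)
```

In this sketch the substring index only narrows down which fragment pairs are worth comparing; the edit distance decides whether a pair counts as a clone, so a renamed copy (Type-2) and a copy with an inserted statement (Type-3) can land in the same group as the original.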
Source: Journal of Computer Applications (《计算机应用》), CSCD, PKU Core, 2015, No. 12, pp. 3536-3543 (8 pages)
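Funding: National Natural Science Foundation of China (61363017, 61462071); Natural Science Foundation of Inner Mongolia (2014MS0613); Inner Mongolia Autonomous Region Master's Student Research Innovation Fund (S20141013524); Inner Mongolia Normal University Graduate Research Innovation Fund (CXJJS14077)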
Keywords: clone code; clone detection; Levenshtein distance; Type-3; token