
Source Code Similarity Detection Technology Based on Multiple Eigenvalues
Abstract: In software development, developers commonly meet requirements through copy-and-paste or modular development. Both approaches improve development efficiency, but they also leave large amounts of identical or similar code in a software system. Such code makes software maintenance considerably harder and is the most common target of refactoring. Source code similarity measurement uses a detection method to analyze how similar pieces of program source code are, and it is applied in code plagiarism detection, code clone detection, software intellectual property protection, code reuse, and other areas. To improve the accuracy of code similarity measurement, this paper proposes a source code similarity detection technique based on multiple eigenvalues. Feature extraction methods are constructed for source code comments, signatures, code text statements, and code structure, and a measurement model for source code similarity detection is given. In comparison experiments against Moss, an authoritative code similarity detection system, the results show that the proposed method detects similar code more accurately.
Authors: ZHAN Jia-jun; ZHAO Feng-yu; AI Jun (School of Optoelectronic Information and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2021, No. 1, pp. 103-109 (7 pages)
Funding: National Natural Science Foundation of China (61803264)
Keywords: code similarity; code plagiarism; abstract syntax tree; code feature extraction; cosine similarity
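
The abstract names four feature sources (comments, signatures, text statements, structure) and lists cosine similarity among the keywords, but this record does not give the extraction rules or the measurement model. The sketch below is therefore only an illustration of the general idea, not the authors' method: each feature is assumed to arrive as a plain string, `tokenize`, `combined_similarity`, and the `WEIGHTS` values are hypothetical, and the per-feature scores are combined by a simple weighted average.

```python
import math
import re
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as sharing nothing
    return dot / (norm_a * norm_b)


def tokenize(text: str) -> Counter:
    """Hypothetical tokenizer: identifiers, numbers, and single symbols."""
    return Counter(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", text))


# Hypothetical weights; the paper's actual model is not given in this record.
WEIGHTS = {"comment": 0.1, "signature": 0.2, "statement": 0.3, "structure": 0.4}


def combined_similarity(features_a: dict, features_b: dict, weights=WEIGHTS) -> float:
    """Weighted average of per-feature cosine similarities.

    Features that are empty on both sides are skipped so that, for example,
    two uncommented files are not penalized for having no comments.
    """
    score = 0.0
    used_weight = 0.0
    for name, weight in weights.items():
        vec_a = tokenize(features_a.get(name, ""))
        vec_b = tokenize(features_b.get(name, ""))
        if not vec_a and not vec_b:
            continue
        score += weight * cosine_similarity(vec_a, vec_b)
        used_weight += weight
    return score / used_weight if used_weight else 0.0


if __name__ == "__main__":
    original = {
        "comment": "sum the items of a list",
        "signature": "int sum(int[] values)",
        "statement": "total = total + values[i];",
        # Assumed encoding: AST node types written out as a flat string.
        "structure": "Method For Assign BinaryOp Return",
    }
    renamed = dict(original, statement="acc = acc + data[i];")
    print(round(combined_similarity(original, renamed), 3))
```

In this example only identifiers in one statement are renamed, so the comment, signature, and structure features still match exactly and the combined score stays high. That is the motivation for mixing several features rather than comparing raw text alone.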
