
Research on AST-Based Similarity Measurement for Program Code (cited by: 6)
Abstract: Existing similarity measures for program code often ignore program semantics and can therefore produce invalid results. To address this, the paper proposes a program code similarity measure based on the abstract syntax tree (AST). Preprocessing first removes information that is redundant for AST generation; lexical and syntactic analysis then produces the corresponding AST. Next, using an adaptively selected threshold, the method computes the similarity of the ASTs from the program attribute sequences and method sequences obtained by traversing them. Finally, it decides whether plagiarism has occurred and generates a similarity detection report. Experimental results show that the method effectively detects several kinds of plagiarism in Java program code.
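Source: Journal of Jilin University (Information Science Edition), CAS, 2015, No. 1, pp. 99-104 (6 pages)
Funding: Natural Science Foundation of the Jilin Provincial Department of Science and Technology (20130101060JC); "Twelfth Five-Year Plan" Science and Technology Research Fund of the Jilin Provincial Department of Education (2014132, 2014125)
Keywords: similarity measurement; abstract syntax tree (AST); similarity; adaptive threshold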
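
The pipeline outlined in the abstract (parse to an AST, traverse it into a sequence, compare sequences, apply a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Python's standard `ast` module as a stand-in for a Java parser, uses `difflib.SequenceMatcher` as a generic sequence-similarity score, and the `adaptive_threshold` rule (batch mean plus `k` standard deviations) is a purely hypothetical placeholder, since the abstract does not say how the threshold is chosen.

```python
import ast
from difflib import SequenceMatcher
from statistics import mean, pstdev


def node_sequence(source):
    """Return AST node-type names in pre-order.

    Comments and layout are discarded by the parser, and identifier
    names are ignored because only node types are recorded.
    """
    def visit(node):
        # Skip Load/Store/Del context markers; they carry no structure.
        if not isinstance(node, ast.expr_context):
            yield type(node).__name__
        for child in ast.iter_child_nodes(node):
            yield from visit(child)

    return list(visit(ast.parse(source)))


def similarity(source_a, source_b):
    """Score two programs in [0, 1] by comparing their node sequences."""
    seq_a = node_sequence(source_a)
    seq_b = node_sequence(source_b)
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def adaptive_threshold(scores, k=1.0):
    """Hypothetical rule (an assumption, not from the paper):
    flag pairs scoring above the batch mean plus k standard deviations."""
    return min(1.0, mean(scores) + k * pstdev(scores))


if __name__ == "__main__":
    original = "def f(x):\n    return x + 1\n"
    renamed = "def g(y):\n    # identifiers renamed, comment added\n    return y + 1\n"
    unrelated = "for i in range(3):\n    print(i)\n"
    print(similarity(original, renamed))
    print(similarity(original, unrelated))
```

Because only node types are compared, renaming identifiers or adding comments leaves the score unchanged, which is the kind of disguise a purely textual comparison tends to miss.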


