
Clone Detection Algorithm for Binary Executable Code with Suffix Tree

Cited by: 2
Abstract Detecting code clones is a key problem in software maintenance and in software infringement disputes. In disputes over commercial software, clone detection based on source-code comparison is often unusable because the source is withheld for reasons such as trade secrecy. For this scenario, in which source code is unavailable, the paper proposes a clone detection method that analyzes binary executable files. First, the instruction type sequence of each binary executable object file is obtained through decompilation and instruction type abstraction. Next, a suffix tree is built over the instruction type sequences, and its properties are used to extract function-level clone information between instruction sequences; eliminating gravel instructions further improves detection performance. Finally, targeting the MIPS32 instruction set, the Linux kernel and obfuscated code serve as binary clone test samples for clone levels 0 to 2 and levels 1 to 4, respectively, and the method is compared against source-code clone detection tools. The results show that the proposed algorithm can perform fine-grained clone analysis even without source code and detects clones at all levels well.
Authors ZHANG Ling-hao (张凌浩); GUI Sheng-lin (桂盛霖); MU Feng-jun (穆逢君); WANG Sheng (王胜) (State Grid Sichuan Electric Power Research Institute, Chengdu 610000, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; The 30th Institute of China Electronics Technology Group Corporation, Chengdu 610041, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2019, No. 10, pp. 141-147 (7 pages)
Funding Supported by the National Natural Science Foundation of China (61401067) and the Science and Technology Project of State Grid Sichuan Electric Power Company (521997170001P, 521997170017)
Keywords Code clone; Binary executable file; Suffix tree; Performance optimization
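
The pipeline summarized in the abstract (instruction type abstraction, removal of gravel instructions, then suffix-based matching between functions) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The mnemonic-to-type table `TYPE_OF`, the choice of `nop` as the only gravel instruction, and the function names are all hypothetical. For brevity it also swaps in a naive, uncompressed suffix trie (quadratic construction) for a true suffix tree.

```python
from typing import Dict, List

# Hypothetical mapping from MIPS32 mnemonics to coarse instruction types.
# The paper's actual abstraction table is not given in the abstract.
TYPE_OF: Dict[str, str] = {
    "add": "ARITH", "addu": "ARITH", "addiu": "ARITH", "sub": "ARITH", "subu": "ARITH",
    "and": "LOGIC", "or": "LOGIC", "xor": "LOGIC",
    "lw": "LOAD", "lb": "LOAD", "sw": "STORE", "sb": "STORE",
    "beq": "BRANCH", "bne": "BRANCH", "j": "JUMP", "jal": "JUMP", "jr": "JUMP",
    "nop": "NOP",
}

# Assumption for illustration only: treat NOP as the sole "gravel" type.
GRAVEL = {"NOP"}


def abstract(mnemonics: List[str]) -> List[str]:
    """Map mnemonics to instruction types and drop gravel types."""
    types = (TYPE_OF.get(m, "OTHER") for m in mnemonics)
    return [t for t in types if t not in GRAVEL]


def build_suffix_trie(seq: List[str]) -> dict:
    """Insert every suffix of seq into a nested-dict trie (naive, O(n^2))."""
    root: dict = {}
    for i in range(len(seq)):
        node = root
        for tok in seq[i:]:
            node = node.setdefault(tok, {})
    return root


def longest_common_run(a: List[str], b: List[str]) -> List[str]:
    """Return the longest contiguous run of b that also occurs in a."""
    trie = build_suffix_trie(a)
    best_len, best_start = 0, 0
    for i in range(len(b)):
        node, k = trie, 0
        # Walk down the trie for as long as b keeps matching a suffix of a.
        while i + k < len(b) and b[i + k] in node:
            node = node[b[i + k]]
            k += 1
        if k > best_len:
            best_len, best_start = k, i
    return b[best_start:best_start + best_len]


# Two made-up function bodies, given as mnemonic lists.
f1 = abstract(["addiu", "sw", "lw", "addu", "nop", "sw", "jr"])
f2 = abstract(["lw", "subu", "sw", "nop", "jr", "beq"])
clone = longest_common_run(f1, f2)
print(clone)
```

In this toy input the two functions use different mnemonics (`addu` versus `subu`) and place `nop` differently, yet they share a run of four instruction types once the sequences are abstracted and the gravel type is dropped. A production tool would use a linear-time suffix tree construction and a minimum clone length threshold.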
