Abstract
During software development, the functionality to be built often already exists on the market, or something similar does. To save development cost, programmers commonly turn to code reuse to solve a project's "cold start" problem, which reduces the time and resources that development requires. Code reuse has thus gradually become a development model accepted by the industry. However, it still brings many problems and security risks, such as malicious code and missing license authorization, so code security monitoring is an important means of secure software development. Building on existing reused-code monitoring, this paper designs an accurate weight feature extraction algorithm from the perspective of semantic features between pieces of code, and implements binary code reuse detection on this basis. The experimental results show that the proposed binary industrial software traceability method can perform code reuse detection and achieves good accuracy at both the file level and the function level.
Authors
付修锋
贾张涛
杨铁湃
安恒
金玉川
耿宏伟
FU Xiufeng; JIA Zhangtao; YANG Tiepai; AN Heng; JIN Yuchuan; GENG Hongwei (Beijing Institute of Computer Technology and Application, Beijing 100854, China)
Source
《计算机应用文摘》
2022, No. 13, pp. 89-92 (4 pages)
Chinese Journal of Computer Application
Keywords
open source software
program comparison
code traceability
NLP
code reuse