Abstract
Binary code similarity analysis evaluates how similar pieces of binary code are in order to infer whether they share a common origin (homology). It is widely used in intellectual property protection, vulnerability search, patch analysis, malware detection, and related fields. Machine-learning-based binary code similarity analysis offers high accuracy, low algorithmic complexity, and good scalability, and has become a research hotspot in this field. This paper surveys the machine-learning-based binary code similarity analysis methods proposed in recent years from two perspectives, features and models, traces how the technology has developed over that period, and discusses the directions in which the field is heading.
Authors
HAN Ye (韩烨)
SUN Zhi (孙治)
ZHAO Tong (赵童)
WANG Bingwen (王炳文)
Affiliations: China Electronic Technology Corporation Research Institute Co., Ltd., Baoding, Hebei 071800, China; China Electronic Technology Cyber Security Co., Ltd., Chengdu, Sichuan 610041, China; No. 30 Institute of CETC, Chengdu, Sichuan 610041, China
Source
Communications Technology (《通信技术》)
2022, No. 9, pp. 1105-1111 (7 pages)
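Funding
Science and Technology Innovation 2030 Major Project on "New Generation Artificial Intelligence" (2020AAA0107804)
Key Research and Development Program of the Ministry of Science and Technology of China (2019YFB2101701)
Sichuan Province Major Science and Technology Special Project (2022ZDZX0006, 2017GZDZX0002)
Sichuan Province Science Fund for Distinguished Young Scholars (2019JDJQ0058)
Sichuan Province Youth Science and Technology Innovation Research Team (2020JDTD0034)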
Keywords
binary code
similarity assessment
machine learning
software supply chain