Snippet code similarity model with integrated knowledge
Abstract Binary code snippets have short instruction sequences and simple control flow graphs, so the limited semantic information they carry reduces the accuracy of binary code similarity comparison. To address this, a binary snippet code similarity model (BSM) is proposed that integrates knowledge representation learning with code representation learning. The function knowledge and the function code of a snippet are extracted separately. The knowledge embedding is obtained with an attention mechanism and a bidirectional long short-term memory (BiLSTM) network, and the function embedding is obtained with a sequence learning model or a graph neural network. The knowledge embedding and the function embedding are then fused to form the vector representation of the snippet. Experimental results show that BSM outperforms the other compared models on the cross-platform comparison task, indicating that the model improves the accuracy of similarity comparison for binary code snippets.
Authors XIA Bing (夏冰); ZHOU Xin (周鑫); PANG Jian-min (庞建民); YUE Feng (岳峰); SHAN Zheng (单征) (School of Cybersecurity, Information Engineering University, Zhengzhou 450001, China; Frontier Information Technology Research Institute, Zhongyuan University of Technology, Zhengzhou 450007, China; Songshan Laboratory, Information Engineering University, Zhengzhou 450007, China)
Source Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2023, No. 8, pp. 2360-2366 (7 pages)
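Funding National Natural Science Foundation of China (61802435, 61802433); Key Scientific Research Fund Project of Higher Education Institutions of Henan Province (22B520054).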
Keywords binary code; cross-platform; snippet code similarity; neural network; natural language processing; knowledge representation learning; code representation learning
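
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the knowledge embedding and the function embedding are fused or which similarity measure is used, so everything below is an assumption made for illustration: fusion is plain concatenation, similarity is cosine similarity, and the attention step is a simple softmax-weighted average over token vectors (the BiLSTM and the sequence or graph encoder are not modeled). The names `attention_pool`, `fuse`, and `cosine_similarity` are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence

Vector = List[float]


def attention_pool(token_vectors: Sequence[Vector], query: Vector) -> Vector:
    """Softmax-weighted average of token vectors (assumed attention form).

    Each weight is softmax(dot(token, query)); this stands in for the
    attention mechanism only and is not the paper's actual layer.
    """
    scores = [sum(t * q for t, q in zip(tok, query)) for tok in token_vectors]
    max_score = max(scores)  # subtract the max for numerical stability
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    dim = len(token_vectors[0])
    return [
        sum(w * tok[i] for w, tok in zip(weights, token_vectors))
        for i in range(dim)
    ]


def fuse(knowledge_emb: Vector, function_emb: Vector) -> Vector:
    """Fuse the two embeddings by concatenation (an assumption)."""
    return list(knowledge_emb) + list(function_emb)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy values only; real embeddings would come from trained encoders.
    query = [1.0, 0.0]
    knowledge_a = attention_pool([[0.9, 0.1], [0.2, 0.8]], query)
    knowledge_b = attention_pool([[0.8, 0.2], [0.1, 0.9]], query)
    snippet_a = fuse(knowledge_a, [0.5, 0.4, 0.1])
    snippet_b = fuse(knowledge_b, [0.6, 0.3, 0.1])
    print(cosine_similarity(snippet_a, snippet_b))
```

Concatenation keeps the two information sources in separate dimensions, so a short snippet whose code embedding is weak can still be distinguished by its knowledge embedding. Other fusion schemes (weighted sum, gating) would fit the abstract's description equally well.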