
Code vulnerability detection model based on sliding window and hash function

Cited by: 6
Abstract: Traditional vulnerability detection and classification requires manually defined features, similarity-matching algorithms cannot detect non-clone vulnerabilities, and existing deep learning methods suffer from excessively large feature dimensions and consider only function calls. To address these problems, this paper proposes a deep learning method that combines a sliding window with a hash function to perform static vulnerability detection and classification on source code. First, the method bodies are extracted from the source code to form positive and negative sample sets; an abstract syntax tree (AST) is built for each sample, programmer-defined variable and method names are replaced according to the node types in the tree, and the AST is serialized by preorder traversal. Next, the information in each AST node is tokenized and every word is assigned a unique node number. The tree nodes are then split further to form a word sequence, and a vulnerability detection classification model is trained on the basis of the sliding window and the hash function. Finally, experiments are conducted on two vulnerability types from the SARD dataset, CWE-190 (integer overflow) and CWE-191 (integer underflow). The model achieves a classification accuracy and recall of 97.4% and 94.2% on CWE-190 and 97.6% and 95.1% on CWE-191, respectively. The results show that the proposed method can detect the types of security vulnerabilities in code and outperforms existing methods in classification accuracy and recall.
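The preprocessing steps named in the abstract (name replacement by node type, preorder serialization, then sliding-window hashing) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: Python's built-in `ast` module stands in for the paper's parser, the placeholders `VAR` and `FUNC`, the window size of 3, the bucket count, and the use of CRC32 as the hash function are all hypothetical choices. The deep learning classifier that would consume the feature vector is not shown.

```python
import ast
import zlib
from typing import List


def serialize_preorder(source: str) -> List[str]:
    """Serialize the AST of `source` in preorder, replacing
    user-defined names with placeholders chosen by node type.
    (Placeholder names VAR/FUNC are assumptions of this sketch.)"""
    tokens: List[str] = []

    def visit(node: ast.AST) -> None:
        # Preorder: emit the node itself before its children.
        tokens.append(type(node).__name__)
        if isinstance(node, (ast.Name, ast.arg)):
            tokens.append("VAR")    # hides the programmer-chosen variable name
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            tokens.append("FUNC")   # hides the programmer-chosen function name
        elif isinstance(node, ast.Constant):
            tokens.append(type(node.value).__name__)  # keep only the literal's type
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def window_hash_features(tokens: List[str], window: int = 3,
                         buckets: int = 64) -> List[int]:
    """Slide a fixed-size window over the token sequence and hash each
    window into one of `buckets` slots, so the feature dimension stays
    fixed no matter how large the vocabulary is."""
    vec = [0] * buckets
    for i in range(len(tokens) - window + 1):
        gram = " ".join(tokens[i:i + window])
        # CRC32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(gram.encode("utf-8")) % buckets] += 1
    return vec


if __name__ == "__main__":
    sample = "def add(x, y):\n    return x + y\n"
    toks = serialize_preorder(sample)
    print(toks)
    print(window_hash_features(toks, window=3, buckets=16))
```

Because names are replaced before serialization, two functions that differ only in identifier names yield the same token sequence, and hashing the windows into a fixed number of buckets is one plausible way to keep the feature dimension small.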
Authors: Xu Jian; Chen Pinghua; Xiong Jianbin (School of Computer, Guangdong University of Technology, Guangzhou 510006, China; School of Automation, Guangdong Polytechnic Normal University, Guangzhou 510665, China)
Source: Application Research of Computers (《计算机应用研究》), CSCD, PKU Core Journal, 2021, No. 8, pp. 2394-2400 (7 pages)
Funding: Guangdong Provincial Science and Technology Plan Project (2020B1010010010, 2019B101001021); Guangdong Provincial Natural Science Foundation (2019A1515010700).
Keywords: static code vulnerability detection; deep learning; sliding window; hash function; classification model
