
Function Level Code Vulnerability Detection Method of Graph Neural Network Based on Extended AST (Cited by: 3)
Abstract: Software vulnerabilities are increasing year by year, and the resulting security problems are becoming more serious. Detecting vulnerabilities in source code at the delivery stage of a software project can effectively prevent security flaws from surfacing later at run time, and such detection depends on an effective code representation. Traditional representations based on software metrics correlate only weakly with vulnerabilities, so they struggle to capture vulnerability information. In recent years machine learning has offered a new approach to intelligent vulnerability discovery, but it can also miss key code features. To address these problems, the paper adds control-dependence, data-dependence, and statement-sequence edges (called control flow, data flow, and next token edges in the paper's English abstract) to the traditional abstract syntax tree (AST), producing an extended abstract syntax tree (EXAST) graph that represents the source code in a way that better captures its structural information. A word-embedding algorithm (Word2Vec) initializes the code information as numerical vectors that a model can recognize and learn from. A gated recurrent unit (GRU) is introduced into a traditional graph neural network (GNN) to build the graph recognition model; this alleviates vanishing gradients and strengthens the propagation of long-range information through the graph, which reinforces the temporal relationships of code execution and improves detection accuracy. Finally, the model is trained and compared with other methods on the public SARD dataset, achieving function-level code vulnerability detection. Compared with traditional vulnerability detection methods, accuracy and F1 score improve by up to 32.54% and 44.99, respectively, which confirms the effectiveness of the proposed method for code vulnerability detection.
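The abstract does not give the model's update equations, so the following is only a minimal sketch of a standard GRU-gated message-passing step over a graph with typed edges, not the authors' implementation. Everything named here is an assumption made for illustration: the `GatedGraphLayer` class, the four edge-type labels, the toy four-node graph, the random (untrained) weights, and the random initial vectors that stand in for Word2Vec embeddings.

```python
import math
import random

# Edge types named after the abstract: plain AST edges plus the three added kinds.
EDGE_TYPES = ("ast", "control", "data", "next")


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def rand_matrix(rng, dim):
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]


class GatedGraphLayer:
    """Hypothetical GRU-gated message-passing layer (untrained, random weights)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # One message matrix per edge type.
        self.w_edge = {t: rand_matrix(rng, dim) for t in EDGE_TYPES}
        # GRU parameters: update gate (z), reset gate (r), candidate state (h).
        self.wz, self.uz = rand_matrix(rng, dim), rand_matrix(rng, dim)
        self.wr, self.ur = rand_matrix(rng, dim), rand_matrix(rng, dim)
        self.wh, self.uh = rand_matrix(rng, dim), rand_matrix(rng, dim)

    def step(self, states, edges):
        """Run one propagation step; edges are (src, dst, edge_type) triples."""
        # 1. Sum the type-specific messages arriving at each node.
        msgs = [[0.0] * self.dim for _ in states]
        for src, dst, etype in edges:
            m = matvec(self.w_edge[etype], states[src])
            msgs[dst] = [a + b for a, b in zip(msgs[dst], m)]
        # 2. A GRU cell merges each node's message with its previous state.
        new_states = []
        for h, a in zip(states, msgs):
            z = [sigmoid(x + y) for x, y in zip(matvec(self.wz, a), matvec(self.uz, h))]
            r = [sigmoid(x + y) for x, y in zip(matvec(self.wr, a), matvec(self.ur, h))]
            rh = [ri * hi for ri, hi in zip(r, h)]
            cand = [math.tanh(x + y) for x, y in zip(matvec(self.wh, a), matvec(self.uh, rh))]
            new_states.append([(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)])
        return new_states

    def propagate(self, states, edges, steps=3):
        for _ in range(steps):
            states = self.step(states, edges)
        return states


# Toy graph for a made-up function body:
# 0 = function root, 1 = `n = read()`, 2 = `if n > 0`, 3 = `buf[n] = 0`
EDGES = [
    (0, 1, "ast"), (0, 2, "ast"), (2, 3, "ast"),   # syntax tree
    (1, 2, "next"),                                  # statement sequence
    (2, 3, "control"),                               # control dependence
    (1, 2, "data"), (1, 3, "data"),                  # data dependence on `n`
]


def initial_states(num_nodes, dim, seed=1):
    """Random vectors standing in for Word2Vec node embeddings."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_nodes)]


layer = GatedGraphLayer(dim=4)
h0 = initial_states(num_nodes=4, dim=4)
h3 = layer.propagate(h0, EDGES, steps=3)
# Mean-pool the node states into one vector for the whole function.
graph_vector = [sum(col) / len(h3) for col in zip(*h3)]
```

In a trained model, `graph_vector` would feed a classifier that labels the function as vulnerable or not. The gate `z` decides how much of a node's previous state survives each step, which is the usual mechanism by which a GRU keeps information alive across several propagation rounds.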
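Authors: GU Shouke (顾守珂); CHEN Wen (陈文) (School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China)
Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2023, No. 6, pp. 283-290 (8 pages)
Funding: National Key Research and Development Program of China (020YFB1805405, 2019QY0800); National Natural Science Foundation of China (U1736212, 61872255, U19A2068); Key Laboratory of Pattern Recognition and Intelligent Information Processing, Institutions of Higher Education of Sichuan Province (MSSB-2020-01).
Keywords: Vulnerability mining; Graph neural network; Deep learning; Abstract syntax tree; Gated recurrent unit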
