Abstract
Most neural-network-based approaches to vulnerability detection follow traditional natural language processing: they treat source code as sequence samples and ignore its structural features, so vulnerabilities may be missed. This paper proposes a code vulnerability detection method based on graph neural networks that performs function-level detection using control flow graph features of an intermediate language. First, the source code is compiled into an intermediate representation, from which the control flow graph carrying structural information is extracted; at the same time, a word-vector embedding algorithm initializes the basic block vectors to capture code semantics. The two are then combined into graph-structured samples, and a multilayer graph neural network is trained and tested on the features of this graph-structured data. The method was evaluated on test data generated from an open-source vulnerability sample dataset, and the results show that it effectively improves vulnerability detection capability.
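The pipeline described in the abstract (control flow graph plus per-block vectors, fed to a multilayer graph neural network that yields a function-level result) can be illustrated with a small sketch. This is not the authors' implementation: the toy graph, the random block vectors standing in for word-vector embeddings, the mean aggregation over predecessors, the two-layer depth, and all names (`build_neighbors`, `gnn_layer`, `readout_score`) are assumptions chosen for illustration, and the weights are untrained.

```python
import math
import random


def build_neighbors(num_blocks, edges):
    """For each basic block, list itself plus its CFG predecessors."""
    neighbors = [[i] for i in range(num_blocks)]  # self-loop keeps own features
    for src, dst in edges:
        neighbors[dst].append(src)
    return neighbors


def gnn_layer(features, neighbors, weight):
    """One message-passing step: mean over neighbors, linear map, ReLU."""
    in_dim, out_dim = len(weight), len(weight[0])
    out = []
    for nbrs in neighbors:
        mean = [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(in_dim)]
        out.append([
            max(0.0, sum(mean[k] * weight[k][h] for k in range(in_dim)))
            for h in range(out_dim)
        ])
    return out


def readout_score(features, readout_weights):
    """Mean-pool block vectors into one graph vector, then apply a sigmoid."""
    dim = len(features[0])
    pooled = [sum(f[k] for f in features) / len(features) for k in range(dim)]
    logit = sum(p * w for p, w in zip(pooled, readout_weights))
    return 1.0 / (1.0 + math.exp(-logit))


random.seed(0)
EMBED_DIM, HIDDEN_DIM, NUM_BLOCKS = 4, 3, 4

# Hypothetical CFG of one function: entry block 0 branches to 1 and 2, both join at 3.
cfg_edges = [(0, 1), (0, 2), (1, 3), (2, 3)]

# Stand-in for embedding-initialized basic block vectors (random here, an assumption).
block_vectors = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(NUM_BLOCKS)]

# Untrained weights; a real model would learn these from labeled samples.
w1 = [[random.uniform(-1, 1) for _ in range(HIDDEN_DIM)] for _ in range(EMBED_DIM)]
w2 = [[random.uniform(-1, 1) for _ in range(HIDDEN_DIM)] for _ in range(HIDDEN_DIM)]
readout_w = [random.uniform(-1, 1) for _ in range(HIDDEN_DIM)]

neighbors = build_neighbors(NUM_BLOCKS, cfg_edges)
hidden = gnn_layer(gnn_layer(block_vectors, neighbors, w1), neighbors, w2)
score = readout_score(hidden, readout_w)
print(f"function-level score (untrained): {score:.3f}")
```

With two layers, each block's vector reflects information from blocks up to two edges upstream in the graph, which is the kind of structural context a purely sequential model does not see directly.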
Authors
CHEN Hao (陈皓)
YI Ping (易平)
School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Source
Chinese Journal of Network and Information Security (《网络与信息安全学报》)
2021, Issue 3, pp. 37-45 (9 pages)
Funding
National Key Research and Development Program of China (2019YFB1405000, 2017YFB0802900).
Keywords
vulnerability detection
graph neural network
control flow graph
intermediate representation