Vulnerability Similarity Algorithm Evaluation Based on NLP and Feature Fusion (cited by: 2)
Abstract: Research on vulnerability similarity helps security researchers find solutions to new vulnerabilities from information about historical ones. Little work has been done on vulnerability similarity so far, and model selection lacks support from objective experimental data. This paper combines multiple word embedding techniques with deep learning autoencoders to compute semantic similarity from vulnerability description text. It also uses multi-dimensional feature data extracted from public databases such as NVD to compute feature similarity, and designs a dual-angle vulnerability similarity measurement algorithm and evaluation scheme based on NLP and feature fusion. Experiments compare the model combinations in terms of numerical distribution, similarity discrimination, and accuracy; the best combination reaches an F1 score of up to 0.927 in vulnerability similarity determination.
Authors: JIA Fan; KANG Shuya; JIANG Weiqiang; WANG Guangtao (School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China; Information Security Center, China Mobile Group Co., Ltd., Beijing 100053, China)
Source: Netinfo Security (《信息网络安全》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 1, pp. 18-27 (10 pages)
Funding: Ministry of Education-China Mobile Research Fund [MCM20200106].
Keywords: natural language processing; deep learning; vulnerability similarity; word embedding
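
The abstract describes scoring a pair of vulnerabilities from two angles (semantic similarity of the description text and similarity of structured features) and then judging the result by F1 score. The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the description vectors stand in for the output of a word embedding plus autoencoder, cosine similarity is used for the text angle, a simple matching ratio is used for the feature angle, and the weight `alpha` and the weighted-sum fusion are hypothetical, since the abstract does not state how the two scores are combined.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def feature_similarity(feat_a, feat_b):
    """Fraction of shared feature keys whose values match (assumed measure)."""
    shared = set(feat_a) & set(feat_b)
    if not shared:
        return 0.0
    matches = sum(1 for key in shared if feat_a[key] == feat_b[key])
    return matches / len(shared)


def fused_similarity(vec_a, vec_b, feat_a, feat_b, alpha=0.5):
    """Weighted sum of text and feature similarity; alpha is a hypothetical weight."""
    text_sim = cosine_similarity(vec_a, vec_b)
    feat_sim = feature_similarity(feat_a, feat_b)
    return alpha * text_sim + (1 - alpha) * feat_sim


def f1_score(y_true, y_pred):
    """F1 for binary labels (1 = similar pair, 0 = dissimilar pair)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy inputs: the vectors and feature names below are made up for illustration.
vec_a, vec_b = [1.0, 0.0], [1.0, 0.0]
feat_a = {"cwe": "CWE-79", "attack_vector": "NETWORK"}
feat_b = {"cwe": "CWE-79", "attack_vector": "LOCAL"}
score = fused_similarity(vec_a, vec_b, feat_a, feat_b)  # 0.5 * 1.0 + 0.5 * 0.5 = 0.75
is_similar = score >= 0.7  # hypothetical decision threshold
```

In a sketch like this, the threshold and `alpha` would be tuned on labeled vulnerability pairs, and `f1_score` would then be computed over the thresholded predictions to compare model combinations.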