
Automatic Vulnerability Assessment Method Based on Char-Word Feature Fusion and BO-LightGBM
Abstract: To address the lack of timely, accurate analysis and automatic assessment and classification of previously unknown software vulnerabilities, this paper proposes an automatic vulnerability assessment method that combines char-word feature fusion with Bayesian-optimized LightGBM (BO-LightGBM). First, to reduce the impact of new terms that appear in descriptions of unknown vulnerabilities, character-level and word-level features are extracted from the vulnerability descriptions and fused. To prevent temporal information leakage, the data are ordered by year, and time-based cross-validation is used to select a suitable dataset split. Second, taking advantage of LightGBM's ability to identify the most informative features from feature statistics, the algorithm is used to classify seven vulnerability characteristics, such as confidentiality and integrity. To further improve accuracy, a Bayesian optimizer tunes eight LightGBM hyperparameters. Finally, experiments on the U.S. National Vulnerability Database (NVD) show that char-word feature fusion, by combining word and character features from vulnerability descriptions, achieves higher accuracy when classifying unknown vulnerabilities. Compared with other ensemble learning algorithms, LightGBM tuned by Bayesian optimization makes fuller use of LightGBM's strengths and improves the accuracy of vulnerability characteristic assessment.
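Two of the steps named in the abstract, fusing character-level with word-level features and splitting the data by year so that validation data never precedes training data, can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' implementation: the function names, the `w:`/`c:` prefixes, the 3-to-4 character n-gram range, and the sample descriptions are all assumptions made for illustration.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9]+")


def word_features(text):
    """Lowercased word tokens, prefixed so they cannot collide with char n-grams."""
    return ["w:" + tok for tok in TOKEN_RE.findall(text.lower())]


def char_features(text, n_min=3, n_max=4):
    """Character n-grams taken inside each word, with spaces marking word boundaries."""
    feats = []
    for tok in TOKEN_RE.findall(text.lower()):
        padded = f" {tok} "
        for n in range(n_min, n_max + 1):
            feats.extend("c:" + padded[i:i + n] for i in range(len(padded) - n + 1))
    return feats


def fused_features(text):
    """Fuse both views into one bag of features (feature -> count)."""
    return Counter(word_features(text) + char_features(text))


def time_based_splits(years, n_train_years=1):
    """Yield (train_idx, val_idx) pairs: train on the earliest k years,
    validate on the year that follows, so no future data leaks into training."""
    ordered = sorted(set(years))
    for k in range(n_train_years, len(ordered)):
        train_years = set(ordered[:k])
        val_year = ordered[k]
        train_idx = [i for i, y in enumerate(years) if y in train_years]
        val_idx = [i for i, y in enumerate(years) if y == val_year]
        yield train_idx, val_idx


if __name__ == "__main__":
    # Hypothetical descriptions: "SQLi" is a term unseen at the word level.
    old = fused_features("SQL injection in login form")
    new = fused_features("SQLi in admin panel")
    print("w:sqli" in old)                # the word itself is new
    print(sorted(set(old) & set(new)))    # overlap comes largely from char n-grams

    for train_idx, val_idx in time_based_splits([2018, 2019, 2019, 2020, 2021]):
        print(train_idx, val_idx)
```

The point of the sketch is that a previously unseen term such as "SQLi" has no word-level match in older descriptions but still shares character n-grams with "SQL", which is the kind of overlap char-word fusion is meant to exploit. In a fuller pipeline the fused counts would be vectorized and passed to a classifier such as LightGBM, one model per vulnerability characteristic.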
Authors: ZHANG Zhe; WANG Yong (School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200120, China)
Source: Microelectronics & Computer, 2023, No. 7, pp. 27-35 (9 pages)
Funding: National Natural Science Foundation of China (61772327); Natural Science Foundation of Shanghai (20ZR1455900).
Keywords: natural language processing; time-based cross-validation; LightGBM; ensemble learning; Bayesian optimization; vulnerability assessment