
Named Entity Recognition in Threat Intelligence Domain Based on Deep Learning
Cited by: 2
Abstract To extract key information from threat intelligence of diverse origins and to help government regulators carry out security risk assessment, this paper addresses two obstacles to recognition in threat intelligence text: heavy mixing of Chinese and English, and rare specialized vocabulary. Building on the BiGRU-CRF model, it proposes a threat intelligence named entity recognition (NER) method that integrates boundary features with an iterated dilated convolutional neural network (IDCNN). Using a manually constructed rule dictionary, the method first transforms entities with clear boundaries, such as English words, to reduce the information loss that the model tends to incur on long texts. IDCNN and a bidirectional gated recurrent unit (BiGRU) then extract the local and global features of the text, respectively. Experiments on a threat intelligence corpus show that the proposed model outperforms the compared models on the relevant evaluation metrics, with an F-score of 87.4%.
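The boundary-feature step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the rule dictionary, so the regular expression `CLEAR_BOUNDARY`, the function names, and the sample sentence are all assumptions made for illustration. The sketch keeps each run of Latin letters and digits as one unit and flags it, which shortens the sequence the model must process. The second helper computes the receptive field of stacked dilated convolutions; the dilation pattern `[1, 1, 2]` and the four iterations are likewise assumed, not taken from the paper.

```python
import re

# Hypothetical rule: a maximal run of ASCII letters, digits, and a few joiner
# characters counts as one unit with a clear boundary. The paper's actual
# rule dictionary is not described in the abstract.
CLEAR_BOUNDARY = re.compile(r"[A-Za-z0-9][A-Za-z0-9._\-]*")


def tokenize_with_boundaries(text):
    """Split mixed Chinese/English text into (token, boundary_flag) pairs.

    boundary_flag is 1 for a rule-matched unit kept whole and 0 for a
    single character emitted on its own.
    """
    tokens = []
    pos = 0
    for match in CLEAR_BOUNDARY.finditer(text):
        # Characters before the match are emitted one by one.
        for ch in text[pos:match.start()]:
            if not ch.isspace():
                tokens.append((ch, 0))
        tokens.append((match.group(), 1))
        pos = match.end()
    # Characters after the last match.
    for ch in text[pos:]:
        if not ch.isspace():
            tokens.append((ch, 0))
    return tokens


def receptive_field(dilations, kernel_size=3):
    """Receptive field, in tokens, of stacked dilated 1-D convolutions."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Made-up sample sentence, not drawn from the paper's corpus.
sample = "攻击者利用CVE-2017-0144传播WannaCry勒索软件"
tokens = tokenize_with_boundaries(sample)

print(len(sample), "characters ->", len(tokens), "tokens")
print(receptive_field([1, 1, 2]))      # one assumed block
print(receptive_field([1, 1, 2] * 4))  # the same block applied four times
```

Under these assumptions the 32-character sample shrinks to 13 tokens, and four iterations of the block cover 33 tokens, so a convolutional layer stack sees more of a sentence once long English strings are collapsed.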
Authors WANG Ying (王瀛); WANG Ze-hao (王泽浩); LI Hong (李红); HUANG Wen-jun (黄文军). Affiliations: Henan International Joint Laboratory of Theories and Key Technologies on Intelligence Networks, Henan University, Kaifeng 475001, China; Subject Innovation and Intelligence Introduction Base of Henan Higher Educational Institution-Intelligent Information Processing Innovation and Intelligence Introduction Base of Henan University Software Engineering, Henan University, Kaifeng 475001, China; Institute of Intelligence Networks System, Henan University, Kaifeng 475001, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100049, China
Source Journal of Northeastern University (Natural Science) (《东北大学学报(自然科学版)》), 2023, No. 1, pp. 33-39 (7 pages). Indexed in: EI, CAS, CSCD, PKU Core (北大核心).
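Funding Natural Science Foundation of Henan Province (182300410164); Henan University Graduate Education Innovation and Quality Improvement Program, Talents Program (No. SYL19060120); National Natural Science Foundation of China, Youth Fund (61702503, 61802016); National Natural Science Foundation of China, Key Program (Y810021104).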
Keywords threat intelligence; dilated convolution; named entity recognition (NER); information extraction; deep learning
