Abstract
Unstructured military text contains many target entities that carry rich military information and knowledge. Identifying them accurately is a fundamental task for military information extraction and knowledge organization, and an important step in building a military knowledge graph. To address the shortage of annotated data in the military domain and the fuzzy boundaries of military entities, a deep learning recognition method based on the pre-trained BERT model is proposed. The method uses BERT to generate dynamic character vectors conditioned on the current input context, which enhances the semantic representation of each character. It then fuses each character's boundary-aware part-of-speech feature to obtain a fused feature vector, which is fed into a BiLSTM-CRF network. Experiments on a self-built annotated military dataset show that the method outperforms two baseline methods in accuracy, recall, and F-value.
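The abstract does not say how the boundary-aware part-of-speech feature is encoded or fused. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes a B/I/E/S prefix scheme over word-segmented, POS-tagged input and assumes fusion by simple concatenation. The function names `boundary_pos_tags` and `fuse_features`, the tag format, and the sample sentence are all hypothetical.

```python
from typing import List, Sequence, Tuple


def boundary_pos_tags(words: Sequence[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Expand (word, pos) pairs into per-character (char, boundary-pos) pairs.

    Assumed scheme (not stated in the abstract):
      S-pos  single-character word
      B-pos  first character of a multi-character word
      I-pos  interior character
      E-pos  last character
    """
    tagged: List[Tuple[str, str]] = []
    for word, pos in words:
        if not word:
            continue  # skip empty tokens produced by a segmenter
        if len(word) == 1:
            tagged.append((word, f"S-{pos}"))
            continue
        last = len(word) - 1
        for i, ch in enumerate(word):
            prefix = "B" if i == 0 else "E" if i == last else "I"
            tagged.append((ch, f"{prefix}-{pos}"))
    return tagged


def fuse_features(char_vec: Sequence[float], feat_vec: Sequence[float]) -> List[float]:
    """Fuse a character vector with its feature vector by concatenation.

    Concatenation is an assumption; the abstract only says the two are fused.
    """
    return list(char_vec) + list(feat_vec)


# Hypothetical segmented and POS-tagged input: "歼击机" (noun), "起飞" (verb).
sample = [("歼击机", "n"), ("起飞", "v")]
for ch, tag in boundary_pos_tags(sample):
    print(ch, tag)
```

In a full model along these lines, each boundary-POS tag would be mapped to an embedding, fused with the BERT output for the same character, and the resulting sequence passed to the BiLSTM-CRF layers.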
Authors
张乐
李健
唐亮
易绵竹
ZHANG Le; LI Jian; TANG Liang; YI Mianzhu (Luoyang Campus, Information Engineering University, Luoyang 471003, China)
Source
《信息工程大学学报》
2021, No. 3, pp. 331-337 (7 pages)
Journal of Information Engineering University