Abstract
To make the management of classified and sensitive information more intelligent, this paper proposes a BERT-BGRU-CRF deep learning method for automatically recognizing such information. The method first uses the BERT model to preprocess the text, then uses a Bidirectional Gated Recurrent Unit (BGRU) model to capture contextual semantic features, and finally feeds the extracted features into a Conditional Random Field (CRF) model for sequence labeling to obtain the optimal solution. Experiments on a self-built dataset show that the proposed method achieves higher precision, recall, and F1 scores than three baseline recognition methods (BERT-CRF, BERT-LSTM-CRF, and BERT-BiLSTM-CRF), indicating that it is suitable for the intelligent recognition of classified and sensitive information.
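The final step of the pipeline, in which a CRF layer selects the best tag sequence, is commonly carried out with Viterbi decoding. The following is a minimal sketch of that step only, not the paper's implementation. The tag set (`O`, `B-SEN`, `I-SEN`), the emission scores, and the transition scores are hypothetical values chosen for illustration; in a trained model the emissions would come from the BGRU layer over BERT representations and the transitions would be learned.

```python
from typing import List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    tags: Sequence[str],
) -> Tuple[List[str], float]:
    """Return the highest-scoring tag sequence and its total score.

    emissions[t][j]   : score of tag j at token position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(tags)
    score = list(emissions[0])          # best score of any path ending in each tag
    backptrs: List[List[int]] = []      # best previous tag for each (position, tag)

    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for j in range(n_tags):
            # Best previous tag i for landing on tag j at position t.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptrs.append(best_i)
        score = new_score
        backptrs.append(ptrs)

    # Trace back from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    best_score = score[best]
    path = [best]
    for ptrs in reversed(backptrs):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return [tags[i] for i in path], best_score


# Hypothetical tag set and scores (assumptions for illustration only).
tags = ["O", "B-SEN", "I-SEN"]
emissions = [
    [2.0, 1.0, 0.0],   # token 0: per-token argmax would pick "O"
    [0.5, 0.4, 2.0],   # token 1: per-token argmax would pick "I-SEN"
    [1.5, 0.0, 0.2],   # token 2: per-token argmax would pick "O"
]
transitions = [
    [0.0, 0.0, -1e4],  # from "O": moving to "I-SEN" is heavily penalized
    [0.0, 0.0, 0.0],   # from "B-SEN"
    [0.0, 0.0, 0.0],   # from "I-SEN"
]

path, total = viterbi_decode(emissions, transitions, tags)
print(path, total)
```

Picking the top-scoring tag for each token independently would give `O, I-SEN, O` here, which starts an entity without a `B-SEN` tag. Scoring whole sequences with transition terms instead yields `B-SEN, I-SEN, O`, which illustrates why a CRF layer is placed on top of the recurrent encoder.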
Author
ZENG Qingrui (曾庆瑞), AECC Guiyang Engine Design Research Institute, Guiyang 550081, China
Source
Modern Information Technology (《现代信息科技》), 2024, No. 11, pp. 171-175 (5 pages)
Keywords
sensitive information recognition
Deep Learning
Gated Recurrent Unit
BERT
Conditional Random Field