Abstract
This paper proposes a named entity recognition model, based on ELECTRA-CRF, for the text of telecommunication network fraud cases. First, the annotated corpus is fed into the ELECTRA model to obtain state transition features at character granularity. A CRF model then computes transition scores to determine the combination of entity labels for the character at the current position and the characters adjacent to it. Finally, the model is compared experimentally with the BERT-CRF and RoBERTa-CRF models. The results show that the proposed model is markedly more efficient to run than the other two deep learning models, with only a small loss in accuracy, recall, and harmonic mean (F1 score), so it is well suited to named entity recognition in telecommunication network fraud cases.
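The CRF step described in the abstract can be illustrated with a minimal Viterbi decoding sketch. It is not the authors' implementation. The label set (`O`, `B-PER`, `I-PER`), the per-character scores standing in for the ELECTRA output, and the transition matrix are all hypothetical values chosen for illustration. The sketch shows how transition scores between adjacent positions can override a per-character choice that would give an invalid label pair.

```python
from typing import List

# Hypothetical label set; the paper's actual tag scheme is not given in the abstract.
LABELS = ["O", "B-PER", "I-PER"]


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> List[int]:
    """Return the label-index sequence with the highest total score.

    emissions[t][j]   : score of label j for the character at position t
    transitions[i][j] : score of moving from label i to label j
    """
    if not emissions:
        return []
    num_labels = len(transitions)
    # score[j] is the best score of any path ending in label j at the current position.
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        step_ptr, new_score = [], []
        for j in range(num_labels):
            # Best previous label i for reaching label j at this position.
            best_i = max(range(num_labels),
                         key=lambda i: score[i] + transitions[i][j])
            step_ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        backpointers.append(step_ptr)
        score = new_score
    # Trace the best path backwards from the best final label.
    path = [max(range(num_labels), key=lambda j: score[j])]
    for step_ptr in reversed(backpointers):
        path.append(step_ptr[path[-1]])
    path.reverse()
    return path


# Illustrative (made-up) scores for a three-character input.
emissions = [
    [1.0, 0.8, 0.0],
    [0.1, 0.2, 1.5],
    [1.0, 0.0, 0.3],
]
# Rows are the previous label, columns the next label; O -> I-PER is heavily penalised.
transitions = [
    [0.5, 0.0, -10.0],
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.5],
]

best_path = viterbi_decode(emissions, transitions)
print([LABELS[k] for k in best_path])
```

Picking the top-scoring label for each character independently would give `O, I-PER, O`, which starts an entity with an inside tag. The transition penalty steers the joint decoding toward a consistent sequence instead.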
Authors
DING Jiawei (丁家伟); LIU Xiaodong (刘晓栋)
Affiliations: College of Investigation, People’s Public Security University of China, Beijing 100038, China; College of Public Security and Traffic Management, People’s Public Security University of China, Beijing 100038, China
Source
Netinfo Security (《信息网络安全》)
CSCD
Peking University Core Journal (北大核心)
2021, Issue 6, pp. 63-69 (7 pages)
Funding
National Key Research and Development Program of China [2020YFC1522600].