Abstract
Traditional named entity recognition (NER) methods, when applied to knowledge graph construction in the party-building domain, suffer from unclear entity boundaries and ambiguity in the part of speech of entity words, which lowers both recognition accuracy and efficiency. To address these problems, this paper proposes a BERT-BiLSTM-CRF entity recognition model that integrates tree-like probability with a domain dictionary. The model embeds the domain dictionary into BERT to produce vector representations of the text, uses BiLSTM to capture contextual semantic features, and applies tree-like probability to the transition probability calculation in the CRF layer to improve word segmentation accuracy. In comparative experiments against baseline models on the MSRA corpus and a self-constructed corpus, the proposed model achieves better results on all three metrics: F1-score, recall, and precision.
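To make the role of the CRF layer's transition scores concrete, the following is a minimal sketch of standard Viterbi decoding over per-token emission scores plus tag-to-tag transition scores. It is not the paper's method: the abstract does not define how the tree-like probability is computed, so the transition table here is simply set by hand. The function name `viterbi_decode`, the three-tag `B`/`I`/`O` scheme, and every number are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   tags: List[str]) -> List[str]:
    """Return the tag path with the highest total emission + transition score."""
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag]: previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores (assumption): in the paper's model the emissions would
# come from BERT + BiLSTM and the transitions would involve tree-like probability.
TAGS = ["B", "I", "O"]
TRANSITIONS = {
    "B": {"B": -1.0, "I": 1.0, "O": 0.0},
    "I": {"B": -1.0, "I": 0.5, "O": 0.0},
    "O": {"B": 0.0, "I": -100.0, "O": 0.5},  # O -> I is an invalid boundary
}
EMISSIONS = [
    {"B": 1.0, "I": 0.2, "O": 1.2},
    {"B": 0.1, "I": 2.0, "O": 0.3},
    {"B": 0.0, "I": 0.1, "O": 1.5},
]

greedy = [max(TAGS, key=e.get) for e in EMISSIONS]     # ['O', 'I', 'O']
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)  # ['B', 'I', 'O']
```

In this toy input, choosing the best tag per token independently yields `O, I, O`, which starts an entity in the middle. The transition penalty on `O -> I` makes the decoder prefer `B, I, O`, which is the kind of boundary correction a CRF layer contributes on top of the BiLSTM features.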
Authors
ZHAO Dun; SHE Xuebing; WU Changxing (Jinshan College, Fujian Agriculture and Forestry University, Fuzhou 350002, China; Jiangxi University of Technology, Nanchang 330098, China; East China Jiaotong University, Nanchang 330013, China)
Source
Computer and Modernization (《计算机与现代化》), 2024, No. 9, pp. 91-94 (4 pages)
Funding
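National Natural Science Foundation of China, Regional Science Fund Project (62266017)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ2202608)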