FCG-NNER: A Chinese nested named entity recognition method fused with glyph information

Original title: FCG-NNER:一种融合字形信息的中文嵌套命名实体识别方法
Abstract Span-based models are the dominant approach to nested named entity recognition (NER); their core idea is to recast entity recognition as span classification. Chinese text, however, has no explicit word delimiters, which leaves semantic and boundary information ambiguous and degrades Chinese nested NER performance. To address this problem, this paper proposes FCG-NNER, a span-based Chinese nested NER algorithm that fuses glyph information. First, a convolutional neural network extracts glyph information from Chinese characters. Second, a cross-Biaffine decoding layer fuses the original text information with the glyph information. Third, a diagonal fusion CNN layer captures local interactions between different spans. Finally, the output of the cross-Biaffine decoding layer and the output of the diagonal fusion CNN layer are summed and fed into a fully connected layer to produce the final predictions. Two representative Chinese nested NER datasets, CMeEE and CLUENER2020, are used for experimental validation. FCG-NNER achieves a precision of 65.02%, a recall of 67.93%, and an F1 score of 0.6644 on CMeEE, and a precision of 79.45%, a recall of 82.33%, and an F1 score of 0.8086 on CLUENER2020, clearly outperforming the baselines for both datasets.
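The pipeline described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the layers, so every name, shape, and weight below is an assumption made for illustration. In particular, `glyph_h` is a random stand-in for the output of the glyph CNN, the "cross" biaffine is assumed to score text-start against glyph-end and glyph-start against text-end, and a plain same-padded 3x3 convolution over the span grid stands in for the diagonal fusion CNN layer. Weights are random, so the predicted spans are meaningless; only the data flow (biaffine scoring, local interaction, sum, fully connected layer, span classification) is the point.

```python
import numpy as np

def biaffine(h_start, h_end, U):
    """scores[i, j, c] = [h_start[i]; 1]^T  U[c]  [h_end[j]; 1] for every span (i, j)."""
    ones = np.ones((h_start.shape[0], 1))
    s = np.concatenate([h_start, ones], axis=1)   # (n, d+1)
    e = np.concatenate([h_end, ones], axis=1)     # (n, d+1)
    return np.einsum("ip,cpq,jq->ijc", s, U, e)   # (n, n, C)

def conv3x3(x, K):
    """Same-padded 3x3 convolution over the span grid.
    x: (n, n, C_in), K: (3, 3, C_in, C_out). Hypothetical stand-in for the
    diagonal fusion CNN layer, whose design the abstract does not give."""
    n = x.shape[0]
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((n, n, K.shape[-1]))
    for di in range(3):
        for dj in range(3):
            out += padded[di:di + n, dj:dj + n, :] @ K[di, dj]
    return out

def predict_spans(text_h, glyph_h, U_tg, U_gt, K, W, b):
    """Return (start, end, label) for spans whose predicted label is not 0 ("no entity")."""
    # Assumed form of the cross-biaffine fusion of text and glyph features.
    cross = biaffine(text_h, glyph_h, U_tg) + biaffine(glyph_h, text_h, U_gt)
    local = conv3x3(cross, K)                     # interactions between neighbouring spans
    logits = (cross + local) @ W + b              # sum, then fully connected layer
    labels = logits.argmax(axis=-1)               # (n, n)
    n = text_h.shape[0]
    # Only the upper triangle (start <= end) corresponds to valid spans.
    return [(i, j, int(labels[i, j]))
            for i in range(n) for j in range(i, n) if labels[i, j] != 0]

rng = np.random.default_rng(0)
n, d, C, num_labels = 6, 8, 4, 3                  # toy sizes, chosen arbitrarily
text_h = rng.normal(size=(n, d))                  # stand-in for encoder output
glyph_h = rng.normal(size=(n, d))                 # stand-in for glyph CNN output
U_tg = rng.normal(size=(C, d + 1, d + 1))
U_gt = rng.normal(size=(C, d + 1, d + 1))
K = rng.normal(size=(3, 3, C, C)) * 0.1
W = rng.normal(size=(C, num_labels))
b = np.zeros(num_labels)

spans = predict_spans(text_h, glyph_h, U_tg, U_gt, K, W, b)
print(spans)
```

Because every (start, end) pair is scored independently, overlapping and nested spans can receive entity labels at the same time, which is what makes span classification suitable for nested NER.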
Authors CHEN Peng (陈鹏); MA Hongbin (马洪彬); ZHOU Jialun (周佳伦); LI Linyu (李琳宇); YU Xiaosheng (余肖生) (Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443002, China; College of Computer and Information, China Three Gorges University, Yichang 443000, China)
Source Journal of Chongqing University of Technology: Natural Science (《重庆理工大学学报(自然科学)》), CAS, Peking University Core Journals, 2023, No. 12, pp. 222-231 (10 pages)
Funding Supported by the National Key Research and Development Program (2016YFC0802500).
Keywords Chinese nested named entity recognition; glyph (character-level) features; span classification; feature fusion