
Research on Chinese Named Entity Recognition Based on Complete Self-Attention Fusion and Multivariate Convolution (cited by: 1)
Abstract: With the expanding promotion of Chinese and the growing number of Chinese speakers, most data on the internet is in Chinese, yet judging a word's class and part of speech often requires the meaning of the whole sentence as context, and polysemy and lexical ambiguity frequently hinder information extraction. This study constructs a complete self-attention encoding mechanism and integrates it into the neural layers of a multivariate convolution network, exploring the performance mechanisms and influencing factors of Chinese named entity recognition in the model, and carries out performance analysis and simulation of the fused model. Experiments show that the fused model lies closer to the optimal point at the upper-right corner of the PR curve than conventional translation models; its running state stabilizes by the 50th training pass, less than half the number of passes the other models require. The complete self-attention mechanism raises the F-score of information extraction to 84.23, and after adding pseudo-data and radical-component interference, the hybrid model's extraction is not greatly affected, with the F-score still reaching 86.56; the overall model's running speed and convergence speed are good. Studying the fused model's effectiveness at recognizing Chinese helps improve machines' ability to distinguish and extract information, and provides new research directions for subsequent encoding-based information extraction and its improvement.
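The abstract describes fusing a complete self-attention encoder into multivariate (multi-kernel) convolution layers. As an illustration only, the toy sketch below combines single-head scaled dot-product self-attention with fixed mean-pooling "convolutions" of several window widths and concatenates their outputs; all function names, dimensions, and the pooling stand-in for learned kernels are assumptions, not the authors' implementation.

```python
# A minimal sketch (NOT the paper's implementation) of combining
# self-attention with multi-kernel 1D convolution over a token sequence.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """Scaled dot-product self-attention over X with shape (seq_len, d)."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)          # (seq_len, seq_len) similarities
    return softmax(scores, axis=-1) @ X    # context-mixed token representations

def multi_kernel_conv(X, kernel_sizes=(1, 3, 5)):
    """Mean-pooling windows of several widths, concatenated on the feature
    axis: a hypothetical stand-in for learned multivariate conv kernels."""
    seq_len, _ = X.shape
    outs = []
    for k in kernel_sizes:
        pad = k // 2
        Xp = np.pad(X, ((pad, pad), (0, 0)))          # same-length padding
        outs.append(np.stack([Xp[i:i + k].mean(axis=0)
                              for i in range(seq_len)]))
    return np.concatenate(outs, axis=-1)   # (seq_len, d * len(kernel_sizes))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))                # 6 tokens, 8-dim embeddings
fused = multi_kernel_conv(self_attention(X))
print(fused.shape)                          # (6, 24)
```

In this sketch the attention layer mixes sentence-level context into each token (addressing the whole-sentence disambiguation the abstract mentions) before the multi-width windows capture local character patterns.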
Authors: WANG Zongze (王宗泽); ZHANG Wubo (张吴波) (Department of Electronic Information, Hubei Institute of Automotive Industry, Shiyan, Hubei 442002, China)
Source: Journal of Jiamusi University (Natural Science Edition), CAS, 2022, No. 5, pp. 34-38 (5 pages)
Keywords: complete self-attention; multivariate convolution; Chinese named entity; encoding; information extraction

References: 15 · Secondary references: 57 · Co-citing documents: 189 · Documents cited alongside: 1 · Citing documents: 1
