Abstract
General-domain Named Entity Recognition (NER) models struggle to delimit entities completely and accurately in red culture texts. To address this problem, an improved algorithm for red culture NER is proposed: the Red Learning BiLSTM (RLBiLSTM) network, which builds on a Bi-directional Long Short-Term Memory (BiLSTM) network and combines it with vocabulary enhancement and an attention mechanism. First, important words in the red culture dataset are processed to construct a vocabulary containing red culture features, and the vocabulary information is fused with the lower-layer information of BERT. Then, a BiLSTM network and an attention mechanism are used to capture contextual and global information. Finally, a Conditional Random Field (CRF) performs entity recognition. Experiments show that the improved algorithm achieves good recognition results on the RedCulture-1 dataset and higher accuracy than traditional algorithms, which helps to solve the entity recognition problem for red culture.
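The final step in the abstract, entity recognition with a conditional random field, is commonly implemented by Viterbi decoding over per-token tag scores. The sketch below illustrates that decoding step only and is not the authors' implementation: the function name `viterbi_decode`, the tag set (`O`, `B-LOC`, `I-LOC`), and all scores are hypothetical values chosen for illustration. In a full model the emission scores would come from the BiLSTM and attention layers and the transition scores would be learned.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag] is the score of `tag` at position t.
    transitions[(prev, cur)] is the score of moving from prev to cur;
    pairs that are missing default to 0.0 (an assumption of this sketch).
    """
    if not emissions:
        return []

    # best[t][tag]: best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    # back[t-1][tag]: previous tag on that best path.
    back = []

    for t in range(1, len(emissions)):
        scores = {}
        pointers = {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-token sentence.
tags = ["O", "B-LOC", "I-LOC"]
transitions = {
    ("O", "I-LOC"): -10.0,   # an entity cannot continue without a B- tag
    ("B-LOC", "I-LOC"): 1.0,
    ("I-LOC", "I-LOC"): 0.5,
}
emissions = [
    {"O": 2.0, "B-LOC": 0.5, "I-LOC": 0.0},
    {"O": 0.5, "B-LOC": 1.0, "I-LOC": 1.2},
    {"O": 0.2, "B-LOC": 0.1, "I-LOC": 1.5},
]
path = viterbi_decode(emissions, transitions, tags)
print(path)  # ['O', 'B-LOC', 'I-LOC']
```

Picking the best tag per token independently would give `O, I-LOC, I-LOC` here, an entity with no beginning tag. The transition penalty steers the decoder to a well-formed sequence, which is the usual motivation for placing a CRF on top of a BiLSTM when entity boundaries matter.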
Authors
淳鑫
冯玲
李航
CHUN Xin; FENG Ling; LI Hang (School of Computer and Information Engineering, Nanning Normal University, Nanning 530100, China; School of Environmental and Life Engineering, Nanning Normal University, Nanning 530100, China; School of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China)
Source
Radio Communications Technology (《无线电通信技术》)
2023, Issue 4, pp. 622-628 (7 pages)
Funding
National Natural Science Foundation of China (61671252).