Abstract
Traditional Named Entity Recognition (NER) methods based on word vector representations usually neglect character semantics, the positional information between characters, and the dependence between characters and words. To address this problem, this paper proposes an NER method for Chinese tourism texts based on a Word-Character Guided Attention Network (WCGAN). The method uses a Word Guided Attention Network (WGAN) to obtain the sequence information between words and to capture key word information, and a Character Guided Attention Network (CGAN) to capture character semantics and the positional information between characters. Together these strengthen the relevance and complementarity between words and characters, enabling the recognition of named entities in Chinese tourism texts. Experimental results on the ResumeNER and TourismNER benchmark datasets show that WCGAN achieves F values of 93.491% and 92.860%, respectively, outperforming Bi-LSTM+CRF, Char-Dense, and other methods.
Authors
西尔艾力·色提
艾山·吾买尔
王路路
吐尔根·依布拉音
马喆康
买合木提·买买提
Xieraili Seti; Aishan Wumaier; WANG Lulu; Tuergen Yibulayin; MA Zhekang; Maihemuti Maimaiti (College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China; Xinjiang Key Laboratory of Multi-language Information Technology, Xinjiang University, Urumqi 830046, China; College of Software, Xinjiang University, Urumqi 830046, China)
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core Journals (北大核心)
2021, Issue 2, pp. 39-45 (7 pages)
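Funding
National Natural Science Foundation of China (61262060, 61662077)
National Key Research and Development Program of China (2017YFB1002103)
Open Project of the Key Laboratory of Xinjiang Uygur Autonomous Region (2018D04019)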
Keywords
Named Entity Recognition (NER)
Character Guided Attention Network (CGAN)
Word Guided Attention Network (WGAN)
character semantics
information complementarity
location information