Vietnamese Chunk Identification Incorporating Attention Mechanism (cited by: 1)
Abstract: For the Vietnamese chunk identification task, building on an earlier statistical survey of the part-of-speech composition patterns inside Vietnamese chunks, this paper proposes two ways to integrate an attention mechanism into the Bi-LSTM+CRF model. The first integrates attention at the input layer, which lets the model flexibly adjust the respective weights of the input word embeddings and POS feature embeddings. The second adds a multi-head attention mechanism on top of the Bi-LSTM, which lets the model learn a weight matrix over the Bi-LSTM outputs and selectively focus on important information. Experimental results show that integrating attention at the input layer raises the F-value of chunk identification by 3.08%, and that adding multi-head attention on top of the Bi-LSTM raises it by 4.56%, demonstrating the effectiveness of both methods.
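The two attention variants named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the scoring function, the `query` vector, and the omission of learned projection matrices are all assumptions made for illustration. The first function weights a word embedding against a POS embedding before concatenation; the second applies scaled dot-product self-attention, split into heads, to a sequence of vectors standing in for Bi-LSTM outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def input_attention(word_vec, pos_vec, query):
    """Weight a word embedding and a POS embedding, then concatenate them.

    `query` is a hypothetical learned vector (fixed here for illustration);
    both embeddings are assumed to share its dimension.
    """
    weights = softmax([dot(query, word_vec), dot(query, pos_vec)])
    fused = [weights[0] * x for x in word_vec] + [weights[1] * x for x in pos_vec]
    return weights, fused


def multi_head_self_attention(states, num_heads):
    """Scaled dot-product self-attention over a sequence, split into heads.

    Learned projection matrices are omitted (identity projections are
    assumed), so only the head splitting and weighting are shown.
    """
    dim = len(states[0])
    assert dim % num_heads == 0, "hidden size must be divisible by num_heads"
    head_dim = dim // num_heads
    outputs = [[] for _ in states]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        heads = [s[lo:hi] for s in states]  # this head's slice of every state
        for i, q in enumerate(heads):
            # Attention weights of position i over all positions.
            weights = softmax([dot(q, k) / math.sqrt(head_dim) for k in heads])
            context = [sum(w * v[j] for w, v in zip(weights, heads))
                       for j in range(head_dim)]
            outputs[i].extend(context)  # concatenate heads per position
    return outputs


# Toy inputs, invented for illustration only.
weights, fused = input_attention([1.0, 0.0], [0.0, 1.0], query=[1.0, 0.0])
contexts = multi_head_self_attention(
    [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]],
    num_heads=2,
)
```

In a trained model the query vector and the per-head projections would be learned parameters, and the attended outputs would be passed on to the CRF layer; the sketch only shows the weighting arithmetic.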
Authors: WANG Wenhui (王闻慧); BI Yude (毕玉德); LEI Shujie (雷树杰). Affiliations: Luoyang Division, Information Engineering University, Luoyang, Henan 471003, China; College of Foreign Language and Literature, Fudan University, Shanghai 200433, China
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2019, No. 12, pp. 91-100 (10 pages)
Keywords: Vietnamese; chunk identification; Bi-LSTM+CRF model; attention mechanism
