
Chinese Text Relation Extraction with Multi-feature Fusion and an Attention Mechanism (cited by: 3)

Research on Extraction Method of Text Entity Relations Based on Multi-feature Fusion and Attention Mechanism
Abstract: Data sparsity and noise propagation are the main difficulties in Chinese relation extraction. To alleviate the data sparsity of traditional word features, we propose fusing multiple features when organizing the text representation, namely position features, shortest dependency path features, and N-gram features, and raising the weight of the key features. This combined feature set also reduces noise propagation in the text and makes syntactic features more reliable under sparse data. In addition, an attention mechanism is added to the conventional bidirectional LSTM network so that the model attends to the more important features and noise has less influence on the extraction task. Experiments on a public person-relation corpus show that the proposed method performs well on Chinese text relation extraction and offers methodological support for fields such as information extraction and knowledge graphs.
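The abstract names three ingredients without giving formulas: position features, N-gram features, and attention over the BiLSTM outputs. The sketch below illustrates one common reading of each, in plain Python. The function names, the signed-distance form of the position feature, and the dot-product scoring against a query vector are all assumptions made for illustration; the paper's actual definitions may differ, and the shortest dependency path feature is omitted because it needs a parser.

```python
import math
from typing import List, Sequence, Tuple


def relative_positions(num_tokens: int, entity_index: int) -> List[int]:
    """Signed distance from every token to one entity mention (assumed form)."""
    return [i - entity_index for i in range(num_tokens)]


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def attention_pool(
    hidden: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Softmax-weighted sum of hidden states.

    Each hidden state is scored by its dot product with `query`
    (a hypothetical learned vector); the scores are normalized with
    a numerically stable softmax and used to average the states.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden]
    peak = max(scores)  # subtract the max before exponentiating
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden))
        for d in range(dim)
    ]
    return weights, pooled


if __name__ == "__main__":
    # Toy segmented sentence; the two entities sit at indices 0 and 4.
    tokens = ["张三", "的", "妻子", "是", "李四"]
    print(relative_positions(len(tokens), 0))
    print(relative_positions(len(tokens), 4))
    print(ngrams(tokens, 2))

    # Stand-in values for BiLSTM outputs, one 2-d vector per token.
    hidden = [[0.1, 0.0], [0.0, 0.1], [0.9, 0.8], [0.2, 0.1], [0.1, 0.2]]
    weights, pooled = attention_pool(hidden, query=[1.0, 1.0])
    print(weights, pooled)
```

In a full model the position and N-gram features would be embedded and concatenated with the word embeddings before the BiLSTM, and the pooled vector would feed a relation classifier. Those steps are left out here because the abstract does not specify them.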
Authors: 陈振彬 (CHEN Zhenbin); 叶颖雅 (YE Yingya); 冯浩男 (FENG Haonan); 李明轩 (LI Mingxuan); 陈珂 (CHEN Ke). College of Computer Science and Technology, Guangdong University of Petrochemical Technology, Maoming 525000, China
Source: Journal of Guangdong University of Petrochemical Technology (《广东石油化工学院学报》), 2019, No. 4, pp. 36-40 (5 pages)
Funding: Natural Science Foundation of Guangdong Province (2016A030307049, 2018A030307032); Guangdong Provincial Special Fund for Discipline and Specialty Construction in Higher Education Institutions (2016KTSCX090); College Students' Innovation and Entrepreneurship Training and Cultivation Program (733013, 733435, 733437)
Keywords: dependency parsing; N-gram; relation extraction; BiLSTM; attention mechanism
