A Relation Extraction Method for Chinese Text Based on Few-Shot Learning
Abstract Entity relation extraction, a core task of text mining and information extraction, is a key step in knowledge graph construction. However, manually building large-scale labeled datasets is time-consuming and labor-intensive. Few-shot learning for relation extraction needs only a small number of sample instances for the model to learn to distinguish between relation types, which eases the annotation burden posed by large amounts of unlabeled data. This paper reconstructs the Chinese relation extraction dataset FinRE to make it suitable for few-shot learning and introduces the semantic relation network HowNet to classify entities semantically with greater precision. On this basis, a dual-attention mechanism is used to improve the quality of sentence encoding, which improves the model's performance on noisy data and mitigates the impact of long-tailed relations. Evaluated on this Chinese dataset, the prototype network with sentence-level and entity-level attention improves extraction accuracy by 1% to 2% over the original prototype network.
Authors JI Yimu; ZHANG Wang; LIU Qiang; LIU Shangdong; HONG Cheng; QIU Chenyang; ZHU Jinseng; HUI Yan; XIAO Wan (School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; Institute of High Performance Computing and Bigdata, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; Nanjing Center of HPC China, Nanjing 210023, China; Jiangsu Research Engineering of HPC and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; College of Educational Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source Journal of Nanjing University of Posts and Telecommunications: Natural Science Edition (Peking University Core Journal), 2023, No. 4, pp. 64-71 (8 pages)
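Funding Supported by the National Key R&D Program of China (2018AAA0103300, 2018AAA0103302); the Key R&D Program of Jiangsu Province (SBE2023020143); a cooperation project with 中电鸿信信息科技有限公司; the Natural Science Foundation of Jiangsu Province (BK20170900); the Major Natural Science Research Projects of Jiangsu Higher Education Institutions (19KJB520046, 20KJA520001); the Jiangsu Innovation and Entrepreneurship Talent Program; the Jiangsu Postdoctoral Fund (2019K024); the Jiangsu Postdoctoral Research and Practice Innovation Program (KYCX19_0921, KYCX19_0906); the Dingshan Talent Training Program of Nanjing University of Posts and Telecommunications; and the Scientific Research Start-up Fund for Introduced Talents of Nanjing University of Posts and Telecommunications (NY219132).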
Keywords few-shot learning; relation extraction; BERT; HowNet; attention mechanism
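The abstract compares an original prototype network with one that adds attention. The sketch below illustrates that general idea only and is not the authors' implementation. It assumes sentence embeddings are already available (toy 2-D vectors stand in for BERT encodings), and the dot-product form of the sentence-level attention, along with the function names `mean_prototypes`, `attentive_prototypes`, and `classify`, are hypothetical choices made for illustration. Entity-level attention and HowNet features are not modeled here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mean_prototypes(support):
    """Plain prototype network: each relation prototype is the mean of
    its K support embeddings. support has shape (N, K, D)."""
    return support.mean(axis=1)


def attentive_prototypes(support, query):
    """Assumed sentence-level attention: weight each support sentence by
    its dot-product similarity to the query, so that support sentences
    unlike the query (possibly noisy ones) contribute less."""
    scores = support @ query                 # (N, K)
    weights = softmax(scores, axis=1)        # (N, K), rows sum to 1
    return (weights[..., None] * support).sum(axis=1)  # (N, D)


def classify(prototypes, query):
    """Return the index of the prototype nearest to the query
    (squared Euclidean distance)."""
    dists = ((prototypes - query) ** 2).sum(axis=1)
    return int(dists.argmin())


# Toy 2-way 3-shot episode; the third sentence of relation 0 is a
# deliberately noisy example.
support = np.array([
    [[1.0, 0.0], [0.9, 0.1], [-1.0, 1.0]],   # relation 0
    [[0.0, 1.0], [0.1, 0.9], [0.0, 1.1]],    # relation 1
])
query = np.array([1.0, 0.05])

mean_protos = mean_prototypes(support)
attn_protos = attentive_prototypes(support, query)
predicted = classify(attn_protos, query)
```

In this toy episode both variants pick relation 0, but the attention-weighted prototype for relation 0 lies closer to the query than the plain mean does, because the noisy support sentence receives a small weight. This is one way to read the abstract's claim about robustness to noisy data.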