A Study on Kernel-based Chinese Relation Extraction (基于核方法的中文实体关系抽取研究)

Cited by: 18


Abstract: Named entity relation extraction is one of the important research topics in information extraction. This paper examines the effectiveness of kernel methods for Chinese relation extraction on the ACE 2007 corpus, in three parts. First, for the convolution tree kernel, it studies how the choice of parse tree span affects relation extraction performance. Second, it builds composite kernels that combine the convolution tree kernel with feature-based (flat) kernels, to examine how far tree kernels and flat kernels complement each other. Third, it improves the shortest path dependency kernel by computing the kernel over the longest common subsequence of two shortest dependency paths, which removes the original kernel's overly strict requirement that the two paths have the same length. When kernel methods were first applied to English relation extraction, the F1 score was only about 40%. Our experiments on the ACE 2007 standard corpus show that, using only the convolution kernel over parse trees, Chinese relation extraction reaches an F1 score of 35%, which indicates that the convolution kernel approach is also effective for Chinese relation extraction. The experiments also show that the shortest path dependency kernel gives no clear benefit for Chinese relation extraction.
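The abstract only names the change to the shortest path dependency kernel, so the following is a minimal sketch of the idea rather than the authors' formulation. It assumes each path position is a set of string features, that the strict kernel multiplies the number of shared features per position and returns 0 for paths of different length, and that the relaxed variant aligns the two paths by a longest common subsequence in which two positions match when they share at least one feature. The function names, the matching rule, and the toy paths are all illustrative assumptions. The sketch does not check whether the relaxed score is positive semi-definite.

```python
from typing import List, Sequence, Set, Tuple

Path = Sequence[Set[str]]  # one feature set per position on a dependency path


def common(a: Set[str], b: Set[str]) -> int:
    """Number of features shared by two path positions."""
    return len(a & b)


def strict_path_kernel(x: Path, y: Path) -> int:
    """Same-length kernel: product of shared-feature counts, 0 if lengths differ."""
    if len(x) != len(y):
        return 0
    k = 1
    for a, b in zip(x, y):
        k *= common(a, b)
    return k


def lcs_alignment(x: Path, y: Path) -> List[Tuple[int, int]]:
    """Align two paths by a longest common subsequence.

    Assumption: two positions match when they share at least one feature.
    """
    n, m = len(x), len(y)
    # dp[i][j] = LCS length of the suffixes x[i:] and y[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if common(x[i], y[j]) > 0:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Trace the table forward to recover the aligned index pairs.
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < n and j < m:
        if common(x[i], y[j]) > 0:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def lcs_path_kernel(x: Path, y: Path) -> int:
    """Relaxed score: product of shared-feature counts over LCS-aligned positions."""
    pairs = lcs_alignment(x, y)
    if not pairs:
        return 0
    k = 1
    for i, j in pairs:
        k *= common(x[i], y[j])
    return k


# Toy paths (hypothetical): word, part-of-speech and entity-type features.
x = [{"protesters", "NNS", "PERSON"}, {"->"}, {"seized", "VBD"}, {"<-"},
     {"stations", "NNS", "FACILITY"}]
y = [{"troops", "NNS", "PERSON"}, {"->"}, {"raided", "VBD"}, {"<-"},
     {"churches", "NNS", "FACILITY"}]
z = [{"protesters", "NNS", "PERSON"}, {"->"}, {"began", "VBD"}, {"->"},
     {"seizing", "VBG"}, {"<-"}, {"stations", "NNS", "FACILITY"}]

print(strict_path_kernel(x, y), strict_path_kernel(x, z), lcs_path_kernel(x, z))
```

In this toy setting the strict kernel gives 0 for `x` and `z` because their lengths differ, while the LCS-based score still credits the positions the two paths have in common.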

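Authors: Huang Ruihong (黄瑞红), Sun Le (孙乐), Feng Yuanyong (冯元勇), Huang Yunping (黄云平)

Institution: Institute of Software, Chinese Academy of Sciences

Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2008, Issue 5, pp. 102-108 (7 pages)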

Funding: National Natural Science Foundation of China (60773027, 60736044); National 863 Program key project (2006AA010108)

Keywords: computer application; Chinese information processing; Chinese relation extraction; kernel-based methods; convolution tree kernel; composite kernels; shortest path dependency kernel

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


