Chinese Knowledge Based Question Answering Based on Multi-feature Entity Disambiguation (基于多特征实体消歧的中文知识图谱问答)

Cited by: 6

Abstract: Question answering systems have been applied effectively in artificial intelligence, natural language processing, and information retrieval. Knowledge Based Question Answering (KBQA), an important branch of question answering, remains a challenging natural language processing task. Common Chinese KBQA systems, however, do not handle the entity disambiguation step of entity linking well. This paper proposes a Chinese KBQA system based on multi-feature entity disambiguation. The disambiguation model combines four features: the entity's own popularity, the semantic similarity between the question and the entity's relations, the character similarity between the question and the entity, and the semantic similarity between the question and the entity. The resulting gain in entity linking accuracy gives the question classification and optimal path selection components more accurate subject entities, which improves overall system performance. Experiments show that the system achieves an average F1 of 72.08% on the validation set of the CCKS2019-CKBQA evaluation data. Entity linking with the multi-feature disambiguation model reaches an accuracy of 90.84%, which is 6.35 percentage points higher than the popularity-based disambiguation model and 0.11 percentage points higher than the first-place system in the CCKS2019-CKBQA evaluation.
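The idea of scoring candidate entities with several features at once can be illustrated with a minimal sketch. It is not the authors' model: the record does not say how the features are computed or combined, so the equal weights, the `Candidate` type, the `score` and `disambiguate` functions, and the character-overlap stand-in for semantic similarity are all assumptions made for illustration. A real system would presumably use a learned semantic model and learned feature weights.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Sequence


@dataclass
class Candidate:
    """Hypothetical candidate entity for a mention in the question."""
    name: str
    popularity: float      # assumed to be pre-normalised to [0, 1]
    relations: List[str]   # relation names attached to the entity in the KB


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] via difflib's ratio."""
    return SequenceMatcher(None, a, b).ratio()


def overlap_similarity(a: str, b: str) -> float:
    """Jaccard overlap of character sets.

    Assumption: a crude stand-in for a learned semantic similarity model.
    """
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def score(question: str, cand: Candidate,
          weights: Sequence[float] = (0.25, 0.25, 0.25, 0.25)) -> float:
    """Weighted sum of the four features named in the abstract.

    The equal weights are an assumption, not values from the paper.
    """
    relation_sim = max(
        (overlap_similarity(question, r) for r in cand.relations),
        default=0.0,
    )
    features = (
        cand.popularity,                            # entity popularity
        relation_sim,                               # question vs. entity relations
        char_similarity(question, cand.name),       # question vs. entity (characters)
        overlap_similarity(question, cand.name),    # question vs. entity (semantic stand-in)
    )
    return sum(w * f for w, f in zip(weights, features))


def disambiguate(question: str, candidates: List[Candidate]) -> Candidate:
    """Return the highest-scoring candidate as the subject entity."""
    return max(candidates, key=lambda c: score(question, c))


if __name__ == "__main__":
    question = "苹果公司的创始人是谁"
    candidates = [
        Candidate("苹果", popularity=0.9, relations=["颜色", "产地"]),
        Candidate("苹果公司", popularity=0.6, relations=["创始人", "总部"]),
    ]
    print(disambiguate(question, candidates).name)
```

In this toy example the fruit sense has the higher popularity, so a popularity-only ranker would pick it, while the relation and character features favour the company sense. That is the kind of case the multi-feature combination is meant to address.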

Authors: ZHANG Pengju (张鹏举), JIA Yonghui (贾永辉), CHEN Wenliang (陈文亮) (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)

Affiliation: School of Computer Science and Technology, Soochow University

Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2022, Issue 2, pp. 47-54 (8 pages)

Funding: National Natural Science Foundation of China (61876115).

Keywords: entity linking; entity disambiguation; subject entity; Knowledge Based Question Answering (KBQA); question answering system; question classification; optimal path selection

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
