Abstract
Knowledge graph question answering has recently become a focus of research and application in both academia and industry, but traditional methods are often inefficient and make insufficient use of the available data. To address these problems, this paper divides Chinese knowledge graph question answering into two subtasks: entity extraction and property selection. A bidirectional long short-term memory conditional random field (Bi-LSTM-CRF) model is used for entity recognition, and a property selection model based on multi-granularity feature representation is proposed. The model embeds questions and properties at both the character level and the word level and encodes them with an encoder; for properties, it also introduces one-hot encoding information. By combining text representations of different granularities and computing the similarity between questions and properties, the system achieves an F1 score of 73.96% on the NLPCC-ICCPOL 2016 KBQA dataset and performs the knowledge graph question answering task well.
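The property selection step can be illustrated with a minimal sketch. It is not the authors' model: the learned embeddings and encoder are swapped for plain bag-of-token count vectors, the one-hot property information and the Bi-LSTM-CRF entity step are omitted, and the function names and the `char_weight` blending parameter are hypothetical. It only shows how character-level and word-level similarity between a question and candidate properties might be combined, assuming the question has already been segmented into words and the entity mention removed.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def multi_granularity_score(question_words, property_words, char_weight=0.5):
    """Blend character-level and word-level similarity.

    char_weight is a hypothetical mixing weight, not a value from the paper.
    """
    # Word level: inputs are assumed to be pre-segmented word sequences.
    word_sim = cosine(Counter(question_words), Counter(property_words))
    # Character level: flatten each word sequence into its characters.
    char_sim = cosine(Counter("".join(question_words)),
                      Counter("".join(property_words)))
    return char_weight * char_sim + (1.0 - char_weight) * word_sim


def select_property(question_words, candidates):
    """Return the candidate property (a tuple of words) with the highest score."""
    return max(candidates,
               key=lambda prop: multi_granularity_score(question_words, prop))


# Question "... 的 出生 日期 是 什么" (what is the date of birth of ...),
# with the entity mention already removed.
question = ["的", "出生", "日期", "是", "什么"]
candidates = [("身高",), ("出生", "日期"), ("国籍",)]  # height, birth date, nationality
best = select_property(question, candidates)
```

The character-level view matters when segmentation differs between question and property: `multi_granularity_score(["出生地"], ["出生", "地点"])` has no word overlap, yet the shared characters still give a positive score.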
Authors
申存
黄廷磊
梁霄
SHEN Cun; HUANG Ting-lei; LIANG Xiao (School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China)
Source
《计算机与现代化》
2018, No. 9, pp. 5-10 (6 pages)
Computer and Modernization
Funding
National Natural Science Foundation of China (61725105, 61331017)
Keywords
knowledge graph
question answering system
entity extraction
property selection