
Learning Word Embeddings for Paraphrase Scoring in Knowledge Base Based Question Answering

Original title: 面向知识库问答中复述问句评分的词向量构建方法 (cited by 3)
Abstract: Conventional word embeddings are learned from the co-occurrence probabilities of words within the same sentence, using unsupervised training that is independent of any specific task. This paper proposes a method for constructing word embeddings under paraphrase constraints, in order to improve paraphrase question scoring based on word embeddings and the bag-of-words model in knowledge base (KB) based question answering (QA). First, question pairs that satisfy the paraphrase relation and question pairs that do not are collected from a database of question paraphrases according to designed rules. Inequalities between the similarities of these question pairs then represent semantic constraints at the sentence level, and these inequalities are added as a constraint term to the objective function for training word embeddings. Experimental results show that, compared with conventional word embedding methods, the proposed method improves the accuracy of both paraphrase scoring between questions and question answering in a KB-based QA system.
Authors: ZHAN Chendi (詹晨迪), LING Zhenhua (凌震华), DAI Lirong (戴礼荣) (National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei 23002)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2016, No. 9, pp. 825-831 (7 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
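Funding: Supported by the Anhui Provincial Science and Technology Research Program (No. 2014z02006) and the Fundamental Research Funds for the Central Universities (No. WK2350000001).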
Keywords: Knowledge Base Based Question Answering; Question Paraphrase; Word Embedding
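
The abstract describes adding similarity inequalities between paraphrase and non-paraphrase question pairs to the word embedding training objective, but it does not give the exact formulation. The following is only a minimal sketch of one way such a constraint term could look. Averaging word vectors as the bag-of-words sentence representation, cosine similarity, the hinge form with a `margin`, and the `weight` on the constraint term are all assumptions made for illustration, as are the function names and the toy vectors; none of them is taken from the paper.

```python
import math


def sentence_vector(tokens, embeddings):
    """Bag-of-words sentence vector: mean of the vectors of known tokens (assumed)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no known tokens in sentence")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def paraphrase_penalty(question, paraphrase, non_paraphrase, embeddings, margin=0.1):
    """Hinge penalty for the inequality sim(q, paraphrase) > sim(q, non_paraphrase).

    Returns zero when the paraphrase is more similar by at least `margin`.
    The hinge form and the margin are assumptions, not the paper's formula.
    """
    q = sentence_vector(question, embeddings)
    sim_pos = cosine(q, sentence_vector(paraphrase, embeddings))
    sim_neg = cosine(q, sentence_vector(non_paraphrase, embeddings))
    return max(0.0, margin - sim_pos + sim_neg)


def total_objective(unsupervised_loss, triples, embeddings, weight=1.0, margin=0.1):
    """Unsupervised embedding loss plus the weighted paraphrase constraint term."""
    penalty = sum(
        paraphrase_penalty(q, p, n, embeddings, margin) for q, p, n in triples
    )
    return unsupervised_loss + weight * penalty


if __name__ == "__main__":
    # Toy two-dimensional vectors, invented purely for illustration.
    toy = {
        "who": [0.2, 0.1],
        "wrote": [0.9, 0.1],
        "authored": [0.8, 0.2],
        "directed": [0.1, 0.9],
        "hamlet": [0.5, 0.5],
    }
    triple = (
        ["who", "wrote", "hamlet"],
        ["who", "authored", "hamlet"],
        ["who", "directed", "hamlet"],
    )
    print(paraphrase_penalty(*triple, toy))
```

In a real training loop the penalty would be differentiated with respect to the word vectors and minimized together with the unsupervised loss; this sketch only evaluates the objective value.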

