
Answer Extraction for Chinese Question-Answering Systems Based on Latent Semantic Analysis (基于潜在语义分析的汉语问答系统答案提取)

Cited by: 44
Abstract: When answers are extracted in a Chinese question-answering system, synonymy causes correct answers to be missed and polysemy causes wrong answers to be extracted. To address these problems, this paper proposes a method for computing the similarity between a question and a candidate answer sentence based on Latent Semantic Analysis (LSA). The method represents questions and sentences with the vector space model (VSM), applies LSA to statistically analyze a large corpus of question-answer sentences, and constructs a latent word-sentence semantic space, which removes the correlation between words. Question-sentence similarity is then computed in this semantic space, which effectively handles synonymy and polysemy. Finally, combining the question type with the similarity score, answer-sentence extraction experiments were carried out on simple Chinese factoid questions. The LSA-based method reached an MRR of 0.47, clearly better than the plain vector space model, which indicates that the method is effective.
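The pipeline described in the abstract (word-sentence matrix, reduced latent space, similarity computed in that space, MRR as the metric) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes sentences are already word-segmented, uses raw term counts rather than any particular weighting, omits the question-type step, and all function names and the latent dimension `k` are hypothetical choices made for illustration.

```python
import numpy as np

def build_term_sentence_matrix(sentences, vocab):
    """Rows are words, columns are sentences; entries are raw counts."""
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, tokens in enumerate(sentences):
        for t in tokens:
            if t in index:
                A[index[t], j] += 1.0
    return A, index

def lsa_fit(A, k):
    """Truncated SVD, A ~= U_k diag(s_k) V_k^T. Assumes k <= rank(A)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of V_k are the sentences expressed in the latent space.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(tokens, index, U_k, s_k):
    """Project a bag-of-words vector q into the latent space: q^T U_k S_k^{-1}."""
    q = np.zeros(U_k.shape[0])
    for t in tokens:
        if t in index:
            q[index[t]] += 1.0
    return (q @ U_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_sentences(question, sentences, vocab, k=2):
    """Return sentence indices sorted by latent-space similarity, plus the scores."""
    A, index = build_term_sentence_matrix(sentences, vocab)
    U_k, s_k, V_k = lsa_fit(A, k)
    q_hat = fold_in(question, index, U_k, s_k)
    scores = [cosine(q_hat, V_k[j]) for j in range(len(sentences))]
    order = sorted(range(len(sentences)), key=lambda j: -scores[j])
    return order, scores

def mean_reciprocal_rank(first_correct_ranks):
    """MRR over questions; ranks are 1-based, None means no correct answer returned."""
    total = sum(0.0 if r is None else 1.0 / r for r in first_correct_ranks)
    return total / len(first_correct_ranks)
```

In a toy corpus where one sentence contains "capital" and another contains "city" in otherwise identical contexts, a question consisting only of "capital" has zero word overlap with the "city" sentence beyond what the context supplies, yet the two sentences lie close together in the latent space. That is the kind of synonymy effect the abstract refers to, shown here only on invented data.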
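Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2006, No. 10, pp. 1889-1893 (5 pages).
Funding: Supported by the Doctoral Program Foundation of the Ministry of Education (20050007023), the National Natural Science Foundation of China (60663004), and the Yunnan Province Information Technology Fund (2002IT03).
Keywords: question-answering system; answer extraction; similarity; Vector Space Model (VSM); Latent Semantic Analysis (LSA)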
