Textual Entailment Recognition Based on Mixed Topic Model

Cited by: 2

Abstract: This paper analyzes the mainstream methods for recognizing textual entailment and, based on the conjecture that the text T and the hypothesis H can both be generated from latent mixed topics, proposes a mixed topic model for recognizing textual entailment, together with a probabilistic model for generating text under that mixed topic model. The model treats the text T and the hypothesis H as different expressions of the same semantics, represented as multi-mode data: if T entails H, the two have similar topic distributions and share a mixed vocabulary and set of topics. Comparative experiments between the mixLDA and LDA models are designed and evaluated on the RTE-8 task, in which a Support Vector Machine (SVM) classifies the sentence similarity produced by the model together with other lexical and syntactic features. The experimental results show that textual entailment recognition based on the mixed topic model achieves high accuracy.
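The similarity step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which measure compares the topic distributions of T and H, so the Jensen-Shannon based score below is an assumption, as are the function names and the example topic weights; none of them come from the paper or from real mixLDA output. The sketch only shows how a topic-similarity score could be placed alongside other features before classification.

```python
import math


def normalize(weights):
    """Scale non-negative topic weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("topic weights must sum to a positive value")
    return [w / total for w in weights]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_similarity(theta_t, theta_h):
    """Return 1 - Jensen-Shannon divergence (base 2), a score in [0, 1].

    Assumption: this measure is chosen for illustration only; the
    source does not name the similarity it uses.
    """
    p, q = normalize(theta_t), normalize(theta_h)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return 1.0 - jsd


def feature_vector(theta_t, theta_h, other_features):
    """Prepend the topic similarity to hypothetical lexical/syntactic features."""
    return [js_similarity(theta_t, theta_h)] + list(other_features)


# Hypothetical topic distributions for a text T and a hypothesis H.
theta_t = [0.70, 0.20, 0.10]
theta_h = [0.60, 0.30, 0.10]
features = feature_vector(theta_t, theta_h, other_features=[0.8, 0.5])
print(features)
```

In a pipeline like the one the abstract outlines, vectors of this shape would be the input to an SVM classifier that labels each T-H pair as entailment or non-entailment; the two extra feature values here are placeholders.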

Authors: Sheng Yaqi (盛雅琦), Zhang Han (张晗), Lü Chen (吕晨), Ji Donghong (姬东鸿)

Affiliation: School of Computer Science, Wuhan University

Source: Computer Engineering (《计算机工程》), 2015, No. 5, pp. 180-184 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.

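Funding: National Natural Science Foundation of China, General Program, "Research on Resource Construction and Statistical Analysis for Chinese Textual Inference" (61173062)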

Keywords: textual entailment; topic model; multi mode; mixed topic; latent semantic; Support Vector Machine (SVM)

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]

