Topical translation model for short text keyphrase extraction (cited by: 1)
Abstract: To further improve keyphrase extraction for short texts, a topical translation model for keyphrase extraction (TTKE) is proposed. The model combines the strengths of topic models and statistical machine translation models: topic-related long texts assist topic discovery in short texts, and the model learns topic-specific alignment probabilities between words and keyphrases, which are then used to extract keyphrases for a given short text. Experiments on a real-world dataset show that the proposed model effectively improves keyphrase extraction for short texts.
Authors: WANG Rui (1), QIN Yong-bin (1, 2), ZHANG Li (1), YAN Ying-ying (1). Affiliations: 1. College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; 2. Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
Source: Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2018, No. 6, pp. 1633-1638 (6 pages)
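Funding: National Natural Science Foundation of China (61540050); Guizhou Province Major Applied Basic Research Fund (黔科合JZ字[2014]2001); Guizhou Province Science and Technology Major Special Program Fund (黔科合重大专项字[2017]3002)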
Keywords: keyphrase extraction; short text; long text; topical translation model; topic discovery; alignment probability
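
The scoring idea summarized in the abstract (weighting word-to-keyphrase alignment probabilities by the topics of a short text) can be illustrated with a minimal sketch. This is not the TTKE model itself: the abstract gives no formulas, so the scoring rule score(k) = Σ_z P(z|d) · Σ_w P(w|d) · P(k|w,z), the function name `rank_keyphrases`, and all probability values below are assumptions made purely for illustration. In the paper, the topic distribution would come from long-text-assisted topic discovery and the alignment table from training; here both are supplied by hand.

```python
from collections import Counter

def rank_keyphrases(words, topic_dist, align_prob, top_k=3):
    """Rank candidate keyphrases for one short text.

    Assumed (hypothetical) scoring rule:
        score(k) = sum_z P(z|d) * sum_w P(w|d) * P(k|w,z)
    words      : tokens of the short text
    topic_dist : {topic: P(topic | text)}, assumed given
    align_prob : {topic: {word: {keyphrase: P(keyphrase | word, topic)}}}
    """
    counts = Counter(words)
    total = sum(counts.values())
    scores = {}
    for topic, p_topic in topic_dist.items():
        table = align_prob.get(topic, {})
        for word, count in counts.items():
            p_word = count / total  # empirical P(w | d)
            for phrase, p_align in table.get(word, {}).items():
                scores[phrase] = scores.get(phrase, 0.0) + p_topic * p_word * p_align
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

# Toy, hand-written probabilities (not taken from the paper).
ALIGN = {
    "sports": {
        "match": {"football": 0.6, "score": 0.4},
        "goal": {"football": 0.9, "target": 0.1},
    },
    "business": {
        "match": {"deal": 0.7, "score": 0.3},
        "goal": {"target": 0.8, "football": 0.2},
    },
}
TOPICS = {"sports": 0.8, "business": 0.2}

ranked = rank_keyphrases(["match", "goal"], TOPICS, ALIGN)
for phrase, score in ranked:
    print(f"{phrase}: {score:.2f}")
```

Because the text is mostly about sports under the assumed topic distribution, the ambiguous words "match" and "goal" are translated mainly into the sports keyphrase, which is the effect that conditioning alignment on topics is meant to achieve.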
