Abstract
To improve keyphrase extraction from short texts, this paper proposes TTKE (topical translation for keyphrase extraction), a topical translation model that combines the strengths of topic models and statistical machine translation models. Topic-related long texts are used to assist topic discovery for short texts, and the model learns topic-specific alignment probabilities between words and keyphrases, which are then used to extract keyphrases for a given short text. Experiments on a real-world dataset show that the model effectively improves keyphrase extraction for short texts.
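The abstract does not give the scoring formula, so the following is only a minimal sketch of how topic-specific alignment probabilities could be combined to rank keyphrases. It assumes a candidate keyphrase k is scored as the sum over topics z and words w of P(z|d) * P(k|w,z) * P(w|d). The function name `rank_keyphrases`, the dictionary layout, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def rank_keyphrases(words, topic_dist, align_prob, top_n=3):
    """Rank candidate keyphrases for one short text.

    Assumed scoring rule (not confirmed by the source):
        score(k) = sum_z P(z|d) * sum_w P(k|w,z) * P(w|d)

    words      -- tokens of the short text d
    topic_dist -- {topic: P(topic | d)}
    align_prob -- {topic: {word: {keyphrase: P(keyphrase | word, topic)}}}
    """
    counts = Counter(words)
    total = sum(counts.values())
    scores = {}
    for topic, p_topic in topic_dist.items():
        word_table = align_prob.get(topic, {})
        for word, count in counts.items():
            p_word = count / total  # relative frequency of the word in d
            for phrase, p_align in word_table.get(word, {}).items():
                scores[phrase] = scores.get(phrase, 0.0) + p_topic * p_align * p_word
    # Highest-scoring keyphrases first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy, made-up probabilities for illustration only
topic_dist = {"sports": 0.8, "finance": 0.2}
align_prob = {
    "sports": {
        "match": {"football": 0.6, "league": 0.4},
        "goal": {"football": 0.9},
    },
    "finance": {
        "goal": {"target price": 0.7},
    },
}
ranked = rank_keyphrases(["match", "goal", "goal"], topic_dist, align_prob)
print(ranked)
```

In this toy case the word "goal" aligns to different keyphrases depending on the topic, which is the kind of ambiguity a topic-specific alignment table is meant to resolve. How the topic distribution is inferred with the help of long texts, and how the alignment table is learned, are outside the scope of this sketch.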
Authors
WANG Rui (王瑞)1, QIN Yong-bin (秦永彬)1,2, ZHANG Li (张丽)1, YAN Ying-ying (闫盈盈)1
1. College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
2. Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2018, Issue 6, pp. 1633-1638 (6 pages)
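Funding
National Natural Science Foundation of China (61540050)
Guizhou Province Major Applied Basic Research Fund (黔科合JZ字[2014]2001)
Guizhou Province Science and Technology Major Special Program Fund (黔科合重大专项字[2017]3002)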
Keywords
keyphrase extraction
short text
long text
topical translation model
topic discovery
alignment probability