Abstract
Short texts contain few words and express the same content in many different ways, which makes clustering methods ineffective at grouping texts of the same category. To address this problem, we propose a method that expands the metaphorical words of short texts using the rich inter-word relationships in Chinese Wikipedia, compensating for the limited information and varied lexical expression of short texts. Experimental results show that the algorithm effectively improves the clustering of short texts.
Authors
WANG Ye; ZUO Wanli; WANG Ying (Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China)
Source
Journal of Jilin University: Science Edition
CAS
CSCD
Peking University Core Journals
2018, No. 6, pp. 1447-1452 (6 pages)
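Funding
National Natural Science Foundation of China (Grant No. 60973040)
Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 61602057)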