Short-Text Clustering Algorithm Based on Extension of Metaphorical Words (基于隐喻词扩展的短文本聚类算法). Cited by: 1
Abstract: Short texts contain few words and express the same idea in many different forms, which makes clustering texts of the same category ineffective. To address this problem, we propose a method that extends the metaphorical words in short texts by using the rich relationships between words in Chinese Wikipedia, compensating for the limited information and varied lexical expression of short texts. Experimental results show that the algorithm effectively improves the clustering of short texts.
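Authors: WANG Ye (王烨); ZUO Wanli (左万利); WANG Ying (王英) (Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China)
Source: Journal of Jilin University: Science Edition (《吉林大学学报(理学版)》), 2018, No. 6, pp. 1447-1452 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (Grant No. 60973040); National Natural Science Foundation of China, Young Scientists Fund (Grant No. 61602057).
Keywords: text clustering; short text; Wikipedia; text extension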
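
The expansion idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `RELATED` table is a hypothetical, hand-made stand-in for word relationships mined from Chinese Wikipedia, and the `expand` and `cosine` helpers are illustrative names. The sketch only shows why adding related words can raise the similarity of two short texts that share no surface vocabulary, which is the property a clustering step would then exploit.

```python
from collections import Counter
from math import sqrt

# Hypothetical relation table standing in for word relationships that
# would be mined from Chinese Wikipedia; the entries are invented.
RELATED = {
    "iphone": ["apple", "smartphone"],
    "handset": ["smartphone", "phone"],
}

def expand(tokens, related=RELATED):
    """Return the original tokens followed by the related words of each token."""
    out = list(tokens)
    for t in tokens:
        out.extend(related.get(t, []))
    return out

def cosine(a, b):
    """Cosine similarity between two token lists treated as bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

# Two short texts on the same topic with no words in common.
doc1 = ["new", "iphone", "released"]
doc2 = ["handset", "launch", "today"]

before = cosine(doc1, doc2)                  # no shared words
after = cosine(expand(doc1), expand(doc2))   # both now contain "smartphone"
print(before, after)
```

In this toy case the unexpanded texts have zero similarity, while the expanded ones overlap on the added word. How the paper identifies metaphorical words, weights the added terms, and performs the clustering itself is not stated in this record, so those steps are left out.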
