
Research of Chinese Text Clustering Based on Traditional VSM and Term Co-occurrence (cited by: 2)
Abstract: This paper proposes a new method that represents document features using both the traditional vector space model (VSM) and term co-occurrence concepts, and applies it to Chinese text clustering with a flat partitional algorithm. Experiments show that clustering based on traditional VSM together with term co-occurrence concepts performs better than traditional VSM clustering based purely on keyword sets, and that the method has some practical value.
Source: Journal of Anhui Normal University (Natural Science) (《安徽师范大学学报(自然科学版)》), CAS, 2005, No. 1, pp. 27-30 (4 pages).
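Funding: National Natural Science Foundation of China (70171052); 皖泰开发基金 (Wantai Development Fund, 143-150401); Anhui Province Teaching Research Fund (JYXM2003167).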
Keywords: VSM; Chinese text; text clustering; document; vector space model; concept; traditional; keyword. Key words (as given in the paper): text clustering; VSM; term co-occurrence concept; apriori algorithm.
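
The record gives no implementation details, so the following is only a minimal sketch of the general idea in the abstract: extending a keyword-based VSM vector with features for co-occurring terms. Everything specific here is an assumption made for illustration rather than the authors' method: the function names, the toy pre-segmented documents, the `min_support` threshold, the binary weight for concept features, and the use of simple document-level pair counting as a stand-in for frequent 2-itemsets of the kind an apriori algorithm would find.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_concepts(docs, min_support=2):
    """Return term pairs found together in at least min_support documents.

    Assumption: a "co-occurrence concept" is modelled as a frequent term pair.
    """
    pair_counts = Counter()
    for tokens in docs:
        # Count each unordered pair once per document.
        for pair in combinations(sorted(set(tokens)), 2):
            pair_counts[pair] += 1
    return {pair for pair, n in pair_counts.items() if n >= min_support}


def represent(tokens, concepts):
    """Term-frequency vector plus one binary feature per concept present."""
    vec = Counter(tokens)
    present = set(tokens)
    for a, b in concepts:
        if a in present and b in present:
            vec[(a, b)] = 1  # hypothetical weighting: presence only
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy documents, already segmented into terms (hypothetical data).
docs = [
    ["text", "clustering", "vector"],
    ["text", "clustering", "model"],
    ["stock", "market", "price"],
]
concepts = cooccurrence_concepts(docs, min_support=2)
vectors = [represent(d, concepts) for d in docs]
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

In this toy setup the first two documents share the pair ("clustering", "text"), so the added concept feature raises their similarity above what keyword overlap alone would give. A partitional algorithm such as k-means could then be run over these extended vectors.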

References (5)

  • 1 Han Jiawei, Kamber Micheline. 数据挖掘概念与技术 (Data Mining: Concepts and Techniques) [M]. Beijing: China Machine Press, 2001.
  • 2 Salton G., McGill M. J. An Introduction to Modern Information Retrieval [M]. New York: McGraw-Hill, 1983.
  • 3 Steinbach M., Karypis G., Kumar V. A Comparison of Document Clustering Techniques [C]. In KDD Workshop on Text Mining, Boston, 2000.
  • 4 [EB/OL]. http://www.nlp.org.cn/docs/download.phpdoc_id=281, 2004-10-6.
  • 5 Hotho A., Staab S., Stumme G. Text Clustering Based on Background Knowledge [R]. University of Karlsruhe, Institute AIFB, 2003.
