Application of a Spectral Clustering Algorithm Based on Graph Partitioning in Text Mining (cited by: 6)
Abstract: Traditional text mining algorithms assume a convex, spherical sample space; when the sample space is not convex, they become trapped in a "local" optimum. To reach a "global" optimum, an undirected graph is introduced to represent the similarity relations between documents, and an adjacency matrix between documents is built from this graph. The spectral clustering algorithm analyzes the adjacency matrix to derive new features for the objects being clustered, and then uses these new features to cluster the original data. The algorithm is compared experimentally with other text mining algorithms, and the results show that it clusters better than traditional data mining methods. Finally, the shortcomings of spectral clustering and directions for further research are pointed out.
Source: Computer Technology and Development (《计算机技术与发展》), 2009, No. 5, pp. 96-98 (3 pages)
Funding: Natural Science Foundation of Henan Province (0311011700)
Keywords: spectral clustering; adjacency matrix; text mining; normalized cut; Laplacian matrix
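
The pipeline summarized in the abstract (document graph, adjacency matrix, spectral features, clustering on those features) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the abstract does not state the similarity measure, the Laplacian variant, or the final clustering step, so cosine similarity, the symmetric normalized Laplacian (suggested only by the "normalized cut" keyword), and a small k-means routine are all assumptions here. The function names and the toy term-count matrix are hypothetical.

```python
import numpy as np

def cosine_adjacency(X):
    """Adjacency matrix of the undirected document graph.
    Assumption: edge weight = cosine similarity of term vectors, no self-loops."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    W = np.clip(Xn @ Xn.T, 0.0, None)
    np.fill_diagonal(W, 0.0)
    return W

def spectral_embedding(W, k):
    """New features: eigenvectors of the k smallest eigenvalues of the
    symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}, row-normalized."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.where(d > 0, d, 1.0))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    U = eigvecs[:, :k]
    row_norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.where(row_norms == 0, 1.0, row_norms)

def kmeans(U, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def spectral_cluster(X, k):
    """Cluster the rows of X (documents) using the spectral features."""
    return kmeans(spectral_embedding(cosine_adjacency(X), k), k)

# Hypothetical toy corpus: rows are documents, columns are term counts.
X = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 0, 1, 0, 0],
    [3, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 1, 2, 3],
], dtype=float)
labels = spectral_cluster(X, 2)
print(labels)
```

On this toy matrix the first three and the last three documents share almost no terms, so they should end up in different clusters. In practice the rows would more likely be TF-IDF vectors, and the number of clusters `k` has to be chosen separately.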