A Text Spectral Clustering Algorithm Based on Maximum Common Subgraph
Original title: 一种基于最大公共子图的文本谱聚类算法
Cited by: 2
Abstract: Traditional text spectral clustering methods based on the vector space model tend to ignore the semantic relations between the contexts within a text. Representing texts as graph structures addresses this problem. On this basis, the paper proposes SC-MCS, a spectral clustering algorithm based on the maximum common subgraph. The algorithm computes the similarity between texts by solving for their maximum common subgraph and then clusters the texts. Experimental results show that, compared with the traditional vector-space-based text spectral clustering method, the algorithm achieves some improvement in both accuracy and recall.
Authors: FENG Renqunshan (冯仁群山); CHEN Xiaorong (陈笑蓉) (College of Computer Science and Technology, Guizhou University, Guiyang 550025, China)
Source: Journal of Guizhou University (Natural Sciences) (《贵州大学学报(自然科学版)》), 2018, No. 2, pp. 82-87 (6 pages)
Funding: National Natural Science Foundation of China (61363028)
Keywords: text clustering; spectral clustering; maximum common subgraph
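
The abstract gives only the outline of SC-MCS: represent each text as a graph, score pairs of texts by their maximum common subgraph, then cluster spectrally. The following is a minimal Python sketch of that outline, not the authors' implementation. Several choices are assumptions that the abstract does not confirm: the word co-occurrence graph built by `build_graph`, the similarity formula in `mcs_similarity` (common-subgraph size divided by the size of the larger graph), and the two-way split in `spectral_bipartition`, which stands in for a full k-cluster spectral method. All function names are hypothetical.

```python
import math


def build_graph(text, window=1):
    """Word co-occurrence graph (assumed construction): nodes are words,
    edges join each word to the next `window` words that follow it."""
    tokens = text.lower().split()
    nodes = set(tokens)
    edges = set()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != word:
                edges.add(frozenset((word, other)))
    return nodes, edges


def mcs_similarity(g1, g2):
    """Common-subgraph size over the size of the larger graph (assumed formula).
    Because every node carries a unique word label, the maximum common
    subgraph is simply the shared nodes plus the shared edges."""
    (n1, e1), (n2, e2) = g1, g2
    common = len(n1 & n2) + len(e1 & e2)
    larger = max(len(n1) + len(e1), len(n2) + len(e2))
    return common / larger if larger else 0.0


def spectral_bipartition(W, iters=500):
    """Two-way split by the sign of the second eigenvector of the normalized
    affinity D^-1/2 W D^-1/2, found by power iteration with deflation."""
    n = len(W)
    d = [sum(row) for row in W]
    # M = I + D^-1/2 W D^-1/2; the shift keeps all eigenvalues non-negative.
    M = [[(1.0 if i == j else 0.0) + W[i][j] / math.sqrt(d[i] * d[j])
          for j in range(n)] for i in range(n)]
    norm_d = math.sqrt(sum(d))
    top = [math.sqrt(v) / norm_d for v in d]  # known leading eigenvector
    x = [float(i + 1) for i in range(n)]      # deterministic start vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, top))
        x = [a - dot * b for a, b in zip(x, top)]  # remove leading component
        x = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]
    return [1 if a > 0 else 0 for a in x]


# Toy corpus, invented for illustration only.
docs = [
    "graph clustering uses spectral methods",
    "spectral methods help graph clustering",
    "the cat sat on the mat",
    "the cat slept on the mat",
]
graphs = [build_graph(doc) for doc in docs]
W = [[mcs_similarity(a, b) for b in graphs] for a in graphs]
labels = spectral_bipartition(W)
print(labels)
```

Unlike a bag-of-words vector, the edge set records which words occur next to each other, so two texts that share vocabulary but not word order score lower than two that share both. For more than two clusters, the similarity matrix `W` could instead be passed to a library routine such as scikit-learn's `SpectralClustering(affinity="precomputed")`.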