
An Incremental Algorithm of Text Clustering Based on Semantic Sequences (cited by: 1)

Abstract: This paper proposes an incremental text clustering algorithm based on semantic sequences. Using the similarity relation between semantic sequences, the algorithm calculates the cover of each set of similar semantic sequences and, at every step, selects the candidate cluster with the minimum entropy overlap value as a result cluster. A comparison of experimental results shows that the precision of the algorithm is higher than that of other algorithms under the same conditions, and that the advantage is especially clear on sets of long documents.
Source: Wuhan University Journal of Natural Sciences (武汉大学学报(自然科学英文版)), CAS, 2006, Issue 5, pp. 1340-1344 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (60173058)
Keywords: text clustering; semantic sequence; entropy
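
The selection step mentioned in the abstract (repeatedly taking the candidate cluster with the minimum entropy overlap) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the entropy overlap definition used in frequent term-based clustering, where a document covered by f candidate clusters contributes -(1/f) ln(1/f); the record does not state which definition the paper uses. The function names, the candidate labels, and the representation of each candidate as a plain set of document ids are all hypothetical. Semantic sequence extraction, the similarity relation, and the incremental update are not modeled.

```python
import math
from collections import Counter


def entropy_overlap(cover, doc_freq):
    """Assumed definition: sum over covered documents of -(1/f) * ln(1/f),
    where f is the number of remaining candidates that contain the document."""
    return sum(-(1.0 / doc_freq[d]) * math.log(1.0 / doc_freq[d]) for d in cover)


def greedy_min_overlap(candidates):
    """Greedy selection over hypothetical candidates.

    candidates: dict mapping a candidate label to the set of document ids
    it covers. Returns a list of (label, documents) pairs in selection order.
    """
    # Work on copies and ignore candidates that cover nothing.
    remaining = {label: set(cover) for label, cover in candidates.items() if cover}
    clusters = []
    while remaining:
        # Count how many remaining candidates contain each document.
        doc_freq = Counter(d for cover in remaining.values() for d in cover)
        # Pick the candidate with the smallest overlap; ties go to the
        # alphabetically first label so the result is deterministic.
        best = min(sorted(remaining),
                   key=lambda label: entropy_overlap(remaining[label], doc_freq))
        chosen = remaining.pop(best)
        clusters.append((best, chosen))
        # Remove the chosen documents from the other candidates and
        # drop any candidate that becomes empty.
        remaining = {label: cover - chosen
                     for label, cover in remaining.items()
                     if cover - chosen}
    return clusters


if __name__ == "__main__":
    demo = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}
    for label, docs in greedy_min_overlap(demo):
        print(label, sorted(docs))
```

In this sketch a candidate that shares no documents with any other candidate has an overlap of zero and is selected first, and every document ends up in exactly one result cluster.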

