Research on an Algorithm for Automatically Learning the Number of Clusters (cited by: 6)

Algorithm for automatically learning class number of clustering
Abstract: Many clustering algorithms require the number of clusters to be specified in advance. Without knowledge of the internal structure of the raw data, it is difficult to choose a suitable number. Based on a study of existing clustering algorithms, this paper proposes an algorithm that learns the number of clusters automatically. The algorithm analyzes the intrinsic distribution of the raw data and applies a statistical distribution test to judge whether splitting a group of data is justified, and in this way arrives at a number of clusters that adequately fits the raw data. Experiments show that the algorithm is feasible and effective.
Author: Wang Yan (王燕)
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, PKU Core Journal, 2007, No. 2, pp. 252-253, 256 (3 pages)
Keywords: clustering; number of clusters; automatic learning; hypothesis testing; statistical distribution
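
The abstract does not say which statistical test or splitting procedure the paper uses, so the following is only a rough sketch of the general idea it describes: tentatively split a cluster in two, test whether the data along the split direction look like a single distribution, and keep the split only if the test rejects that hypothesis. Everything here is an assumption made for illustration: the function names (`two_means`, `jarque_bera`, `learn_cluster_count`), the choice of the Jarque-Bera normality test, the critical value 9.21 (the 1% point of a chi-square distribution with 2 degrees of freedom), and the `min_size` guard. It is not the author's algorithm.

```python
import random

def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, rng, iters=20):
    """Plain 2-means (Lloyd iterations) on one cluster; returns two point lists."""
    c0, c1 = rng.sample(points, 2)
    left, right = points, []
    for _ in range(iters):
        left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not left or not right:
            break  # degenerate split; the caller rejects it via min_size
        c0, c1 = mean_point(left), mean_point(right)
    return left, right

def jarque_bera(values):
    """Jarque-Bera statistic; large values suggest the sample is not normal."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    if m2 == 0.0:
        return 0.0  # constant sample: nothing to split
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

def split_is_justified(points, left, right, critical):
    """Project onto the line joining the child centroids and test normality."""
    c0, c1 = mean_point(left), mean_point(right)
    axis = tuple(b - a for a, b in zip(c0, c1))
    projected = [sum(x * w for x, w in zip(p, axis)) for p in points]
    return jarque_bera(projected) > critical

def learn_cluster_count(points, min_size=20, critical=9.21, seed=0):
    """Split clusters while the test rejects 'one normal cluster' (assumed rule)."""
    rng = random.Random(seed)
    pending, accepted = [list(points)], []
    while pending:
        cluster = pending.pop()
        if len(cluster) < 2 * min_size:
            accepted.append(cluster)  # too small to split further
            continue
        left, right = two_means(cluster, rng)
        if (min(len(left), len(right)) >= min_size
                and split_is_justified(cluster, left, right, critical)):
            pending.extend([left, right])
        else:
            accepted.append(cluster)
    return accepted

if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 1),) for _ in range(200)]
    data += [(rng.gauss(10, 1),) for _ in range(200)]
    print(len(learn_cluster_count(data)))  # estimated number of clusters
```

The number of accepted clusters is the estimate of the cluster count. A stricter critical value makes the procedure more reluctant to split, and the Jarque-Bera test is only reliable for reasonably large samples, which is why the sketch refuses to test very small clusters.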

