
Research on an Incremental Clustering Algorithm Based on the Core Tree
Abstract: Traditional clustering methods generally do not account for very large data sets, yet efficiently extracting knowledge from massive data is one of the central research problems in data mining. Combining the partition-based K-means center-point algorithm with the hierarchy-based BIRCH incremental algorithm, this paper proposes the core tree (Core-Tree) to offset the weaknesses of both: center points are used to represent the summary information in BIRCH, and class cores are used to make determining the center points more efficient. The resulting clustering algorithm focuses on improving clustering efficiency for large data sets and on handling data sets with varied characteristics.
Authors: 丁一 (Ding Yi), 付弦 (Fu Xian)
Source: 《湖北师范学院学报(自然科学版)》 Journal of Hubei Normal University (Natural Science), 2011, No. 2, pp. 18-23 (6 pages)
Keywords: incremental clustering; core tree; center point; clustering feature
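
The abstract describes the core tree only at a high level, so the following is a minimal Python sketch of one ingredient it names: a BIRCH-style clustering feature (point count, linear sum, squared sum) whose center point can be recovered from the summary alone and updated incrementally. The class `ClusteringFeature`, the helper `insert_incrementally`, and the distance `threshold` are hypothetical names chosen for illustration. The flat list of clusters stands in for the tree structure and is an assumption, not the authors' Core-Tree.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ClusteringFeature:
    """BIRCH-style cluster summary: point count, linear sum, squared sum."""
    n: int = 0
    ls: list = field(default_factory=list)  # per-dimension linear sum
    ss: float = 0.0                         # sum of squared norms

    def add(self, point):
        """Absorb one point by updating the summary; the point is not stored."""
        if not self.ls:
            self.ls = [0.0] * len(point)
        self.n += 1
        self.ls = [s + x for s, x in zip(self.ls, point)]
        self.ss += sum(x * x for x in point)

    def centroid(self):
        """Center point recovered from the summary alone (LS / N)."""
        return [s / self.n for s in self.ls]


def insert_incrementally(clusters, point, threshold):
    """Hypothetical helper (not from the paper): assign the point to the
    nearest centroid within `threshold`, otherwise open a new cluster.
    Returns the index of the cluster that received the point."""
    best_index, best_dist = None, math.inf
    for i, cf in enumerate(clusters):
        d = math.dist(cf.centroid(), point)
        if d < best_dist:
            best_index, best_dist = i, d
    if best_index is None or best_dist > threshold:
        clusters.append(ClusteringFeature())
        best_index = len(clusters) - 1
    clusters[best_index].add(point)
    return best_index


# Points arrive one at a time; each is summarized and then discarded.
clusters = []
for p in [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.0, 5.4), (0.1, 0.3)]:
    insert_incrementally(clusters, p, threshold=1.0)
```

Because each summary is additive, a new point costs one update per dimension and no stored point is revisited, which is the property that makes this kind of summary suitable for large or growing data sets.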
