A Hierarchical Clustering Algorithm Based on Random Sampling and Limited Depth

Abstract: Clustering is a key problem in data mining. Borrowing BIRCH's approach of building a clustering feature tree to generate initial cluster centers, this paper proposes a limited-depth hierarchical clustering algorithm based on random sampling (the RSLDCH algorithm). The algorithm randomly samples the data, limits the depth of the clustering feature tree, and links the leaf nodes into a list, which improves both time efficiency and clustering quality. Experiments show that RSLDCH improves somewhat on BIRCH in running speed and clustering quality.

Authors: Zheng Xiaoming (郑晓鸣), Lü Shiying (吕士颖), Wang Xiaodong (王晓东)

Affiliation: College of Mathematics and Computer Science, Fuzhou University

Source: Journal of Zhengzhou University: Natural Science Edition (郑州大学学报（理学版）), CAS, 2007, No. 3, pp. 80-83 (4 pages)

Keywords: hierarchical clustering; BIRCH algorithm; clustering feature (CF); clustering feature tree (CFT); RSLDCH algorithm

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
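
The abstract names three ingredients: random sampling, BIRCH-style clustering features, and a list of leaf entries that yields the initial cluster centers. The following is only a minimal sketch of how those pieces might fit together, not the RSLDCH algorithm itself, whose details are not given in this record. It assumes the standard BIRCH clustering feature triple (point count, linear sum, sum of squared norms), replaces the tree with a single flat level of leaf entries (so the depth limit is not modeled), and uses a simplified absorb test based on distance to the entry's centroid. All names (`CF`, `build_leaf_entries`, `initial_centers`, `threshold`) are hypothetical.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class CF:
    """Clustering feature: point count, linear sum, sum of squared norms."""
    n: int = 0
    ls: list = field(default_factory=list)
    ss: float = 0.0

    def add(self, p):
        """Absorb a single point into this clustering feature."""
        if self.n == 0:
            self.ls = [0.0] * len(p)
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, p)]
        self.ss += sum(x * x for x in p)

    def merge(self, other):
        """CFs are additive: merging two entries just sums their fields."""
        if self.n == 0:
            self.ls = [0.0] * len(other.ls)
        self.n += other.n
        self.ls = [a + b for a, b in zip(self.ls, other.ls)]
        self.ss += other.ss

    def centroid(self):
        return [a / self.n for a in self.ls]


def build_leaf_entries(points, threshold):
    """Assumption: a flat list stands in for the leaf-node linked list."""
    leaves = []
    for p in points:
        # Find the leaf entry whose centroid is closest to the point.
        best = min(leaves, key=lambda cf: math.dist(cf.centroid(), p),
                   default=None)
        # Simplified absorb test (BIRCH proper checks radius or diameter).
        if best is not None and math.dist(best.centroid(), p) <= threshold:
            best.add(p)
        else:
            cf = CF()
            cf.add(p)
            leaves.append(cf)
    return leaves


def initial_centers(points, sample_size, threshold, seed=0):
    """Draw a random sample, summarize it as CFs, return their centroids."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    return [cf.centroid() for cf in build_leaf_entries(sample, threshold)]
```

For example, `initial_centers(points, sample_size=100, threshold=1.0)` would return one candidate center per leaf entry built from at most 100 sampled points. Working on a sample keeps the cost of building the entries independent of the full data size, which is presumably where the time savings reported in the abstract come from.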
