A Hybrid Clustering Algorithm and Its Application (cited by: 2)
Abstract: Based on an analysis of the characteristics of grid-based and density-based clustering methods, a hybrid clustering algorithm based on grid and density is presented. By clustering in two phases and using only a small number of seed objects in representative units to expand each cluster, the algorithm reduces the number of region queries and, with it, the running time. An equivalence rule is proposed for converting smoothly between the clustering parameters of the two phases. The algorithm retains the strengths of both families of methods: like density-based clustering, it can discover clusters of arbitrary shape and is insensitive to noise; like grid-based clustering, it is efficient, which makes it suitable for mining large databases. Applying the hybrid algorithm to the analysis of accelerometer data demonstrates its effectiveness, and the results offer guidance for applying data mining to equipment condition monitoring and fault diagnosis.
Source: Journal of Sichuan University (Engineering Science Edition), 2006, No. 5, pp. 156-161 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (grant 50575153)
Keywords: data mining; clustering; seed object
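
The two-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the equivalence rule or the seed-selection criterion, so everything here is an assumption made for illustration. The function name `hybrid_cluster`, the parameters `cell`, `min_pts`, and `eps`, the choice of one seed per dense cell (the point nearest the cell's centroid), and the adjacency rule for merging dense cells are all hypothetical. Phase 1 bins points into grid cells and merges adjacent dense cells; phase 2 issues one region query per dense cell, around its seed, to absorb nearby points from sparse cells.

```python
from collections import defaultdict, deque
from itertools import product
from math import ceil, dist, floor

NOISE = -1


def hybrid_cluster(points, cell, min_pts, eps):
    """Illustrative grid + density clustering; returns one label per point."""
    if not points:
        return []
    dim = len(points[0])

    # Phase 1a: bin every point into a grid cell of side length `cell`.
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(floor(x / cell) for x in p)].append(i)

    # Phase 1b: a cell is "dense" if it holds at least `min_pts` points (assumed rule).
    dense = {c for c, members in cells.items() if len(members) >= min_pts}

    # Phase 1c: merge adjacent dense cells into clusters with a breadth-first search.
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]
    labels = [NOISE] * len(points)
    cell_label = {}
    next_id = 0
    for start in dense:
        if start in cell_label:
            continue
        cell_label[start] = next_id
        queue = deque([start])
        while queue:
            c = queue.popleft()
            for i in cells[c]:
                labels[i] = next_id
            for o in offsets:
                n = tuple(a + b for a, b in zip(c, o))
                if n in dense and n not in cell_label:
                    cell_label[n] = next_id
                    queue.append(n)
        next_id += 1

    # Phase 2: one region query per dense cell, centred on a single seed object,
    # instead of one query per point as in a plain density-based method.
    reach = ceil(eps / cell)  # how many cells an eps-ball can span per axis
    nearby = list(product(range(-reach, reach + 1), repeat=dim))
    for c in dense:
        members = cells[c]
        centroid = [sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)]
        # Assumed seed choice: the member closest to the cell's centroid.
        seed = min(members, key=lambda i: dist(points[i], centroid))
        for o in nearby:
            n = tuple(a + b for a, b in zip(c, o))
            if n in dense or n not in cells:
                continue  # only sparse, non-empty cells need a distance check
            for i in cells[n]:
                if labels[i] == NOISE and dist(points[i], points[seed]) <= eps:
                    labels[i] = cell_label[c]
    return labels


if __name__ == "__main__":
    sample = [(0.5, 0.5), (0.6, 0.5), (0.5, 0.6), (0.7, 0.6), (1.1, 0.5),
              (5.2, 5.2), (5.3, 5.4), (5.5, 5.5), (9.5, 0.5)]
    print(hybrid_cluster(sample, cell=1.0, min_pts=3, eps=0.6))
```

In this sketch the number of region queries equals the number of dense cells rather than the number of points, which is the kind of saving the abstract attributes to seed objects. Points left with the label `-1` are treated as noise.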
