
Research on density clustering algorithm of mixed attributes based on dimension distance
Cited by: 3
Abstract: DBSCAN is a density-based clustering algorithm. To address its shortcomings in handling mixed-attribute data, this paper proposes M-DBSCAN, a new clustering algorithm for mixed-attribute data built on the idea of dimension-oriented distance. Each type of data is given its own similarity measure and its own similarity threshold, which reduces the dependence on a single global similarity threshold. Simulations show that the new algorithm effectively overcomes DBSCAN's inability to handle mixed-attribute data and clusters such data well.
Source: Journal of Zhejiang University of Technology (《浙江工业大学学报》), indexed in CAS and the Peking University Core Journals list, 2009, No. 4, pp. 445-448 (4 pages)
Keywords: data mining; clustering; mixed attributes; density
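
The abstract does not give the similarity measures or thresholds that M-DBSCAN actually uses, so the following is only a minimal sketch of the general idea it describes: a DBSCAN-style procedure in which numeric and categorical dimensions each get their own distance and their own threshold. The neighbor rule (Euclidean distance on numeric dimensions, mismatch fraction on categorical dimensions, both thresholds must hold) and all names (`mixed_dbscan`, `eps_num`, `eps_cat`, `min_pts`) are assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import deque

NOISE = -1


def is_neighbor(a, b, num_idx, cat_idx, eps_num, eps_cat):
    """Assumed rule: a and b are neighbors only if they are close on BOTH attribute types."""
    # Numeric dimensions: plain Euclidean distance (an assumption).
    num_dist = math.sqrt(sum((a[i] - b[i]) ** 2 for i in num_idx))
    # Categorical dimensions: fraction of mismatching values (an assumption).
    cat_dist = sum(a[i] != b[i] for i in cat_idx) / max(len(cat_idx), 1)
    return num_dist <= eps_num and cat_dist <= eps_cat


def mixed_dbscan(points, num_idx, cat_idx, eps_num, eps_cat, min_pts):
    """DBSCAN-style clustering with one threshold per attribute type."""
    labels = [None] * len(points)  # None means "not visited yet"

    def region(p):
        # Brute-force neighborhood query; a point is always its own neighbor.
        return [q for q in range(len(points))
                if is_neighbor(points[p], points[q],
                               num_idx, cat_idx, eps_num, eps_cat)]

    cluster = 0
    for p in range(len(points)):
        if labels[p] is not None:
            continue
        neighbors = region(p)
        if len(neighbors) < min_pts:
            labels[p] = NOISE  # may later be relabeled as a border point
            continue
        labels[p] = cluster
        queue = deque(neighbors)
        while queue:
            q = queue.popleft()
            if labels[q] == NOISE:
                labels[q] = cluster  # border point: joins, but is not expanded
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = region(q)
            if len(q_neighbors) >= min_pts:  # q is a core point: keep growing
                queue.extend(q_neighbors)
        cluster += 1
    return labels


# Hypothetical toy data: two numeric dimensions plus one categorical dimension.
points = [
    (1.0, 1.0, "a"), (1.1, 1.0, "a"), (0.9, 1.1, "a"),
    (1.0, 1.05, "b"), (1.05, 0.95, "b"), (0.95, 1.0, "b"),
    (9.0, 9.0, "a"),
]
labels = mixed_dbscan(points, num_idx=(0, 1), cat_idx=(2,),
                      eps_num=0.5, eps_cat=0.0, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy data the first six points overlap on the numeric dimensions, so a purely numeric distance would place them in one cluster. The separate categorical threshold splits them into two clusters, and the last point is labeled as noise.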
