Abstract
DBSCAN is a density-based clustering algorithm. To address its weakness in handling mixed-attribute data, this paper proposes M-DBSCAN, a new clustering algorithm for mixed-attribute data built on the idea of dimension-oriented distance. Each type of data is given its own similarity measure and its own similarity threshold, which reduces dependence on a single global similarity threshold. Simulations show that the new algorithm effectively overcomes DBSCAN's inability to handle mixed-attribute data and clusters such data well.
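The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: a DBSCAN-style procedure whose neighborhood test applies a separate measure and threshold to each attribute type. The choice of Euclidean distance for numeric dimensions, simple matching for categorical dimensions, and every name here (`is_neighbor`, `mixed_dbscan`, `eps_num`, `eps_cat`, `min_pts`) are assumptions made for illustration, not the paper's actual M-DBSCAN definitions.

```python
from math import sqrt

NOISE = -1


def is_neighbor(p, q, num_idx, cat_idx, eps_num, eps_cat):
    """Return True if q passes BOTH per-type thresholds relative to p.

    Assumed rule (not taken from the paper): numeric and categorical
    dimensions are measured separately, each with its own threshold.
    """
    # Numeric dimensions: Euclidean distance (an assumed choice).
    d_num = sqrt(sum((p[i] - q[i]) ** 2 for i in num_idx))
    # Categorical dimensions: fraction of mismatching values (simple matching).
    d_cat = sum(p[i] != q[i] for i in cat_idx) / len(cat_idx) if cat_idx else 0.0
    return d_num <= eps_num and d_cat <= eps_cat


def mixed_dbscan(points, num_idx, cat_idx, eps_num, eps_cat, min_pts):
    """Standard DBSCAN expansion driven by the per-type neighborhood test."""

    def region(i):
        # Brute-force neighborhood query; includes the point itself.
        return [j for j in range(len(points))
                if is_neighbor(points[i], points[j],
                               num_idx, cat_idx, eps_num, eps_cat)]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels
```

Under these assumptions, two records that are numerically close but differ in a categorical attribute are not treated as neighbors when `eps_cat` is 0, which is the kind of behavior a single global threshold over a blended distance cannot express directly.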
Source
Journal of Zhejiang University of Technology (《浙江工业大学学报》)
CAS
PKU Core Journal (北大核心)
2009, No. 4, pp. 445-448 (4 pages)
Keywords
data mining
clustering
mixed attributes
density