A novel clustering algorithm based on relative density and decision graph (cited by: 8)
Abstract: Clustering is an important research direction in data mining. To address difficulties in clustering complex datasets, such as uneven densities among clusters, diverse cluster shapes, and the identification of cluster centers, a clustering algorithm based on relative density and a decision graph is proposed. It uses the k-nearest-neighbor information of each sample point to compute that point's relative density, and borrows the cluster-center identification method of the clustering by fast search and find of density peaks (CFSFDP) algorithm, so that cluster centers can be identified quickly and accurately and datasets of arbitrary distribution can be clustered effectively. Experimental results on seven typical test datasets show that the proposed algorithm has good applicability. Compared with the classical density-based spatial clustering of applications with noise (DBSCAN) algorithm and the CFSFDP algorithm, it achieves better clustering results without significantly increasing time complexity, and it adapts to a wider range of dataset types.
Authors: ZHOU Shi-bo; XU Wei-xiang (School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China; Navigation College, Jimei University, Xiamen 361021, China)
Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2018, No. 11, pp. 1921-1930 (10 pages)
Funding: National Natural Science Foundation of China (61672002, 61272029, 41501490); Natural Science Foundation of Fujian Province (2016J01243)
Keywords: clustering; relative density; decision graph; density peaks; k-nearest neighbors; data mining
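
The abstract names the ingredients of the method (a k-nearest-neighbor relative density plus the CFSFDP decision graph) but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. The function name `relative_density_peaks` is hypothetical. The local density (inverse mean distance to the k nearest neighbors) and the relative density (local density divided by the mean local density of those neighbors) are assumed definitions chosen for illustration. The number of centers is passed in by the caller instead of being read off a decision-graph plot, and the input is assumed to contain no duplicate points.

```python
import numpy as np


def relative_density_peaks(X, k=5, n_centers=2):
    """Sketch of decision-graph clustering with a kNN-based relative density.

    Assumed definitions (not taken from the paper):
      local density    = 1 / mean distance to the k nearest neighbors
      relative density = local density / mean local density of those neighbors
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances, O(n^2) memory; fine for small datasets.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Indices of the k nearest neighbors; column 0 is the point itself.
    knn_idx = np.argsort(dist, axis=1)[:, 1:k + 1]
    knn_dist = np.take_along_axis(dist, knn_idx, axis=1)

    local = 1.0 / (knn_dist.mean(axis=1) + 1e-12)
    rho = local / local[knn_idx].mean(axis=1)

    # delta: distance to the nearest point of higher density (CFSFDP idea).
    rank = np.argsort(-rho, kind="stable")  # densest point first
    delta = np.empty(n)
    parent = np.full(n, -1)
    delta[rank[0]] = dist[rank[0]].max()
    for pos in range(1, n):
        i = rank[pos]
        higher = rank[:pos]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j

    # Decision graph: centers combine high density with a large delta.
    gamma = rho * delta
    centers = np.argsort(-gamma)[:n_centers]

    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Visit points from dense to sparse so each parent is labeled first.
    for i in rank:
        if labels[i] != -1:
            continue
        if parent[i] == -1:
            # Densest point was not picked as a center: use the nearest center.
            labels[i] = labels[centers[np.argmin(dist[i, centers])]]
        else:
            labels[i] = labels[parent[i]]
    return labels, centers, rho, delta
```

Calling `relative_density_peaks(X, k=5, n_centers=2)` returns the labels together with `rho` and `delta`, so the decision graph can be plotted as `delta` against `rho` to choose `n_centers` by eye. Dividing by the neighbors' density is what makes a sparse cluster's peak comparable to a dense cluster's peak, which is the uneven-density problem the abstract highlights.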
