期刊文献+

基于多项式核的结构化有向树数据聚类算法 被引量:4

Polynomial Kernel Based Structural Clustering Algorithm by Building Directed Trees
下载PDF
导出
摘要 各个点在数据内部的组织结构中自然地扮演着3种不同的结构性角色,分别是毂、质心和野值.在基于邻域的聚类算法中,邻域密度因子能够识别分离数据集中的毂、质心和野值.但是,邻域密度因子对有噪声和重叠的数据往往失效.为了解决该问题,引入了基于多项式核的邻域密度因子,并在有向树框架下,提出了一种结构化的数据聚类算法,其计算复杂度线性于输入数据的大小.对带有噪声和重叠的数据集,该算法能够找到所有显著的、任意形状的不均衡聚类.在人工和真实数据集上的实验结果都证实了该算法的有效性和快速性. Within the internal organization of the data, the data points respectively play three different structural roles: the hub, centroid and outlier. The neighborhood-based density factor (NDF) used in the neighborhood based clustering (NBC) algorithm has the ability of identifying which points act as hubs, centriods or outliers in separated-well data set. However, NDF often works poorly in the circumstances of noise and overlapping. This paper introduces a polynomial kernel based neighborhood density factor (PKNDF) to address this issue. Relying on the PKNDF, a structural data clustering algorithm is further presented which can find all salient clusters with arbitrary shapes and unbalanced sizes in a noisy or overlapping data set. It builds clusters into the framework of directed trees in graph theory and thereby each point is scanned only once in the process of clustering. Hence, its computational complexity is nearly linear in the size of the input data. Experimental results on both synthetic and real-world datasets have demonstrated its effectiveness and efficiency.
出处 《软件学报》 EI CSCD 北大核心 2008年第12期3147-3160,共14页 Journal of Software
基金 国家自然科学基金No.60632050~~
关键词 数据聚类 多项式核 邻域密度因子 有向树 图论 重叠数据 结构性作用 结构化聚类 data clustering polynomial kernel neighborhood-based density factor directed tree graph theory overlapping data structural role structural clustering
  • 相关文献

参考文献21

  • 1Theodoridis S, Koutroumbas K. Pattern Recognition. 2nd ed., New York: Academic Press, 1999.
  • 2Han J, Kamber M. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2000.
  • 3Chen SC, Zhang DQ. Robust image segmentation using FCM with spatial constraints based on new kernel-induced distance measure. IEEE Trans. on Systems, Man, and Cybernetics-Part B: Cybernetics, 2004,34(4):1907-1916.
  • 4Shi J, Malik J. Normalized cuts and image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2000,26(8): 888-905.
  • 5Breitenbach M, Grudic GZ. Clustering through ranking on manifolds. In: Proc. of the 22nd Int'l Conf. on Machine Learning (ICML 2005), Vol.119. New York: ACM, 2005.73-80.
  • 6Hofmann T, Buhmann JM. Pairwise data clustering by deterministic annealing. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1997,19(1):1-14.
  • 7Fischer B, Zoller T, Buhmann JM. Path based pairwise data clustering with application to texture segmentation. Energy Minimization Methods in Computer Vision and Pattern Recognition. 2001. 235-250.
  • 8Fischer B, Buhmann JM. Bagging for path-based clustering. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2003, 25(11):1411-1415.
  • 9Chang H, Yeung DY. Robust path-based spectral clustering with application to image segmentation. In: Proc. of the 10th IEEE Int'l Conf. on Computer Vision (ICCV). Beijing: IEEE Computer Society, 2005. 278-285.
  • 10Ester M, Kriegel HP, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Simoudis E, Han J, Fayyad U, eds. Proc. of the 2nd Int'l Conf. on Knowledge Discovery and Data Mining. Beijing: The AAAI Press, 1996. 221-226.

同被引文献51

  • 1丛蓉,王秀坤,李进军,杨南海.基于层次和密度聚类分析的航迹关联算法[J].系统仿真学报,2005,17(4):841-843. 被引量:7
  • 2李洁,高新波,焦李成.基于特征加权的模糊聚类新算法[J].电子学报,2006,34(1):89-92. 被引量:114
  • 3吴文丽,刘玉树,赵基海.一种新的混合聚类算法[J].系统仿真学报,2007,19(1):16-18. 被引量:18
  • 4DORIGO A,DORIGO M,MANIEZZO V.Distributed optimization by ant colonies[C]//Proc of European Conference on Aritifial Life.Paris:France Elsevier Publishing,1991:134-142.
  • 5DENEUBOURG J L,GOSS S,FRANKS N,et al.The dynamics of collective sorting:robot-like ants and ant-like robots[C]//Proc of the 1st International Conference on Simulation of Adaptive Haviour,From Animals to Animals J.Cambridge MA:MIT Press,1991:356-365.
  • 6LUMER E,FAIETA B.Diversity and adaptation in populations of clustering ants[C]//Proc of the 3rd International Conference on Simulation of Adaptive Behavior:From Animals to Nimats 3.Cambridge,MA:MIT Press,1994:501-508.
  • 7LABROCHE N,MONMARCHE N,VENTURINI G.A new clustering algorithm based on the chemical recognition system of ants[C]//Proc of the 15th European Conference on Artificial Intelligence.2002:345-349.
  • 8LABROCHE N,MONMARCHE N,VENTURINI G.AntClust:ant clustering and Web usage mining[C]//Proc of Genetic and Evolutionary Computation Conference.Berlin:Springer,2003:25-36.
  • 9MULLER K R,MIKA S,ATSCH G,et al.An introduction to kernel-based learning algorithms[J].IEEE Trans on Neural Networks,2001,12(2):181-201.
  • 10XU Rui, Wunsch D. Survey of Clustering Algorithms [J]. IEEE Transactions on Neural Networks, 2005, 16(3): 645 -678.

引证文献4

二级引证文献27

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部