Polynomial Kernel Based Structural Clustering Algorithm by Building Directed Trees

Cited by: 4

Abstract: Within the internal organization of a data set, each point naturally plays one of three structural roles: hub, centroid, or outlier. The neighborhood-based density factor (NDF) used in the neighborhood-based clustering (NBC) algorithm can identify which points act as hubs, centroids, or outliers in a well-separated data set. However, NDF often performs poorly on noisy or overlapping data. To address this issue, this paper introduces a polynomial kernel based neighborhood density factor (PKNDF). Building on the PKNDF, it presents a structural data clustering algorithm that can find all salient clusters of arbitrary shape and unbalanced size in a noisy or overlapping data set. The algorithm builds clusters within the framework of directed trees from graph theory, so each point is scanned only once during clustering. Hence, its computational complexity is nearly linear in the size of the input data. Experimental results on both synthetic and real-world data sets demonstrate its effectiveness and efficiency.
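For readers unfamiliar with the plain NDF that the abstract takes as its starting point, the following is a minimal sketch. It assumes the definition commonly given for the NBC algorithm: the number of reverse k-nearest neighbors of a point divided by the number of its k-nearest neighbors. It is not the PKNDF of this paper, whose formula the abstract does not state. The function name, the brute-force neighbor search, the fixed neighborhood size of exactly k, and the tie-breaking by index are all illustrative assumptions.

```python
from math import dist


def neighborhood_density_factor(points, k):
    """Return the plain NDF of every point (assumed NBC-style definition).

    NDF(p) = (number of points that list p among their k nearest neighbors)
             / (number of nearest neighbors of p, fixed to k here).
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    reverse_counts = [0] * n
    for i, p in enumerate(points):
        # Brute-force k-nearest neighbors of point i; ties are broken by index
        # because sorted() is stable.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        for j in others[:k]:
            reverse_counts[j] += 1  # point i counts j as a neighbor

    return [count / k for count in reverse_counts]


# Three nearby points and one isolated point on a line.
sample = [(0.0,), (1.0,), (2.5,), (10.0,)]
ndf_values = neighborhood_density_factor(sample, k=1)
print(ndf_values)
```

In this toy input the isolated point at 10.0 is nobody's nearest neighbor, so its NDF is 0, while the point at 1.0 is chosen by two others and scores above 1. This brute-force version is quadratic in the number of points and is meant only to illustrate the quantity, not the near-linear procedure the paper describes.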

Authors: Ding Jundi (丁军娣), Ma Runing (马儒宁), Chen Songcan (陈松灿)

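Affiliations: School of Computer Science and Technology, Nanjing University of Science and Technology; College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics; College of Science, Nanjing University of Aeronautics and Astronautics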

Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2008, Issue 12, pp. 3147-3160 (14 pages)

Funding: National Natural Science Foundation of China, No. 60632050

Keywords: data clustering; polynomial kernel; neighborhood-based density factor; directed tree; graph theory; overlapping data; structural role; structural clustering

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]


