An Automatic Clustering Algorithm Based on Data Contained Ratio (一种基于数据包含度的自动聚类算法)

Cited by: 1

Abstract: Cluster analysis is an important problem in machine learning and pattern recognition, and clustering algorithms are commonly used to address it. Traditional clustering algorithms are computationally expensive and adapt poorly to arbitrarily distributed data. To address these shortcomings, this paper proposes an automatic clustering algorithm based on the data contained ratio. The algorithm introduces the concept of the data contained ratio, which allows it to determine the number of clusters and the cluster centers automatically, and it then completes the clustering with a following strategy. Experiments on several data sets verify the effectiveness of the algorithm. Its results are also compared with those of the K-means algorithm on data with different distributions, and the comparison shows that the automatic clustering algorithm achieves good clustering performance.

Authors: Ma Yunhong (马云红), Wang Chenghan (王成汗), Jiang Tengjiao (江腾蛟), Zhang Kun (张堃) (School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China)

Affiliation: School of Electronics and Information, Northwestern Polytechnical University

Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2016, No. 5, pp. 863-866 (4 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心).

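Funding: Supported by the Northwestern Polytechnical University Graduate Creativity and Innovation Seed Fund (G2015KY0407) and the National Natural Science Foundation of China Youth Fund (61401363).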

Keywords: clustering algorithm; data contained ratio; data local density

Classification code: TP311.5 [Automation and Computer Technology: Computer Software and Theory]
