全粒度聚类算法 (cited by: 2)

Whole-granulation cluster algorithm

Abstract: Cluster analysis is an important research direction in data mining and knowledge discovery. Similarity is a core concept in most clustering algorithms: the similarity between objects is computed either directly or indirectly. Traditional similarity measures mostly observe the two objects being compared at a single granularity. In human cognition, however, problems are usually solved more reasonably and effectively by working at multiple granularities. Drawing on this multi-granulation cognitive mechanism, this paper proposes a new similarity learning method, called the whole-granulation similarity measure, and develops a whole-granulation clustering algorithm based on it. Because the whole-granulation similarity measure observes the compared objects from every perspective, it yields a more faithful similarity between two objects. Experiments on five data sets selected from the UCI repository, with comparisons against two traditional clustering methods, verify the rationality and effectiveness of the whole-granulation clustering algorithm.

English abstract: In cluster analysis, especially clustering formulated as an optimization process, one of the decisive factors is the similarity measure used in the clustering criterion function. All clustering methods proposed so far have to assume some relationship among the objects they are applied to. The similarity between every pair of objects must be computed, either explicitly or implicitly. Hence, whether the similarity measure correctly describes the structure of the data determines the effectiveness of a clustering algorithm. In addition, multi-granulation cognition, one of the important characteristics of human cognition, plays a key role in data modeling. Because it parses a problem from multiple perspectives and at multiple levels, multi-granulation analysis can obtain more reasonable and more satisfactory solutions. Drawing on this human multi-granulation cognitive ability, this paper introduces a novel similarity measure, called the whole-granulation similarity measure, and applies it in a clustering criterion function to obtain a whole-granulation cluster algorithm, which in turn serves to verify the rationality of the measure. Traditional dissimilarity/similarity measures use only a single viewpoint, usually the origin. Because the whole-granulation measure takes all sides into consideration, a more informative assessment of similarity can be achieved. As a leading partitional clustering technique, k-means is one of the most widely used algorithms, because it is fast and easy to combine with other methods. Many studies extend k-means by improving its heuristic function or by combining it with other methods, and this is an active area of clustering research. Following this approach, we introduce our measure into cluster analysis through the k-means algorithm as an initial test. Experiments are conducted on five data sets selected from the UCI Machine Learning Repository. Finally, the whole-granulation cluster algorithm is compared with two traditional clustering algorithms to verify its validity and, at the same time, the rationality of the whole-granulation similarity measure. The convergence experiment shows that the whole-granulation similarity measure performs strongly as a way to measure similarity.
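The abstract does not give the formula for the whole-granulation similarity measure, so the following is only a minimal sketch of one possible reading, not the authors' definition. It assumes that a "granulation" corresponds to a non-empty subset of attributes, that the similarity within one subset is 1 / (1 + Euclidean distance), and that the overall similarity is the plain average over all subsets. The function names `subset_similarity`, `whole_granulation_similarity`, and `assign_clusters` are hypothetical, and the assignment step is a k-means-style illustration only.

```python
from itertools import combinations
import math


def subset_similarity(x, y, attrs):
    """Similarity of x and y restricted to the attribute indices in attrs.

    Assumption: similarity is 1 / (1 + Euclidean distance) on that subset.
    """
    dist = math.sqrt(sum((x[i] - y[i]) ** 2 for i in attrs))
    return 1.0 / (1.0 + dist)


def whole_granulation_similarity(x, y):
    """Average the subset similarity over every non-empty attribute subset.

    Assumption: each attribute subset is treated as one granulation.
    The loop visits 2**m - 1 subsets, so this is only practical for small m.
    """
    m = len(x)
    total, count = 0.0, 0
    for size in range(1, m + 1):
        for attrs in combinations(range(m), size):
            total += subset_similarity(x, y, attrs)
            count += 1
    return total / count


def assign_clusters(points, centers):
    """k-means-style assignment: each point goes to its most similar center."""
    return [
        max(range(len(centers)),
            key=lambda k: whole_granulation_similarity(p, centers[k]))
        for p in points
    ]


if __name__ == "__main__":
    points = [(0.0, 0.0), (10.0, 10.0)]
    centers = [(1.0, 1.0), (9.0, 9.0)]
    print(assign_clusters(points, centers))
```

Under these assumptions the measure replaces the single full-attribute distance of ordinary k-means with an average over all attribute views. The cost grows exponentially with the number of attributes, so a practical version would need sampling or a closed form.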

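Authors: Li Feijiang (李飞江), Cheng Honghong (成红红), Qian Yuhua (钱宇华)

Affiliations: School of Computer and Information Technology, Shanxi University; School of Mathematical Sciences, Shanxi University

Source: Journal of Nanjing University (Natural Science) (《南京大学学报（自然科学版）》), 2014, No. 4, pp. 505-516 (12 pages). Indexed in CAS, CSCD, and the PKU Core list (北大核心).

Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20121401110013); Program for New Century Excellent Talents in University (NCET-12-1031)

Keywords: similarity measure; cluster analysis; whole-granulation

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]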


