Arbitrary shape clustering for mixed attributes dataset (可处理混合属性的任意形状聚类)

Cited by: 2

Abstract: Clustering is a very active research branch of data mining, and clustering of arbitrary shapes remains an open problem. This paper introduces an inter-cluster dissimilarity measure that takes into account the frequency information of categorical attribute values, together with a definition of the similarity degree between an object and a cluster. On this basis, it proposes a clustering algorithm that can find clusters of arbitrary shape and can be applied to mixed-attribute datasets. The algorithm was tested on synthetic and real-life datasets and compared with related algorithms; the experimental results show that it is feasible and effective.
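The abstract does not give the paper's formulas, so the following is only a minimal sketch of how a frequency-aware similarity between an object and a cluster over mixed attributes might look. Everything in it is an assumption made for illustration: the class name `ClusterSummary`, the per-attribute difference `1 - freq/|C|` for categorical values, the use of the cluster mean for numeric values (assumed pre-scaled to [0, 1]), and the root-mean-square combination. It should not be read as the authors' definition.

```python
from collections import Counter
from math import sqrt


class ClusterSummary:
    """Hypothetical cluster summary for mixed attributes (illustrative only)."""

    def __init__(self, categorical_idx, numeric_idx):
        self.categorical_idx = list(categorical_idx)
        self.numeric_idx = list(numeric_idx)
        self.size = 0
        # Value frequencies per categorical attribute.
        self.freq = {i: Counter() for i in self.categorical_idx}
        # Running sums per numeric attribute (values assumed scaled to [0, 1]).
        self.sums = {i: 0.0 for i in self.numeric_idx}

    def add(self, obj):
        """Absorb one object (a tuple or list of attribute values)."""
        self.size += 1
        for i in self.categorical_idx:
            self.freq[i][obj[i]] += 1
        for i in self.numeric_idx:
            self.sums[i] += obj[i]

    def dissimilarity(self, obj):
        """Root-mean-square of per-attribute differences, in [0, 1]."""
        if self.size == 0:
            raise ValueError("empty cluster")
        total = 0.0
        for i in self.categorical_idx:
            # A value that is frequent in the cluster yields a small difference.
            d = 1.0 - self.freq[i][obj[i]] / self.size
            total += d * d
        for i in self.numeric_idx:
            # Difference from the cluster mean of this attribute.
            d = obj[i] - self.sums[i] / self.size
            total += d * d
        m = len(self.categorical_idx) + len(self.numeric_idx)
        return sqrt(total / m)

    def similarity(self, obj):
        """Assumed similarity degree: one minus the dissimilarity."""
        return 1.0 - self.dissimilarity(obj)


# Usage: attribute 0 is categorical, attribute 1 is numeric.
cluster = ClusterSummary(categorical_idx=[0], numeric_idx=[1])
for member in [("red", 0.2), ("red", 0.4), ("blue", 0.3)]:
    cluster.add(member)
print(cluster.similarity(("red", 0.3)), cluster.similarity(("green", 0.3)))
```

Under these assumptions, an object whose categorical value is common in the cluster scores as more similar than one whose value never occurs there, which is the kind of frequency information the abstract refers to.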

Authors: Su Xiaoke (苏晓珂), Lan Yang (兰洋), Cheng Yaodong (程耀东), Wan Renxia (万仁霞)

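Affiliations: College of Information Science and Technology, Donghua University; School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou; College of Computer and Information Technology, Xinyang Normal University; Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences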

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 34, pp. 136-139 (4 pages)

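Funding: National High Technology Research and Development Program of China (863 Program) (No. 2006AA01A120); Natural Science Basic Research Program of the Education Department of Henan Province (No. 2010A520033); Doctoral Research Fund of Zhengzhou University of Light Industry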

Keywords: arbitrary shape; clustering; mixed attributes; similarity degree

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


