
Clustering Algorithm Based on Dimension Reduction by Center Distance Order (cited by: 1)
Abstract To improve clustering quality and efficiency on financial (banking) services datasets, this paper introduces two concepts: cluster diameter and a similarity measure between clusters. Center-distance-order dimension reduction, a distance-scale reduction method, maps multi-dimensional data to one dimension, and the one-dimensional data is then clustered with the self-Adaptive Sort Clustering (ASC) algorithm. Experiments comparing the method with the traditional Cobweb and K-means algorithms show that it improves cluster similarity, by up to 200%, and reduces clustering time.
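Source Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the PKU Core list; 2010, No. 12, pp. 58-60, 63 (4 pages)
Funding National Natural Science Foundation of China (60773169); Natural Science Foundation of the Guizhou Provincial Department of Science and Technology (黔科合J字[2010]); Natural Science Foundation of the Zunyi Science and Technology Bureau (遵市科合社字[2009]27号)
Keywords cluster diameter; cluster similarity; self-Adaptive Sort Clustering (ASC) algorithm; dimension reduction by center distance order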
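
The abstract describes a two-stage pipeline: reduce each record to one number based on its distance to a center, then cluster the sorted one-dimensional values. The record gives no further detail, so the following is only a minimal sketch under stated assumptions, not the paper's ASC algorithm: the center is assumed to be the mean of all points, the distance is assumed to be Euclidean, and the one-dimensional step is a plain gap-based split with a hypothetical `gap` threshold standing in for ASC. The function names are illustrative.

```python
import math


def center_distances(points):
    """Reduce each point to one value: its distance to the data center.

    Assumption: the center is the coordinate-wise mean and the distance
    is Euclidean; the paper may define both differently.
    """
    n = len(points)
    dim = len(points[0])
    center = [sum(p[d] for p in points) / n for d in range(dim)]
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(p, center)))
        for p in points
    ]


def gap_cluster_1d(values, gap):
    """Cluster 1-D values by sorting them and cutting at large gaps.

    Assumption: this is a stand-in for ASC, whose actual rules are not
    given in the abstract. `gap` is a hypothetical threshold.
    Returns one cluster label per input value, in input order.
    """
    order = sorted(range(len(values)), key=values.__getitem__)
    labels = [0] * len(values)
    current = 0
    for prev, idx in zip(order, order[1:]):
        # Start a new cluster when consecutive sorted values are far apart.
        if values[idx] - values[prev] > gap:
            current += 1
        labels[idx] = current
    return labels


points = [
    (1.0, 1.0), (1.2, 0.9),    # group near (1, 1)
    (4.0, 4.0), (4.1, 4.2),    # group near (4, 4)
    (10.0, 10.0), (10.2, 9.9), # group near (10, 10)
]
labels = gap_cluster_1d(center_distances(points), gap=1.0)
print(labels)  # [1, 1, 0, 0, 2, 2]
```

Labels are numbered in order of increasing distance from the center, so the group nearest the center gets label 0. One limitation of any distance-to-center projection is that points lying at the same distance from the center in different directions map to the same value and cannot be separated in one dimension.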
