Data stream clustering algorithm based on semi-supervised affinity propagation (基于半监督近邻传播的数据流聚类算法) Cited by: 1
Abstract To improve the clustering quality of evolving data streams, this paper proposes SAPStream, a data stream clustering algorithm based on semi-supervised affinity propagation. Borrowing the idea of semi-supervised clustering, the algorithm constructs a similarity matrix for the initial data, runs affinity propagation (AP) clustering on it, and builds an online clustering model. As the data stream evolves, the model is adjusted using a decay window technique, and the exemplars produced so far are clustered again together with newly arrived data points to obtain the clustering result for the stream. SAPStream can analyze and process large-scale evolving data streams. Its performance is tested on both real and synthetic datasets, and the experimental results show that the algorithm achieves higher clustering quality.
Authors Wang Wenshuai (王文帅), Chen Gang (陈刚)
Source Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2013, No. 8, pp. 6-8, 47 (4 pages)
Funding Key Program of the National Natural Science Foundation of China (No. 90912004)
Keywords data stream; semi-supervised; affinity propagation clustering; decay windows
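
The abstract names two ingredients without giving formulas: a similarity matrix shaped by semi-supervised information, and a decay window that adjusts the online model. The following is a minimal Python sketch of both ideas, not the authors' implementation. Everything specific in it is an assumption made for illustration: the negative squared Euclidean similarity, the encoding of must-link pairs as similarity 0 and cannot-link pairs as negative infinity, the fading function 2^(-lam * age), and the hypothetical names `constrained_similarity`, `decay_weight`, `update_exemplars`, `lam`, and `min_weight`.

```python
import math


def constrained_similarity(points, must_link=(), cannot_link=()):
    """Build a similarity matrix for affinity propagation.

    Assumption: base similarity is the negative squared Euclidean
    distance; must-link pairs are forced to the maximum similarity (0)
    and cannot-link pairs to -inf. The paper may use a different scheme.
    """
    n = len(points)
    s = [
        [-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
         for j in range(n)]
        for i in range(n)
    ]
    for i, j in must_link:
        s[i][j] = s[j][i] = 0.0
    for i, j in cannot_link:
        s[i][j] = s[j][i] = -math.inf
    return s


def decay_weight(age, lam=0.1):
    """Assumed exponential fading factor: 2 ** (-lam * age)."""
    return 2.0 ** (-lam * age)


def update_exemplars(exemplars, now, lam=0.1, min_weight=0.5):
    """Fade every exemplar to time `now` and drop the outdated ones.

    Each exemplar is a dict with keys "point", "weight", "last_update"
    (a hypothetical layout chosen for this sketch).
    """
    kept = []
    for ex in exemplars:
        weight = ex["weight"] * decay_weight(now - ex["last_update"], lam)
        if weight >= min_weight:
            kept.append({"point": ex["point"], "weight": weight,
                         "last_update": now})
    return kept
```

In such a setup, the surviving exemplars would be clustered again together with the newly arrived points, which corresponds to the re-clustering step the abstract describes.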