A Fast Clustering Algorithm Based on Probability (cited by: 2)
Abstract: In a pattern sample set for which the clustering algorithm and the feature-vector dimensionality are fixed, each dimension of a sample represents one corresponding feature. Building on this observation and on hierarchical clustering, the paper proposes a fast clustering algorithm based on probability. The algorithm first classifies each feature separately. Then, following a probability criterion, every vector starts as its own class, and the feature vectors with the highest corresponding probability are merged, which reduces the number of classes until the required number is reached. Simulation experiments on the Iris and Wine data sets from UCI show that the algorithm produces fairly good clustering results, which indicates that it is effective to some degree.
Author: LI Jing (李婧)
Source: Journal of Chongqing Technology and Business University (Natural Science Edition), 2014, No. 2, pp. 61-65 (5 pages)
Keywords: clustering; sample; feature; probability
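
The abstract outlines the procedure only at a high level: classify each feature, start with every vector as its own class, then repeatedly merge the most probable pair until the target number of classes remains. The Python sketch below illustrates that general shape and is not the paper's implementation. The abstract does not say how features are classified or how the probability is computed, so two assumptions are made here: equal-width binning stands in for the per-feature classification, and the "probability" of two classes belonging together is taken as the average fraction of features whose bin labels agree. All function names and the `n_bins` parameter are hypothetical.

```python
from itertools import combinations


def bin_features(samples, n_bins=3):
    """Assumed step 1: classify each feature with equal-width bins."""
    n_dims = len(samples[0])
    lows = [min(s[d] for s in samples) for d in range(n_dims)]
    highs = [max(s[d] for s in samples) for d in range(n_dims)]
    binned = []
    for s in samples:
        row = []
        for d in range(n_dims):
            # Fall back to width 1.0 when a feature is constant.
            width = (highs[d] - lows[d]) / n_bins or 1.0
            row.append(min(int((s[d] - lows[d]) / width), n_bins - 1))
        binned.append(tuple(row))
    return binned


def match_probability(a, b, binned):
    """Assumed criterion: mean fraction of features whose bin labels
    agree, averaged over all cross-cluster sample pairs."""
    n_dims = len(binned[0])
    total = sum(
        sum(x == y for x, y in zip(binned[i], binned[j])) / n_dims
        for i in a
        for j in b
    )
    return total / (len(a) * len(b))


def probability_clustering(samples, k, n_bins=3):
    """Merge the most probable pair of clusters until k clusters remain."""
    binned = bin_features(samples, n_bins)
    clusters = [[i] for i in range(len(samples))]  # each vector starts alone
    while len(clusters) > k:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: match_probability(clusters[p[0]], clusters[p[1]], binned),
        )
        clusters[a].extend(clusters[b])  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters


if __name__ == "__main__":
    toy = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9)]
    print(probability_clustering(toy, k=2))
```

The function returns lists of sample indices, one list per cluster. Searching all cluster pairs on every merge is the plain agglomerative approach and is not fast; whatever makes the paper's algorithm fast is not described in the abstract and is not reproduced here.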

