Abstract
In a pattern sample set whose feature-vector dimension is fixed, each dimension of a sample represents one corresponding feature. Building on this observation and on hierarchical clustering, this paper proposes a fast probability-based clustering algorithm. The algorithm first classifies each feature separately. Then, following a probability criterion, every vector starts as its own cluster, and the feature vectors with the highest corresponding probability are merged, reducing the number of clusters until the required number is reached. Simulation experiments on the Iris and Wine data sets from the UCI repository show that the algorithm produces fairly good clustering results, which indicates that it is reasonably effective.
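The abstract gives only the outline of the procedure, so the following is a minimal sketch under stated assumptions, not the paper's actual method. It assumes that "classifying each feature" means equal-width binning of each dimension, and that the "probability" used for merging is the fraction of feature comparisons in which two clusters share the same bin label. The function names `discretize`, `match_probability`, and `probabilistic_agglomerative`, the `n_bins` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations


def discretize(samples, n_bins=3):
    """Classify each feature separately (assumption: equal-width bins)."""
    n_features = len(samples[0])
    coded = [[0] * n_features for _ in samples]
    for j in range(n_features):
        column = [row[j] for row in samples]
        lo, hi = min(column), max(column)
        # Fall back to width 1.0 when a feature is constant.
        width = (hi - lo) / n_bins or 1.0
        for i, value in enumerate(column):
            coded[i][j] = min(int((value - lo) / width), n_bins - 1)
    return coded


def match_probability(cluster_a, cluster_b, coded):
    """Assumed merge score: share of feature comparisons with equal bin labels."""
    agree = total = 0
    for i in cluster_a:
        for k in cluster_b:
            for x, y in zip(coded[i], coded[k]):
                agree += (x == y)
                total += 1
    return agree / total


def probabilistic_agglomerative(samples, n_clusters, n_bins=3):
    """Start with one cluster per sample; merge the most probable pair each round."""
    coded = discretize(samples, n_bins)
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: match_probability(clusters[p[0]], clusters[p[1]], coded),
        )
        clusters[a].extend(clusters[b])  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters


# Hypothetical two-feature toy data with two visibly separated groups.
toy = [[5.1, 1.4], [4.9, 1.5], [5.0, 1.3], [6.9, 5.1], [7.0, 4.9], [6.8, 5.0]]
groups = probabilistic_agglomerative(toy, n_clusters=2)
```

Each cluster is returned as a list of sample indices. The exhaustive pair search in every round is chosen here for readability and does not reflect the speed claimed in the abstract.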
Source
Journal of Chongqing Technology and Business University: Natural Science Edition
2014, No. 2, pp. 61-65 (5 pages)
Keywords
clustering
sample
feature
probability