Abstract
The traditional k-means algorithm selects its initial cluster centers at random, and, over larger scales, parameters that use manifold distance as the similarity measure do not reflect the global consistency of a data set well. To address both problems, a hierarchical clustering algorithm based on attribute partitioning and curve distance is proposed. Initial-stage exemplars are selected using the attribute-partitioning idea from granular computing together with the max-min distance rule, and k-means is then used to perform a coarse clustering. A new distance measure, the curve distance, is combined with a criterion function that rewards high within-class similarity and low between-class similarity to cluster and classify the initial-stage exemplars, which yields the expected exemplars. Each data point then takes its class label from the label of its corresponding exemplar. Experimental results show that, compared with other algorithms, the proposed algorithm better reflects the global consistency of the data set and reduces running time.
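Two of the steps named in the abstract, choosing initial centers with the max-min distance rule and passing class labels from exemplars to data points, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `max_min_centers` and `propagate_labels` are hypothetical, Euclidean distance stands in for the curve distance (which the abstract does not define), the first point is arbitrarily taken as the first center, and the attribute-partitioning step is omitted.

```python
import math


def max_min_centers(points, k):
    """Return indices of k initial centers chosen by the max-min distance rule.

    Assumption: the first point seeds the selection, and plain Euclidean
    distance is used in place of the paper's curve distance.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [0]
    # Distance from every point to its nearest already-chosen center.
    nearest = [math.dist(p, points[0]) for p in points]
    while len(centers) < k:
        # Pick the point whose nearest center is farthest away.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return centers


def propagate_labels(point_to_exemplar, exemplar_labels):
    """Give each data point the class label of its exemplar."""
    return [exemplar_labels[e] for e in point_to_exemplar]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
    print(max_min_centers(pts, 3))
    # Hypothetical mapping: points 0-1 belong to exemplar 0, and so on.
    print(propagate_labels([0, 0, 1, 1, 2], {0: "A", 1: "A", 2: "B"}))
```

Spreading the initial centers apart in this way avoids the case where random initialization places several centers in the same dense region. In the described algorithm the coarse k-means clustering would run between these two steps.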
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2015, Issue 8, pp. 174-179 (6 pages)
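Funding
Hunan Provincial Natural Science Foundation of China (14JJ7043)
Science and Technology Progress and Innovation Fund of the Hunan Provincial Department of Transportation (201405)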
Keywords
curve distance
attribute partitioning
max-min distance
cluster classification
class label