Abstract
The K-Means algorithm selects its initial cluster centers at random, which makes clustering performance unstable. To address this limitation, this paper proposes VTK-Means, a method for selecting initial cluster centers based on a variable threshold. The algorithm selects as initial cluster centers those samples whose distance to the existing initial points is greater than a threshold, and then adjusts the threshold according to the number of samples that satisfy this condition. Experimental results on 10 UCI machine learning data sets indicate that the algorithm clearly outperforms the standard K-Means algorithm and yields more stable results.
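The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the threshold update rule, so everything specific here is an assumption made for illustration: the function name `select_initial_centers`, Euclidean distance, a randomly chosen first center, shrinking the threshold by a fixed factor whenever fewer than k candidates qualify, and stopping the scan as soon as k candidates are found. It should not be read as the paper's actual VTK-Means procedure.

```python
import math
import random


def select_initial_centers(points, k, threshold, shrink=0.8, max_rounds=200, seed=0):
    """Pick k initial centers that are pairwise farther apart than a threshold.

    Hypothetical sketch: the shrink factor, the stopping rule and the random
    choice of the first center are assumptions, not taken from the paper.
    """
    if k < 1 or len(points) < k:
        raise ValueError("need at least k points and k >= 1")
    rng = random.Random(seed)
    order = list(points)
    rng.shuffle(order)  # assumed: scan the samples in random order

    for _ in range(max_rounds):
        centers = [order[0]]  # assumed: the first scanned sample is a center
        for p in order[1:]:
            if len(centers) == k:
                break
            # Accept p only if it is farther than the threshold from
            # every center chosen so far.
            if all(math.dist(p, c) > threshold for c in centers):
                centers.append(p)
        if len(centers) == k:
            return centers, threshold
        # Too few candidates qualified: the threshold was too large.
        threshold *= shrink
    raise ValueError("could not find k sufficiently separated points")


# Three small, well-separated groups; the starting threshold is far too large.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (10.0, 0.0), (9.8, 0.3)]
centers, final_threshold = select_initial_centers(points, k=3, threshold=100.0)
print(centers, final_threshold)
```

The returned centers could then be passed to any K-Means implementation that accepts explicit initial centers, in place of random initialization.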
Source
Computer Engineering and Applications (《计算机工程与应用》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2011, Issue 32, pp. 56-58 (3 pages)
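Funding
Shandong Province Science and Technology Research Program (No. 2007ZZ17, No. 2008GG10001015, No. 2008B0026)
Scientific Research Project of the Shandong Provincial Department of Education (No. J09LG02)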