Abstract
In spatial clustering, the key to finding the optimal number of clusters k is to construct a suitable cluster validity function. The classical K-means algorithm requires k to be fixed in advance, but in practice k is hard to determine precisely, which makes the algorithm ineffective for some real problems. This paper proposes a distance cost function, based on Euclidean distance, as the validity function for the optimal number of clusters, builds the corresponding mathematical model, and designs a new algorithm for optimizing k on that basis. It also gives the conditions for the optimal solution kopt and its upper bound kmax, and proves theoretically that the empirical rule kmax ≤ √n is reasonable. Results on an example further confirm the effectiveness of the new method.
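The idea of choosing k by minimizing a distance cost over the range bounded by √n can be sketched roughly as follows. The abstract does not state the exact form of the distance cost function, so this sketch assumes one plausible form: the sum of Euclidean distances from each cluster centre to the global mean (between-cluster term) plus the sum of distances from each point to its nearest cluster centre (within-cluster term). The helper names (`farthest_first_init`, `kmeans`, `distance_cost`, `choose_k`), the farthest-first seeding, and the synthetic data are all illustrative assumptions, not the authors' implementation.

```python
import math
import numpy as np


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from
    points[0], then repeatedly add the point farthest from chosen centres."""
    centers = [points[0]]
    dist = np.linalg.norm(points - centers[0], axis=1)
    for _ in range(1, k):
        idx = int(np.argmax(dist))
        centers.append(points[idx])
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return np.array(centers, dtype=float)


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns the k cluster centres."""
    centers = farthest_first_init(points, k)
    for _ in range(max_iter):
        # Pairwise point-to-centre distances, shape (n, k).
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def distance_cost(points, centers):
    """Assumed cost: between-cluster distance plus within-cluster distance."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    within = d.min(axis=1).sum()
    between = np.linalg.norm(centers - points.mean(axis=0), axis=1).sum()
    return between + within


def choose_k(points):
    """Search k = 2 .. floor(sqrt(n)) and return the k with the lowest cost."""
    k_max = max(2, math.isqrt(len(points)))
    costs = {k: distance_cost(points, kmeans(points, k))
             for k in range(2, k_max + 1)}
    return min(costs, key=costs.get), costs


# Synthetic example: three tight, well-separated 2-D groups of 20 points each.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
data = np.vstack([c + rng.normal(scale=0.1, size=(20, 2)) for c in blob_centers])
best_k, costs = choose_k(data)
print(best_k, {k: round(v, 2) for k, v in costs.items()})
```

Under this assumed cost, adding a cluster always adds a centre-to-global-mean distance to the between-cluster term, so the cost stops falling once extra clusters no longer reduce the within-cluster distances by more than that amount. Restricting the search to k ≤ √n follows the empirical rule mentioned in the abstract.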
Source
Systems Engineering-Theory & Practice (《系统工程理论与实践》)
EI
CSCD
PKU Core
2006, No. 2, pp. 97-101 (5 pages)
Funding
National Natural Science Foundation of China (70471046)
Doctoral Program Foundation of the Ministry of Education of China (20040359004)
Keywords
spatial clustering
K-means algorithm
distance cost function
optimization of k