Abstract
Objective To improve the K-means method, which tends to produce spherical clusters of equal size, so that it performs well when the clusters to be identified have unequal variances. Methods The reciprocal of the adjusted cluster variance was used as a weight on the squared Euclidean distance, so that a "relative distance" rather than an "absolute distance" measures the similarity between an individual and a cluster. The relative distance is thus the ratio of the squared Euclidean distance to the adjusted variance of the cluster. Results When two clusters with unequal variances were clustered, the improved K-means method achieved a higher accuracy, evaluated against the actual clusters, than the traditional K-means method. Conclusion The improved K-means method outperforms the traditional K-means method when clustering two clusters whose variances differ greatly.
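The relative-distance idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the variance is "adjusted", so the sketch assumes the mean squared distance of a cluster's members to its center plus a small constant `eps`. The names `sq_dist`, `assign`, `relative_kmeans` and `eps` are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def assign(point, centers, variances):
    """Index of the cluster with the smallest relative distance,
    i.e. squared Euclidean distance divided by the cluster variance."""
    return min(range(len(centers)),
               key=lambda k: sq_dist(point, centers[k]) / variances[k])


def relative_kmeans(points, centers, n_iter=50, eps=1e-9):
    """K-means variant that assigns points by relative distance.

    ASSUMPTION: the 'adjusted variance' is taken to be the mean squared
    distance of members to their center plus eps; the source abstract
    does not define the adjustment.
    """
    centers = [tuple(c) for c in centers]
    variances = [1.0] * len(centers)  # equal weights: first pass is plain K-means
    labels = None
    for _ in range(n_iter):
        new_labels = [assign(p, centers, variances) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if not members:
                continue  # keep the old center and variance for an empty cluster
            centers[k] = mean_point(members)
            variances[k] = (sum(sq_dist(p, centers[k]) for p in members)
                            / len(members) + eps)
    return labels, centers, variances
```

With equal variances `assign` reduces to the ordinary nearest-center rule. With unequal variances, a point that is closer in absolute terms to a tight cluster can still be assigned to a wide one. A plain variance ratio can also let a wide cluster absorb its neighbours, which may be why some adjustment of the variance is needed.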
Source
Chinese Journal of Hospital Statistics (《中国医院统计》)
2008, No. 1, pp. 9-12 (4 pages)
Funding
Guangdong Provincial Science and Technology Program (2004B33701010)
Keywords
Cluster analysis
Euclidean distance
Weighting