Abstract
The traditional k_means algorithm most commonly uses Euclidean distance as its distance measure. To address the shortcomings of Euclidean distance when computing the similarity between sample points and clusters, this paper proposes a k_means algorithm based on weighted Euclidean distance for the case where no domain knowledge about the data objects is available; replacing "absolute distance" with "relative distance" reflects the actual distribution of the samples more accurately. Experiments on data sets from the public UCI repository show that the improved algorithm produces higher-quality clustering results.
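The abstract does not give the weighting formula, so the following is only a minimal sketch of how a weighted Euclidean distance might be plugged into k_means. It assumes, based on the keyword "coefficient of variation", that each feature's weight is its coefficient of variation (standard deviation divided by the absolute mean), normalized so the weights sum to 1. The function names `cv_weights`, `weighted_distance`, and `weighted_kmeans` are hypothetical and are not taken from the paper.

```python
import math
import random
from statistics import mean, pstdev


def cv_weights(data):
    """Per-feature weights proportional to the coefficient of variation.

    Assumption (not stated in the abstract): weight_j = CV_j / sum(CV),
    where CV_j = population std of feature j / |mean of feature j|.
    """
    cvs = []
    for col in zip(*data):
        m = mean(col)
        # A feature with zero mean gets zero weight to avoid dividing by zero.
        cvs.append(pstdev(col) / abs(m) if m != 0 else 0.0)
    total = sum(cvs)
    if total == 0:
        # Every feature is constant: fall back to equal weights.
        return [1.0 / len(cvs)] * len(cvs)
    return [cv / total for cv in cvs]


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (x_j - y_j)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def weighted_kmeans(data, k, weights, max_iter=100, seed=0):
    """Lloyd-style k_means that assigns points using the weighted distance."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(p, centers[c], weights))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [mean(col) for col in zip(*members)]
    return labels, centers


# Toy data: feature 0 separates two groups; feature 1 has a large magnitude
# but little relative variation, so it receives a small weight.
points = [(1.0, 100.0), (1.2, 101.0), (0.8, 99.0),
          (5.0, 100.5), (5.2, 99.5), (4.8, 100.0)]
w = cv_weights(points)
labels, centers = weighted_kmeans(points, k=2, weights=w)
```

In this sketch the weights depend only on the data, so no domain knowledge is needed to set them. A feature whose values vary little relative to their mean contributes little to the distance, which is one way to read the abstract's "relative distance".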
Source
《郑州大学学报(工学版)》
CAS
Peking University Core Journal (北大核心)
2010, No. 1, pp. 89-92 (4 pages)
Journal of Zhengzhou University(Engineering Science)
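Funding
Supported by the Lanzhou Enterprise Technology Research Program (2009-1-4)
Supported by the Lanzhou Jiaotong University "Qing Lan" Talent Project Fund (QL-05-10A)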
Keywords
k_means algorithm
clustering
weighting
coefficient of variation