Abstract
To address the shortcomings of traditional measures of similarity between objects, an improved grid-similarity-based clustering algorithm is proposed. The algorithm introduces a new similarity criterion and applies a density-threshold processing technique to define the density threshold of the grid, which improves clustering precision. It also uses information entropy to handle high-dimensional data sets, so it scales well with the dimensionality of the data. In comparative experiments against traditional clustering algorithms, it shows some advantage.
Source
Journal of Military Transportation University (《军事交通学院学报》)
2010, No. 3, pp. 77-80 (4 pages)
Keywords
grid similarity
data set
entropy
clustering algorithm