Abstract
In retailer cluster analysis, the traditional k-center (k-medoids) clustering method is too computationally expensive to be applied effectively to large data sets. A retailer clustering method is proposed that inherits the iterative approach of the CLARANS algorithm and uses global random sampling so that it can be applied to large spatial data sets, seeking the best clustering result attainable over multiple iterations. Clustering results are evaluated by the total distance based on the Shortest Arterial Road Distance (SARD). The algorithm improves on CLARANS so that it can process data objects that carry geographic information, and its clustering results satisfy the demand constraints.
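The abstract does not give the details of the improved method, so the following is only a minimal sketch of plain CLARANS, the algorithm the method builds on. It is not the authors' implementation. Euclidean distance stands in for SARD, which the abstract does not define, and the demand constraints are omitted. The parameter names `num_local` and `max_neighbor` follow the usual CLARANS description; the function names and sample points are hypothetical.

```python
import math
import random


def total_cost(points, medoids, dist):
    """Total distance: each point's distance to its nearest medoid, summed."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def clarans(points, k, dist, num_local=4, max_neighbor=50, seed=0):
    """Plain CLARANS: randomized search over sets of k medoid indices.

    Requires k < len(points). `dist` is any pairwise distance function;
    a road-network distance such as SARD could be plugged in here.
    """
    rng = random.Random(seed)
    n = len(points)
    best_medoids, best_cost = None, math.inf
    for _ in range(num_local):  # independent random restarts
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current, dist)
        tried = 0
        while tried < max_neighbor:
            # A neighbor differs from the current set in exactly one medoid.
            slot = rng.randrange(k)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current.copy()
            neighbor[slot] = candidate
            neighbor_cost = total_cost(points, neighbor, dist)
            if neighbor_cost < current_cost:
                # Move to the better neighbor and reset the failure counter.
                current, current_cost = neighbor, neighbor_cost
                tried = 0
            else:
                tried += 1
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost
    return best_medoids, best_cost


def euclidean(a, b):
    """Stand-in distance (assumption); the paper evaluates with SARD."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical sample data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, cost = clarans(points, k=2, dist=euclidean)
print(sorted(medoids), cost)
```

Because the cost function takes `dist` as an argument, swapping in a road-network distance only requires replacing `euclidean`; enforcing demand constraints would need an additional feasibility check on each candidate neighbor, which this sketch does not attempt.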
Source
《内蒙古大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals
2012, No. 3, pp. 306-312 (7 pages)
Journal of Inner Mongolia University: Natural Science Edition
Funding
National Natural Science Foundation of China (71172168)
Keywords
clustering algorithm
shortest arterial road distance (SARD)
dissimilarity (variability)