Abstract
This paper introduces the idea of order-relation preservation: the inter-cluster distance measure used in hierarchical clustering should preserve, as far as possible, the original ordering of the distances between sample points. Based on this idea, the order relation of sample pairs and a loss measure for that order relation are defined, and the loss measure is shown to be usable both as the objective criterion function for clustering and as a standard for evaluating the quality of the resulting clusters. Two inter-cluster distance measures are then derived from the order-relation loss, namely the inter-cluster adjusted distance and the inter-cluster 0-1 weighted distance, and an order-preserving based hierarchical clustering algorithm (OPHCLUS) is implemented with them. Simulation experiments demonstrate that OPHCLUS improves clustering quality.
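To make the notion of an order-relation loss concrete, the following is a minimal sketch under stated assumptions, not the definition used in the paper (which the abstract does not give). It assumes Euclidean distances, a flat cluster labelling, and a hypothetical 0/1 induced distance (0 for two samples in the same cluster, 1 otherwise). The loss counts how many pairs of sample pairs are strictly ordered one way by the original distances and strictly the other way by the induced distances. The function name `order_relation_loss` and the toy data are illustrative only.

```python
from itertools import combinations


def order_relation_loss(points, labels):
    """Count pairs of sample pairs whose distance order a clustering inverts.

    Assumption (not from the paper): the clustering induces a 0/1 distance,
    0 for samples in the same cluster and 1 for samples in different clusters.
    """
    def dist(a, b):
        # Euclidean distance between two coordinate tuples
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    pairs = list(combinations(range(len(points)), 2))
    original = {p: dist(points[p[0]], points[p[1]]) for p in pairs}
    induced = {p: 0 if labels[p[0]] == labels[p[1]] else 1 for p in pairs}

    loss = 0
    for p, q in combinations(pairs, 2):
        # An inversion: the two orderings are both strict and disagree.
        if (original[p] - original[q]) * (induced[p] - induced[q]) < 0:
            loss += 1
    return loss


# Toy data: two well-separated groups on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = order_relation_loss(points, [0, 0, 1, 1])  # groups nearby samples
bad = order_relation_loss(points, [0, 1, 0, 1])   # splits nearby samples
print(good, bad)
```

On this toy data the labelling that groups nearby samples incurs no inversions, while the labelling that splits them incurs six, which is the sense in which such a loss could serve as a clustering quality criterion.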
Source
Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》)
CAS
Peking University Core Journal (北大核心)
2010, No. 5, pp. 48-55 (8 pages)
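Funding
National High Technology Research and Development Program of China (863 Program), grant 2006AA12Z217
Science and Technology Fund of China University of Mining and Technology, grant OD080313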
Keywords
hierarchical clustering algorithm
maintenance of order relation
inter-cluster adjusted distance
inter-cluster 0-1 weighted distance