Abstract
A single clustering method can give unstable results. To overcome this, we propose a new clustering ensemble method based on Dempster-Shafer theory (also known as evidence theory or belief function theory). In general, clustering ensemble methods consist of two principal steps: generating a set of base partitions, and combining them into a final clustering; our method focuses on the second step. After the base partitions are obtained in the first step, we convert them into an intermediate representation, which we call a relational representation. Within the belief function framework, we regard these relational representations as not fully reliable sources of evidence, so we first preprocess them with the discounting operation and then fuse them using different combination rules. From the fused relational representation we extract a belief matrix or a plausibility matrix, which is treated as a co-association matrix between objects. To make full use of transitivity between objects, we treat this co-association matrix as a fuzzy relation and compute its transitive closure, which yields a fuzzy equivalence relation. This fuzzy equivalence relation is then used as new similarity data, and the final partition is obtained by applying a clustering algorithm that can handle similarity data. Experimental results demonstrate the stability and effectiveness of the proposed method.
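The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: each pair of objects gets a mass function on a two-hypothesis frame ("same cluster", "different clusters"), Dempster's rule stands in for the "different combination rules", the discount rate `alpha` is a free parameter, and a simple lambda-cut of the fuzzy equivalence relation stands in for the final similarity-based clustering step. All function names (`pair_mass`, `dempster`, `belief_matrix`, `transitive_closure`, `lambda_cut`) are hypothetical.

```python
from itertools import combinations

def pair_mass(labels, i, j, alpha):
    """Discounted mass (m_same, m_diff, m_unknown) that one base partition gives pair (i, j)."""
    if labels[i] == labels[j]:
        return (1.0 - alpha, 0.0, alpha)
    return (0.0, 1.0 - alpha, alpha)

def dempster(m1, m2):
    """Dempster's rule on the frame {same, diff}; the third entry is the mass on the whole frame."""
    s1, d1, u1 = m1
    s2, d2, u2 = m2
    k = 1.0 - (s1 * d2 + d1 * s2)          # 1 minus the conflicting mass
    if k <= 0.0:
        raise ValueError("total conflict: use a discount rate alpha > 0")
    return ((s1 * s2 + s1 * u2 + u1 * s2) / k,
            (d1 * d2 + d1 * u2 + u1 * d2) / k,
            (u1 * u2) / k)

def belief_matrix(partitions, alpha=0.2):
    """Fuse all base partitions pairwise; entry (i, j) is the belief that i and j co-cluster."""
    n = len(partitions[0])
    bel = [[1.0] * n for _ in range(n)]    # diagonal stays 1 (reflexive relation)
    for i, j in combinations(range(n), 2):
        m = (0.0, 0.0, 1.0)                # vacuous mass: total ignorance
        for labels in partitions:
            m = dempster(m, pair_mass(labels, i, j, alpha))
        bel[i][j] = bel[j][i] = m[0]
    return bel

def transitive_closure(r):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    n = len(r)
    while True:
        comp = [[max(min(r[i][k], r[k][j]) for k in range(n)) for j in range(n)]
                for i in range(n)]
        nxt = [[max(a, b) for a, b in zip(row_r, row_c)] for row_r, row_c in zip(r, comp)]
        if nxt == r:                       # fixed point reached
            return r
        r = nxt

def lambda_cut(r, lam):
    """Crisp partition from a fuzzy equivalence relation: i ~ j iff r[i][j] >= lam."""
    labels = [-1] * len(r)
    next_label = 0
    for i in range(len(r)):
        if labels[i] == -1:
            for j in range(len(r)):
                if r[i][j] >= lam:
                    labels[j] = next_label
            next_label += 1
    return labels

# Three hypothetical base partitions of five objects; they disagree on object 2.
partitions = [[0, 0, 0, 1, 1], [0, 0, 1, 1, 1], [0, 0, 0, 1, 1]]
bel = belief_matrix(partitions, alpha=0.2)
closed = transitive_closure(bel)
labels = lambda_cut(closed, 0.5)
print(labels)  # prints [0, 0, 0, 1, 1]
```

In this toy run, object 2 ends up with objects 0 and 1 because two of the three discounted sources support that grouping. A plausibility matrix could be obtained in the same way by returning `m[0] + m[2]` instead of `m[0]` in `belief_matrix`.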
Authors
LI Feng
LI Shoumei
Thierry Denoeux
LI Feng; LI Shoumei; Denoeux Thierry (College of Applied Sciences, Beijing University of Technology, Beijing 100124; Centre National de la Recherche Scientifique, Sorbonne Universités, Université de Technologie de Compiègne, Heudiasyc (UMR 7253), France)
Source
Journal of Nanjing University of Information Science & Technology (Natural Science Edition)
CAS
2019, No. 3, pp. 332-339 (8 pages)
Funding
National Natural Science Foundation of China (11571024)
2018 Beijing University of Technology Graduate Overseas Training Program
Keywords
evidence theory (belief function)
clustering ensemble
relational representation
co-association matrix
transitive closure