Abstract
To make fuller use of the spatial and knowledge information hidden in sample vectors, class means are replaced by clustering points, and the quantified contribution of each index (feature) to the clustering is defined as its classification weight. The classification weights define a weighted distance between a sample point and a clustering point, which is a more reasonable measure of similarity between a sample and a class; this weighted distance is then converted into sample membership. To remove the randomness introduced by sequential algorithms, the current K clustering points are corrected using the mass centers of the sample point set, with each sample's K class memberships serving as its point masses. This yields an iterative algorithm, driven by classification weights and mass centers, that searches for the clustering points. Tests on the IRIS data set show that the new algorithm achieves better clustering quality and stability than existing unsupervised learning methods.
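The iteration described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the paper's method: the abstract does not give the formulas, so the feature-weight rule (total scatter divided by membership-weighted within-cluster scatter), the inverse-distance membership, and the names `cluster_with_feature_weights` and `init_points` are all assumptions made here for illustration.

```python
import numpy as np

def cluster_with_feature_weights(X, init_points, n_iter=100, tol=1e-8, eps=1e-12):
    """Illustrative sketch; the weight and membership formulas are assumptions."""
    X = np.asarray(X, dtype=float)                        # samples, shape (n, d)
    points = np.asarray(init_points, dtype=float).copy()  # clustering points, (K, d)
    n, d = X.shape
    w = np.full(d, 1.0 / d)                               # start from uniform weights
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)       # per-feature total scatter

    for _ in range(n_iter):
        # Per-feature squared differences to every clustering point: (n, K, d)
        sq = (X[:, None, :] - points[None, :, :]) ** 2
        # Weighted squared distance between each sample and each point: (n, K)
        dist = sq @ w
        # ASSUMPTION: membership is inverse distance, normalised over the K classes
        inv = 1.0 / (dist + eps)
        u = inv / inv.sum(axis=1, keepdims=True)
        # ASSUMPTION: a feature gets a larger weight when its membership-weighted
        # within-cluster scatter is small relative to its total scatter
        within = (u[:, :, None] * sq).sum(axis=(0, 1))
        score = (total + eps) / (within + eps)
        w = score / score.sum()
        # Mass-center update: memberships act as the point masses of the samples
        new_points = (u.T @ X) / u.sum(axis=0)[:, None]
        shift = np.abs(new_points - points).max()
        points = new_points
        if shift < tol:
            break
    # u holds the memberships computed just before the final point update
    return points, w, u
```

Passing the initial clustering points explicitly keeps the sketch deterministic; how the paper initialises them is not stated in the abstract. A hard assignment can be read off with `u.argmax(axis=1)`.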
Source
《自动化学报》
EI
CSCD
PKU Core Journals (北大核心)
2009, Issue 5, pp. 526-531 (6 pages)
Acta Automatica Sinica
Funding
Supported by the National Natural Science Foundation of China (60474019)
and the Natural Science Foundation of Hebei Province (F2005000482)
Keywords
Unsupervised data
Clustering based on clustering points
Classification weight
Weighted distance
Mass center