Abstract
Cluster analysis is an active research topic in data mining. Conventional clustering algorithms assign each object to exactly one cluster, so the boundaries between clusters are crisp. With the development of Web mining, algorithms that partition every object precisely face great challenges. Drawing on the ability of data field theory and classical rough set theory to handle imprecise and uncertain data, this paper proposes a new rough clustering algorithm based on data fields. The algorithm uses potential values as the basis for partitioning objects, avoiding the Euclidean-distance-based partitioning that conventional rough clustering algorithms have consistently used. It first partitions the data objects coarsely and then refines the partition iteratively until stable clusters form. In the experiments, the proposed algorithm was compared with the rough K-means and rough K-medoids algorithms; the results show that it achieves good clustering results on crossed datasets and converges quickly.
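The abstract does not give the algorithm's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian-like potential function of the kind commonly used in data field theory, unit mass for every object, and a Lingras-style ratio threshold for deciding whether an object belongs to a cluster's lower approximation or only to the upper approximations of several clusters. The function names and the parameters `sigma`, `ratio`, and `max_iter` are hypothetical choices made for illustration.

```python
import numpy as np

def cluster_potentials(points, labels, k, sigma=1.0):
    """Potential that each cluster's members generate at every point.

    Assumes a Gaussian-like field exp(-(d / sigma)^2) and unit mass per object.
    Returns an array of shape (n_points, k).
    """
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)          # pairwise squared distances
    field = np.exp(-sq_dist / sigma ** 2)
    np.fill_diagonal(field, 0.0)                  # a point does not act on itself
    membership = np.eye(k)[labels]                # one-hot labels, shape (n, k)
    return field @ membership

def rough_assign(potentials, ratio=0.8):
    """Split objects into lower and upper approximations by potential.

    An object is in the upper approximation of every cluster whose potential
    is at least `ratio` times its strongest potential; it is in a lower
    approximation only when exactly one cluster qualifies.
    """
    best = potentials.max(axis=1, keepdims=True)
    upper = potentials >= ratio * best
    lower = upper & (upper.sum(axis=1, keepdims=True) == 1)
    return lower, upper

def data_field_rough_clustering(points, init_labels, k,
                                sigma=1.0, ratio=0.8, max_iter=100):
    """Refine a coarse partition until the labels stop changing."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(init_labels, dtype=int)
    for _ in range(max_iter):
        potentials = cluster_potentials(points, labels, k, sigma)
        new_labels = potentials.argmax(axis=1)    # strongest field wins
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    potentials = cluster_potentials(points, labels, k, sigma)
    lower, upper = rough_assign(potentials, ratio)
    return labels, lower, upper
```

In this sketch the coarse initial partition is supplied by the caller, and an object lying between two clusters ends up in both upper approximations rather than being forced into one cluster. The Gaussian kernel still depends on pairwise distance, so how the published algorithm avoids distance-based partitioning is not reproduced here.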
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2009, No. 2, pp. 203-206, 244 (5 pages)
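Funding
National Natural Science Foundation of China (Grant Nos. 60475019, 60775036)
2006 Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20060247039)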