Abstract
In the rough K-means algorithm, the weight coefficients assigned to the lower approximation and the boundary region strongly influence the clustering result. The traditional rough K-means algorithm and many of its improved variants apply the same fixed weights to the lower approximation and boundary region of every cluster, ignoring differences in how data objects are distributed within each cluster. To address this problem, a new rough K-means algorithm with self-adaptive weights based on spatial distance is proposed, which takes into account the spatial distribution of the objects in the lower approximation and boundary region relative to the cluster center. In each iteration, the algorithm uses the average distance from the objects in each cluster's lower approximation and boundary region to the cluster center to measure how much each region should contribute to the center update, and dynamically computes the relative weight coefficients of the two regions. A worked example and simulation experiments verify the effectiveness of the proposed algorithm.
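The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that each region's weight is inversely related to its average distance from the current cluster center, and that the center is then updated as the weighted combination of the two region means, as in the standard rough K-means update. The function names (`mean_point`, `mean_distance`, `adaptive_weights`, `update_center`) and the inverse-distance rule are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def mean_distance(points, center):
    """Average Euclidean distance from the points to the center."""
    return sum(dist(p, center) for p in points) / len(points)


def adaptive_weights(lower, boundary, center):
    """Return (w_lower, w_boundary), summing to 1.

    ASSUMPTION (not from the paper): the region whose objects lie closer
    to the center on average receives the larger weight.
    """
    d_lower = mean_distance(lower, center)
    d_boundary = mean_distance(boundary, center)
    total = d_lower + d_boundary
    if total == 0:
        # Every object coincides with the center; fall back to equal weights.
        return 0.5, 0.5
    return d_boundary / total, d_lower / total


def update_center(lower, boundary, center):
    """One center update for a single cluster with per-cluster weights."""
    if not lower and not boundary:
        return center  # empty cluster: leave the center unchanged
    if not boundary:
        return mean_point(lower)
    if not lower:
        return mean_point(boundary)
    w_lower, w_boundary = adaptive_weights(lower, boundary, center)
    m_lower = mean_point(lower)
    m_boundary = mean_point(boundary)
    return tuple(w_lower * a + w_boundary * b
                 for a, b in zip(m_lower, m_boundary))
```

Because the weights are recomputed from the current center for each cluster, a cluster with a compact lower approximation and a distant boundary region leans more heavily on its lower approximation than a fixed global weight would allow. Calling `update_center` once per cluster in each iteration reproduces the "dynamic" behaviour the abstract describes, under the stated assumptions.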
Authors
王慧研
张腾飞
马福民
WANG Hui-yan; ZHANG Teng-fei; MA Fu-min (College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China)
Source
《计算机科学》
CSCD
PKU Core Journal (北大核心)
2018, No. 7, pp. 190-196 (7 pages)
Computer Science
Funding
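Supported by:
National Natural Science Foundation of China (61403184)
Major Project of Natural Science Research of Jiangsu Higher Education Institutions (17KJA120001)
Jiangsu Province "Qing Lan Project" Fund (QL2016)
Nanjing University of Posts and Telecommunications "1311 Talent Program" Fund (NY2013)
Nanjing University of Posts and Telecommunications Scientific Research Project Fund (NY215149)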