
Natural Clustering Algorithm Based on Attributes Dependency and Objects Correlation (基于属性依赖关系和对象相关性的自然聚类算法)
Abstract: Data sets often exhibit dependencies among attributes and correlations among objects. To account for both, this paper defines a new similarity relation model whose similarity relation reflects the natural correlation between objects. On this basis, it proposes a natural clustering algorithm based on attribute dependency and object correlation. Without requiring the number of clusters to be specified in advance, the algorithm groups all objects whose similarity reaches a set threshold into the same cluster; adjusting the similarity threshold yields clusterings at different granularities. Comparative experiments on numeric and categorical data sets indicate that, relative to other clustering algorithms, the proposed algorithm more faithfully reflects the correlations in the data and the natural cluster structure of the data set, can discover clusters of arbitrary shape, and improves clustering accuracy and quality.
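Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 4, pp. 810-814 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (61272109)
Keywords: attribute dependency; object correlation; similarity; objective function; natural clustering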
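
The threshold-based grouping described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's similarity relation model or its objective function, so everything here is an assumption: `placeholder_similarity` is a hypothetical stand-in (inverse Euclidean distance), and "objects whose similarity reaches the threshold fall into one cluster" is read as taking connected components of the graph that links every sufficiently similar pair. It is not the authors' algorithm, only a way to see how a threshold can replace a preset cluster count.

```python
from itertools import combinations
from math import dist


def placeholder_similarity(a, b):
    """Hypothetical stand-in similarity in (0, 1]; NOT the paper's model."""
    return 1.0 / (1.0 + dist(a, b))


def threshold_clusters(objects, threshold, similarity=placeholder_similarity):
    """Return clusters as sorted lists of object indices.

    Assumed reading of the abstract: link every pair whose similarity is
    at least `threshold`, then take connected components (union-find).
    """
    parent = list(range(len(objects)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(objects)), 2):
        if similarity(objects[i], objects[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two components

    clusters = {}
    for i in range(len(objects)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Illustrative data only (not from the paper): two well-separated groups.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.5, 5.0)]
coarse = threshold_clusters(points, threshold=0.6)
fine = threshold_clusters(points, threshold=0.7)
```

With these illustrative points, the lower threshold chains neighbouring points into two clusters, while the higher threshold leaves every point on its own, which mirrors the abstract's remark that adjusting the threshold changes the clustering granularity. Chaining through intermediate neighbours is also what would let such a scheme follow non-convex shapes.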
