Clustering algorithm based on set dissimilarity for high-dimensional data with categorical attributes
Abstract: A clustering algorithm based on set dissimilarity is proposed. By defining a set dissimilarity measure and a reduced representation of a set, the algorithm computes the overall dissimilarity of all objects in a set directly, without calculating the distance between each pair of objects. It compresses high-dimensional categorical data heavily without loss of computational accuracy and obtains the clustering result in a single scan of the data. The time complexity of the algorithm is close to linear. An example on real data shows that the algorithm is effective.
Source: Journal of University of Science and Technology Beijing (《北京科技大学学报》), 2010, No. 8, pp. 1085-1089 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Natural Science Foundation of China (No. 70771007).
Keywords: clustering; high-dimensional space; sets; dissimilarity; data mining
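
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea it describes, not the authors' method. It assumes a hypothetical set dissimilarity (the fraction of attributes on which the objects of a set do not all share one value) and a hypothetical reduced representation (the set size plus, per attribute, the shared value or `None`). The names `SetSummary`, `single_pass_cluster`, and `threshold` are illustrative assumptions. Under these assumptions, a set's dissimilarity can be updated from its summary alone, with no pairwise distances, and each object is read once.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class SetSummary:
    """Hypothetical compact summary of a set of categorical objects."""
    size: int
    # Per attribute: the value shared by every object, or None if they disagree.
    common: List[Optional[str]]

    @classmethod
    def of(cls, obj: Sequence[str]) -> "SetSummary":
        return cls(1, list(obj))

    def with_object(self, obj: Sequence[str]) -> "SetSummary":
        # An attribute stays "shared" only if the new object carries the same value.
        merged = [c if c == v else None for c, v in zip(self.common, obj)]
        return SetSummary(self.size + 1, merged)

    def dissimilarity(self) -> float:
        # Assumed measure: fraction of attributes on which the set disagrees.
        disagree = sum(1 for c in self.common if c is None)
        return disagree / len(self.common)


def single_pass_cluster(
    objects: Sequence[Sequence[str]], threshold: float
) -> Tuple[List[int], List[SetSummary]]:
    """Assign each object, in one scan, to the cluster whose summary stays
    least dissimilar after adding it; open a new cluster if none stays
    within the (assumed) threshold."""
    summaries: List[SetSummary] = []
    labels: List[int] = []
    for obj in objects:
        best_idx, best = None, None
        for i, s in enumerate(summaries):
            cand = s.with_object(obj)
            d = cand.dissimilarity()
            if d <= threshold and (best is None or d < best.dissimilarity()):
                best_idx, best = i, cand
        if best is None:
            summaries.append(SetSummary.of(obj))
            labels.append(len(summaries) - 1)
        else:
            summaries[best_idx] = best
            labels.append(best_idx)
    return labels, summaries


if __name__ == "__main__":
    data = [("a", "x", "p"), ("a", "x", "q"), ("b", "y", "p"),
            ("b", "y", "p"), ("a", "x", "p")]
    labels, _ = single_pass_cluster(data, threshold=0.4)
    print(labels)  # [0, 0, 1, 1, 0]
```

In this sketch each object is compared against every existing summary, so the cost is proportional to (number of objects) × (number of clusters) × (number of attributes). That is close to linear in the number of objects only while the number of clusters stays small, which is an assumption of the sketch rather than a result taken from the paper.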