A Hierarchical Clustering Algorithm for Processing Mixed-Type Data (cited by: 1)

Dealing with mixed type data by hierarchical clustering algorithm
Abstract: This paper studies clustering methods for character-type data and mixed-type data. First, building on classical rough set theory, it relaxes the indiscernibility and tolerance conditions between objects to obtain an extended rough set model based on a concordance relation. It then defines several new concepts: the indiscernibility degree between two objects, the indiscernibility degree between two clusters, and the integrated approximation rate of a clustering result. On this basis it proposes a new hierarchical clustering algorithm for mixed data types. The algorithm handles not only numerical data but also the character-type and mixed-type data that most clustering algorithms cannot process. Experiments verify that the algorithm is feasible.
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 8, pp. 2885-2887 (3 pages)
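Funding: National Natural Science Foundation of China (60573068); Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ080510); Research Fund of Chongqing University of Posts and Telecommunications (A2004-46)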
Keywords: rough set; clustering; concordance relation; indiscernibility degree; integrated approximation rate
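
The pipeline named in the abstract (a relaxed indiscernibility measure between objects, lifted to clusters, then used to drive hierarchical merging) can be illustrated with a minimal sketch. This record does not give the paper's definitions, so everything here is an assumption made for illustration: `object_degree` counts the share of attributes on which two records agree (exact match for strings, within a hypothetical tolerance `tol` for numbers), `cluster_degree` averages that value over cross-cluster pairs, and merging stops at a fixed cluster count `k` rather than at the paper's integrated approximation rate.

```python
from itertools import combinations

def object_degree(a, b, tol=0.1):
    """Share of attributes on which two records cannot be told apart.
    Assumed rule: strings must be equal, numbers must differ by at most tol."""
    hits = 0
    for x, y in zip(a, b):
        if isinstance(x, str) or isinstance(y, str):
            hits += x == y
        else:
            hits += abs(x - y) <= tol
    return hits / len(a)

def cluster_degree(c1, c2, data, tol=0.1):
    """Average object_degree over all cross-cluster pairs of row indices."""
    total = sum(object_degree(data[i], data[j], tol) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(data, k, tol=0.1):
    """Repeatedly merge the two most indiscernible clusters until k remain."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        # Pick the pair of cluster positions with the highest degree.
        p, q = max(
            combinations(range(len(clusters)), 2),
            key=lambda pq: cluster_degree(clusters[pq[0]], clusters[pq[1]], data, tol),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so position p is unaffected
    return clusters

# Toy mixed-type records: two character attributes and one numeric attribute.
records = [
    ("red", "small", 1.00),
    ("red", "small", 1.05),
    ("blue", "large", 5.00),
    ("blue", "large", 5.02),
]
result = agglomerate(records, k=2)
print(result)  # [[0, 1], [2, 3]]
```

Because the degree is computed attribute by attribute, character and numeric columns are handled in the same loop without first encoding strings as numbers, which is the property the abstract emphasizes.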