Discovering attribute features of a cluster for any clustering result based on an outlier detection technique (cited by: 1)
Abstract Understanding clustering results helps in evaluating clustering quality, adjusting the clustering process accordingly, and using the results more effectively. However, understanding clustering results remains an open problem. This paper proposes a method that applies outlier detection to the results of any clustering algorithm in order to discover the attribute features of each cluster, together with an outlier detection algorithm based on a dissimilarity ratio. Outlier analysis is performed on the attribute descriptions of all clusters to find the characteristic attributes of each cluster, which makes the clustering result interpretable. The method applies to the results of any clustering algorithm. Experiments analyzed the clustering results produced by the X-means, Frozen and DBScan algorithms on the UCI datasets iris, ZOO and Housing. The results show that the method was fairly successful in discovering the attribute features of clusters produced by the different algorithms, which helps in understanding the clustering results in depth.
Source Journal of Harbin Engineering University (indexed in EI, CAS, CSCD, Peking University Core Journals), 2009, No. 3, pp. 312-317 (6 pages)
Funding Research Fund for the Doctoral Program of Higher Education (20070217043); Harbin Engineering University Basic Research Fund (HEUFT05007)
Keywords clustering; attribute features cluster; cluster analysis; outlier detection
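
The abstract describes running outlier analysis over the attribute descriptions of all clusters to find each cluster's characteristic attributes. The following is a minimal Python sketch of that general idea only. The record does not define the paper's dissimilarity ratio, so the ratio used here (a cluster's mean distance to the other clusters on one attribute, divided by the mean pairwise distance among those other clusters) is an assumption made for illustration, as are the function names, the use of per-cluster attribute means as the "attribute description", and the threshold of 2.0.

```python
from statistics import mean

def cluster_profiles(rows, labels):
    """Describe each cluster by the mean of every attribute (assumed description)."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {lab: [mean(col) for col in zip(*grp)] for lab, grp in groups.items()}

def dissimilarity_ratio(value, others):
    """Hypothetical ratio, not the paper's definition: mean distance from
    `value` to `others`, divided by the mean pairwise distance among `others`."""
    to_others = mean(abs(value - o) for o in others)
    pairs = [abs(a - b) for i, a in enumerate(others) for b in others[i + 1:]]
    spread = mean(pairs) if pairs else 0.0
    if spread == 0.0:
        return float("inf") if to_others > 0 else 0.0
    return to_others / spread

def feature_attributes(rows, labels, threshold=2.0):
    """Return {cluster label: [attribute indices flagged as characteristic]}."""
    profiles = cluster_profiles(rows, labels)
    n_attrs = len(next(iter(profiles.values())))
    result = {lab: [] for lab in profiles}
    for j in range(n_attrs):
        for lab, prof in profiles.items():
            others = [p[j] for other, p in profiles.items() if other != lab]
            # Needs at least two other clusters for a meaningful spread.
            if len(others) >= 2 and dissimilarity_ratio(prof[j], others) > threshold:
                result[lab].append(j)
    return result
```

Because the sketch only consumes rows and cluster labels, it can be run on the output of any clustering algorithm, which mirrors the algorithm-independence claimed in the abstract.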