
Clustering algorithm of categorical data in consideration of sorting by weight
Cited by: 2
Abstract: Some clustering algorithms are sensitive to the order in which data are input. To address this problem, a non-interference sequence index is defined, and a method is proposed that uses this index to sort categorical data by weight. Based on this method, CABOSFV_C, an efficient clustering algorithm for categorical data that is affected by the data input order, is improved to give a clustering algorithm that considers sorting by weight (CABOSFV_CSW), which eliminates the algorithm's sensitivity to the data input order. Experiments on UCI benchmark data sets show that, when processing categorical data, CABOSFV_CSW with weighted ascending sorting achieves better accuracy and markedly better stability in clustering quality than the original CABOSFV_C algorithm and other algorithms that are affected by the data input order.
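The idea in the abstract, fixing the input order with a deterministic sort before running an order-sensitive clustering pass, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `one_pass_cluster` is a generic leader-style algorithm standing in for CABOSFV_C, and `placeholder_weight` is a hypothetical value-frequency score, not the non-interference sequence index, which this record does not define.

```python
from itertools import permutations


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def one_pass_cluster(records, threshold=1):
    """Generic leader-style clustering (a stand-in, NOT CABOSFV_C).

    Each record joins the first cluster whose leader differs from it on at
    most `threshold` attributes; otherwise it starts a new cluster. The
    result therefore depends on the order of `records`.
    """
    clusters = []  # list of (leader, members)
    for record in records:
        for leader, members in clusters:
            if hamming(leader, record) <= threshold:
                members.append(record)
                break
        else:
            clusters.append((record, [record]))
    return frozenset(frozenset(members) for _, members in clusters)


def placeholder_weight(record, records):
    """Hypothetical weight: how often the record's values occur in the data.

    This is only a placeholder for the paper's non-interference sequence
    index. It depends on the set of records, not on their order.
    """
    return sum(
        sum(1 for other in records if other[i] == value)
        for i, value in enumerate(record)
    )


def sorted_cluster(records, threshold=1):
    """Sort ascending by weight (ties broken by the record itself), then cluster."""
    ordered = sorted(records, key=lambda r: (placeholder_weight(r, records), r))
    return one_pass_cluster(ordered, threshold)


if __name__ == "__main__":
    data = [("x", "x", "x"), ("x", "x", "y"), ("x", "y", "y")]
    raw = {one_pass_cluster(list(p)) for p in permutations(data)}
    fixed = {sorted_cluster(list(p)) for p in permutations(data)}
    print("distinct results without sorting:", len(raw))
    print("distinct results with sorting:", len(fixed))
```

Because the sort key depends only on the contents of the data set, every permutation of the input reaches the clustering pass in the same order. The sketch therefore shows only how a pre-sort removes order sensitivity; it says nothing about the accuracy gains reported in the abstract.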
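Source: Journal of University of Science and Technology Beijing (《北京科技大学学报》), 2013, No. 8, pp. 1093-1098 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding: National Natural Science Foundation of China (71271027); Fundamental Research Funds for the Central Universities (FRF-TP-10-006B)
Keywords: data mining; clustering algorithm; sorting; categorical data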
