
A Study of a Clustering-Based Outlier Detection Algorithm for High-Dimensional Data Streams (高维数据流的聚类离群点检测算法研究)

Cited by: 2
Abstract: Existing clustering-based outlier detection algorithms suffer from low efficiency and low precision when applied to high-dimensional data streams. To address this problem, a clustering-based outlier detection algorithm for high-dimensional data streams (CODHD-Stream) is proposed. The algorithm first partitions the data stream with a sliding window, then reduces the dimensionality of the high-dimensional data set with an attribute reduction algorithm. Next, it applies K-means clustering with a distance-based information entropy filtering mechanism to divide the data set into micro-clusters, and detects the outliers contained in those micro-clusters. Experimental results show that the proposed algorithm effectively improves both the efficiency and the accuracy of outlier detection in high-dimensional data streams.
Authors: Cheng Yan (程艳), Miao Yongchun (苗永春)
Source: Journal of Jiangxi Normal University (Natural Science Edition) (《江西师范大学学报(自然科学版)》), 2014, No. 5, pp. 449-453 (5 pages). Indexed in CAS and the Peking University Core Journals list (北大核心).
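Funding: National Social Science Fund of China, Education Youth Project "Research on Swarm-Intelligent Construction Methods for Educational Virtual Communities" (CCA110109); National Natural Science Foundation of China, Regional Science Fund (61262080).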
Keywords: high-dimensional data stream; sliding window; attribute reduction; K-means; micro-clustering; information entropy; outlier detection
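The pipeline outlined in the abstract (sliding window, dimensionality reduction, K-means micro-clusters, outlier flagging) can be illustrated with a minimal Python sketch. This is not the CODHD-Stream algorithm itself: the abstract does not specify the attribute reduction algorithm or the distance-based information entropy filter, so the sketch substitutes variance-based dimension selection and a median-distance cutoff plus a minimum cluster size. All function names and parameters (`keep`, `factor`, `min_size`) are hypothetical.

```python
import math
import random
import statistics
from typing import Iterable, Iterator, List, Sequence, Tuple

Point = List[float]


def sliding_windows(stream: Iterable[Sequence[float]], size: int, step: int) -> Iterator[List[Point]]:
    """Yield windows of `size` points, sliding forward by `step` points."""
    window: List[Point] = []
    for point in stream:
        window.append(list(point))
        if len(window) == size:
            yield list(window)
            window = window[step:]


def top_variance_dims(window: List[Point], keep: int) -> List[int]:
    """Assumed stand-in for attribute reduction: keep the highest-variance dimensions."""
    variances = [statistics.pvariance(column) for column in zip(*window)]
    ranked = sorted(range(len(variances)), key=lambda d: variances[d], reverse=True)
    return sorted(ranked[:keep])


def kmeans(points: List[Point], k: int, iters: int = 20, seed: int = 0) -> Tuple[List[Point], List[int]]:
    """Plain Lloyd's K-means; returns (centroids, cluster label per point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]

    def nearest(p: Point) -> int:
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p) for p in points]


def detect_outliers(window: List[Point], keep: int = 2, k: int = 2,
                    factor: float = 3.0, min_size: int = 3) -> List[int]:
    """Return indices (within the window) of points flagged as outliers.

    Assumed rule, not the paper's entropy filter: a point is flagged if its
    cluster has fewer than `min_size` members, or if its distance to its own
    centroid exceeds `factor` times the median such distance in the window.
    """
    dims = top_variance_dims(window, keep)
    reduced = [[p[d] for d in dims] for p in window]
    centroids, labels = kmeans(reduced, k)
    dists = [math.dist(p, centroids[label]) for p, label in zip(reduced, labels)]
    sizes = [labels.count(c) for c in range(k)]
    cutoff = factor * statistics.median(dists)
    return [i for i, (d, label) in enumerate(zip(dists, labels))
            if sizes[label] < min_size or d > cutoff]


if __name__ == "__main__":
    # Hypothetical stream: 60 four-dimensional points with one injected anomaly.
    stream = [[(i % 5) * 0.1, (i % 3) * 0.1, 1.0, 1.0] for i in range(60)]
    stream[17] = [50.0, 50.0, 1.0, 1.0]
    for number, window in enumerate(sliding_windows(stream, size=30, step=15)):
        print(number, detect_outliers(window))
```

Treating very small clusters as outliers covers the case where K-means isolates an anomalous point in a cluster of its own, where its distance to the centroid is zero and a distance cutoff alone would miss it.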