Concept drift detection based on distance measurement of overlapped data windows (cited by: 5)
Abstract To address false detections and detection delays in concept drift detection on data streams, this paper proposes an online concept drift detection method based on distance measurement over overlapped data windows. The data stream is divided into equal-size, overlapping data windows, and the heterogeneous Euclidean distance between adjacent overlapping windows is computed. The nearest-neighbor principle is then used to judge the degree of inconsistency among the samples in the windows, which yields both an evaluation of the difference between distributions and the detection of drift. To evaluate the method's effectiveness, experiments were run on public data sets with different drift severities and drift speeds. The experimental results show that the method detects different types of concept drift quickly and accurately and can identify the specific locations where concept drift occurs.
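Source Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 2, pp. 542-545, 549 (5 pages)
Funding National Natural Science Foundation of China (60835004); Scientific Research Project of the Hunan Provincial Department of Education (10B109); Hunan Provincial Key Discipline Construction Project
Keywords concept drift; data stream; heterogeneous Euclidean distance; overlapped data window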
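
The abstract names the building blocks of the method (equal-size overlapping windows, a heterogeneous Euclidean distance, a nearest-neighbor comparison) but does not give the formulas. The following Python sketch shows one way these pieces could fit together. It is not the authors' algorithm: it assumes the heterogeneous Euclidean-overlap metric (range-normalized difference for numeric attributes, 0/1 mismatch for categorical ones), and the window-level score (mean nearest-neighbor distance), the `threshold` parameter, and all function names are hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Union

Value = Union[float, str]
Sample = Sequence[Value]


def overlapped_windows(stream: Sequence[Sample], size: int, step: int) -> List[Sequence[Sample]]:
    """Split a stream into equal-size windows; neighbors share size - step samples."""
    if not 0 < step <= size:
        raise ValueError("step must satisfy 0 < step <= size")
    return [stream[i:i + size] for i in range(0, len(stream) - size + 1, step)]


def heom(x: Sample, y: Sample, ranges: Sequence[Optional[float]]) -> float:
    """Heterogeneous Euclidean-overlap distance between two samples (assumed metric).

    ranges[k] is the value range (max - min) of numeric attribute k,
    or None when attribute k is categorical.
    """
    total = 0.0
    for a, b, r in zip(x, y, ranges):
        if r is None:
            d = 0.0 if a == b else 1.0          # categorical: overlap (0/1) distance
        else:
            d = abs(a - b) / r if r > 0 else 0.0  # numeric: range-normalized difference
        total += d * d
    return total ** 0.5


def window_distance(prev: Sequence[Sample], curr: Sequence[Sample],
                    ranges: Sequence[Optional[float]]) -> float:
    """Mean distance from each sample in curr to its nearest neighbor in prev.

    Hypothetical aggregation: samples shared by both windows contribute 0,
    so the score grows only when the new samples look unlike the old window.
    """
    return sum(min(heom(x, y, ranges) for y in prev) for x in curr) / len(curr)


def detect_drift(stream: Sequence[Sample], size: int, step: int,
                 ranges: Sequence[Optional[float]], threshold: float) -> List[int]:
    """Return the start index of every window whose score exceeds the threshold."""
    windows = overlapped_windows(stream, size, step)
    flagged = []
    for i in range(1, len(windows)):
        if window_distance(windows[i - 1], windows[i], ranges) > threshold:
            flagged.append(i * step)
    return flagged
```

In this sketch a flagged index marks the start of the first window that contains samples unlike the preceding window, so the drift point lies somewhere inside that window. A smaller `step` narrows the location at the cost of more window comparisons.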