Detecting algorithm of concept drift from stream data (cited by 16)
Abstract: Stream data are real-time, continuous, ordered, and unbounded, so a stream divided into consecutive time segments can be examined with approximate methods. On this basis, using samples from the target distribution together with samples from similar distributions, the Tr-OEM algorithm is proposed to detect concept drift in stream data. The algorithm dynamically judges whether concept drift has occurred, adaptively optimizes the drift detection value by automatically deciding whether to optimize or rebuild the classifiers, and applies to different types of stream data. Analysis and simulation experiments indicate that the algorithm adapts well when handling concept drift in stream data.
Authors: Zhang Jie (张杰), Zhao Feng (赵峰)
Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 1, pp. 29-35 (7 pages)
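Funding: China Postdoctoral Science Foundation (20100481284); National Statistical Science Research Program, Key Project (2011LZ048); Shandong Province Outstanding Young and Middle-aged Scientists Research Award Fund (BS2012SF024)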
Keywords: stream data; concept drift; detection; data mining
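
The abstract describes examining a stream as consecutive time segments and judging dynamically whether drift has occurred. The record does not give the details of Tr-OEM, so the following is only a minimal sketch of the general segment-comparison idea, not the authors' algorithm. It assumes categorical labels, a fixed segment length, and a fixed threshold on the total variation distance between adjacent segments; the names `class_distribution`, `total_variation`, `detect_drift`, `window`, and `threshold` are all hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def class_distribution(segment: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Empirical frequency of each label within one time segment."""
    counts = Counter(segment)
    total = len(segment)
    return {label: n / total for label, n in counts.items()}


def total_variation(p: Dict[Hashable, float], q: Dict[Hashable, float]) -> float:
    """Total variation distance between two discrete distributions."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)


def detect_drift(stream: Sequence[Hashable], window: int, threshold: float) -> List[int]:
    """Return the start index of every segment whose label distribution
    differs from the preceding segment by more than `threshold`.

    Assumption: a fixed window and a fixed threshold stand in for the
    adaptive detection value mentioned in the abstract.
    """
    # Split the stream into consecutive, non-overlapping full segments.
    segments = [stream[i:i + window]
                for i in range(0, len(stream) - window + 1, window)]
    drift_points = []
    for k in range(1, len(segments)):
        prev = class_distribution(segments[k - 1])
        curr = class_distribution(segments[k])
        if total_variation(prev, curr) > threshold:
            drift_points.append(k * window)
    return drift_points


# Toy stream: the label distribution changes abruptly at index 40.
stream = ["a"] * 40 + ["b"] * 40
drift_points = detect_drift(stream, window=20, threshold=0.5)
```

A fixed threshold keeps the sketch short; an adaptive scheme such as the one the abstract describes would adjust the detection value as the stream evolves.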

References (14)

  • 1. Domingos P, Hulten G. Mining high-speed data streams[C]. Proc of ACM SIGKDD Int Conf Knowledge Discovery in Databases. Boston: ACM Press, 2000: 71-80.
  • 2. Hulten G, Spencer L, Domingos P. Mining time-changing data streams[C]. Proc of ACM SIGKDD Int Conf Knowledge Discovery in Databases. San Francisco: ACM Press, 2001: 97-106.
  • 3. Wang H, Fan W, Yu P, et al. Mining concept drifting data streams using ensemble classifiers[C]. The 9th ACM Int Conf on Knowledge Discovery and Data Mining. Washington: ACM Press, 2003: 226-235.
  • 4. Mitchell T. Machine learning[M]. McGraw Hill, 1997: 123-126.
  • 5. Kolter J Z, Maloof M A. Dynamic weighted majority: An ensemble method for drifting concepts[J]. J of Machine Learning Research, 2007, 8(8): 2755-2790.
  • 6. Chen Zhaoyang, Huang Shangteng. Research on concept drift in stream data classification[J]. Computer Applications and Software, 2009, 26(2): 254-256. (cited by 12)
  • 7. Li C Q, Ling T W, Hu M. Efficient processing of updates in dynamic XML data[C]. Proc of the 22nd Int Conf on Data Engineering. Washington DC: IEEE Computer Society, 2006: 13-22.
  • 8. Li C Q, Ling T W, Hu M. Efficient updates in dynamic XML data: From binary string to quaternary string[J]. The Very Large Data Bases J, 2008, 17(3): 573-601.
  • 9. Li Min, Wang Yong, Cai Lijun. Incremental feature selection algorithm for data stream classification[J]. Journal of Computer Applications, 2010, 30(9): 2321-2323. (cited by 5)
  • 10. Saerens M, Latinne P, Decaestecker C. Adjusting the outputs of a classifier to new a priori probabilities: A simple procedure[J]. Neural Computation, 2002, 14(1): 21-41.
