
Classification Algorithm for Concept-Drifting Data Stream Based on Subspace Integration
Abstract: The classification of concept-drifting data streams with complex category structures has become one of the most active topics in data mining. This paper proposes a novel subspace classification algorithm and arranges it hierarchically into an ensemble classifier for concept-drifting data streams. The stream is first divided into data blocks. On each block, the subspace classification algorithm trains several bottom-level classifiers, which together form one base classifier of the ensemble. Meanwhile, parameter estimation from mathematical statistics is used to detect concept drift and adjust the model dynamically. Experimental results show that the proposed subspace ensemble algorithm not only improves classification accuracy on data streams with complex category structures but also adapts quickly to concept drift.
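Authors: Li Nan (李南), Guo Gongde (郭躬德)
Source: Computer Systems & Applications (《计算机系统应用》), 2011, No. 12, pp. 240-248 (9 pages)
Funding: Fujian Province Major Project for University-Industry Cooperation (2010H6007); Ministry of Education Fund for Returned Overseas Scholars (教外司留[2008]890号)
Keywords: concept drift; data stream; subspace; classification; integration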
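
The abstract describes a two-level structure (bottom-level subspace classifiers grouped into one base classifier per data block, base classifiers grouped into an ensemble) plus drift detection by parameter estimation. The Python sketch below illustrates that structure only. The abstract does not specify the subspace classification algorithm or the estimator, so everything concrete here is an assumption: a nearest-centroid learner on a random feature subset stands in for the subspace classifier, and a normal-approximation upper confidence bound on a smoothed error rate stands in for the parameter-estimation test. All class and function names are hypothetical.

```python
import math
import random
from collections import Counter


class SubspaceCentroid:
    """Bottom-level classifier (assumed): nearest class centroid on a feature subset."""

    def __init__(self, dims):
        self.dims = dims
        self.centroids = {}

    def fit(self, chunk):
        sums, counts = {}, Counter()
        for x, y in chunk:
            acc = sums.setdefault(y, [0.0] * len(self.dims))
            for i, d in enumerate(self.dims):
                acc[i] += x[d]
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
        return self

    def predict(self, x):
        def dist(c):
            return sum((x[d] - c[i]) ** 2 for i, d in enumerate(self.dims))
        return min(self.centroids, key=lambda y: dist(self.centroids[y]))


def vote(models, x):
    """Majority vote over any objects that expose predict(x)."""
    return Counter(m.predict(x) for m in models).most_common(1)[0][0]


class BaseClassifier:
    """One per data block: several bottom-level classifiers on random subspaces."""

    def __init__(self, chunk, n_bottom, subspace_size, rng):
        n_features = len(chunk[0][0])
        self.bottom = [
            SubspaceCentroid(rng.sample(range(n_features), subspace_size)).fit(chunk)
            for _ in range(n_bottom)
        ]

    def predict(self, x):
        return vote(self.bottom, x)


def upper_bound(p, n, z=1.96):
    """Upper end of a normal-approximation confidence interval for an error rate."""
    return p + z * math.sqrt(p * (1 - p) / n)


class StreamEnsemble:
    """Ensemble of the most recent base classifiers with a simple drift test."""

    def __init__(self, max_size=5, n_bottom=3, subspace_size=2, seed=0):
        self.max_size = max_size
        self.n_bottom = n_bottom
        self.subspace_size = subspace_size
        self.rng = random.Random(seed)
        self.bases = []
        self.ref = None  # (smoothed error estimate, sample size) from the last stable block

    def predict(self, x):
        return vote(self.bases, x)

    def update(self, chunk):
        """Test on the new block, then train on it. Returns True if drift was flagged."""
        drift = False
        if self.bases:
            n = len(chunk)
            errors = sum(self.predict(x) != y for x, y in chunk)
            if self.ref is not None and errors / n > upper_bound(*self.ref):
                self.bases.clear()  # discard classifiers built on the old concept
                self.ref = None
                drift = True
            else:
                # Add-one smoothing keeps the interval from collapsing at zero error.
                self.ref = ((errors + 1) / (n + 2), n)
        self.bases.append(
            BaseClassifier(chunk, self.n_bottom, self.subspace_size, self.rng)
        )
        del self.bases[:-self.max_size]  # keep only the newest max_size base classifiers
        return drift
```

Each block is a list of `(features, label)` pairs; calling `update(block)` once per block evaluates the current ensemble on that block before training on it. Clearing the whole ensemble on drift is the bluntest possible adjustment and is chosen here only for brevity; the paper's own adjustment strategy may differ.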

