
Online Detection of Data Stream Changes Based on Maximum Frequent Itemset Entropy
Abstract: Online detection of changes in data streams is a relatively new topic in data stream research, and it is a feature that distinguishes stream mining from other kinds of data mining. This paper proposes a method for detecting and estimating data stream changes using the information entropy of maximum frequent itemsets. The main contributions are: (1) a new discrepancy measure for data streams; (2) a new algorithm that efficiently mines and stores all maximum frequent itemsets of a data stream; and (3) a change detection algorithm based on the information entropy of maximum frequent itemsets. To the best of the authors' knowledge, no previous work has used a maximum frequent itemset entropy model to detect data stream changes. Experiments were carried out to study the time and space efficiency of the algorithm.
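Source: Journal of Applied Sciences (《应用科学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2006, No. 5, pp. 498-502 (5 pages)
Funding: Jiangsu Province High-Technology Project (BG2004034); Jiangsu Province 2004 Graduate Student Innovation Program (xm04-36)
Keywords: data stream; maximum frequent itemsets; change detection; data stream analysis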
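
The idea summarized in the abstract can be illustrated with a small sketch. This record does not give the paper's actual discrepancy measure or mining algorithm, so everything below is an assumption made for illustration: itemsets are mined by brute-force level-wise enumeration, the entropy is taken over the normalized supports of the maximal frequent itemsets in a window, and a change is flagged when the entropy of the current window differs from that of a reference window by more than a threshold. The function names, `min_support`, and `threshold` are all hypothetical.

```python
from itertools import combinations
from math import log2


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) enumeration: itemset -> relative support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    current = [frozenset([i]) for i in items]
    while current:
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        current = list({a | b for a, b in combinations(keys, 2)
                        if len(a | b) == size})
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets with no frequent proper superset."""
    freq = frequent_itemsets(transactions, min_support)
    return {s: sup for s, sup in freq.items()
            if not any(s < other for other in freq)}


def itemset_entropy(supports):
    """Shannon entropy of the supports, normalized to sum to 1 (assumed form)."""
    total = sum(supports.values())
    if total == 0:
        return 0.0
    return -sum((s / total) * log2(s / total)
                for s in supports.values() if s > 0)


def change_detected(reference, current, min_support=0.5, threshold=0.5):
    """Flag a change when the two windows' entropies differ by > threshold."""
    h_ref = itemset_entropy(maximal_frequent_itemsets(reference, min_support))
    h_cur = itemset_entropy(maximal_frequent_itemsets(current, min_support))
    return abs(h_ref - h_cur) > threshold


if __name__ == "__main__":
    reference_window = [{"a", "b"}] * 4
    current_window = [{"a", "b"}, {"c", "d"}, {"a", "b"}, {"c", "d"}]
    print(change_detected(reference_window, current_window))
```

In this toy example the reference window has a single maximal frequent itemset, while the current window splits evenly between two, so the entropy rises and a change is reported. The brute-force mining step is only practical for small windows; a streaming setting would need an incremental structure in its place.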


