Algorithm based on vector for mining maximal frequent itemsets in sliding window over data streams (cited by: 7)
Abstract: To address the shortcomings of existing algorithms for mining maximal frequent itemsets over data streams, this paper proposes MFISW, a vector-based algorithm for mining maximal frequent itemsets in a sliding window over data streams. First, the algorithm uses vectors as the synopsis data structure for the items in the stream and handles the time-granularity problem by updating the sliding window with a quantitative updating strategy. Second, it generates frequent itemsets through bit operations, storing auxiliary information in a matrix and an array, and applies pruning strategies during the depth-first search for maximal frequent itemsets to further reduce mining time. Finally, it stores the mining results in an indexed linked list to make superset checking more efficient. Theoretical analysis and experimental results confirm the effectiveness of the algorithm.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2012, No. 3, pp. 837-840 (4 pages)
Funding: National "863" Program of China (2007AA01Z443); Chengdu University Fund (2010XJZ16)
Keywords: data stream; maximal frequent itemsets; sliding window; vector
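
The abstract names the building blocks (per-item vectors, bit operations, depth-first search with pruning, superset checking) but not their exact layout, so the following is only a minimal Python sketch under stated assumptions, not the authors' MFISW implementation. The names `SlidingWindowVectors`, `support`, `maximal_frequent_itemsets` and `min_count` are hypothetical. A plain Python list stands in for the paper's indexed linked list, the matrix and array of auxiliary information are omitted, and the superset-based subtree pruning is a generic strategy assumed here for illustration.

```python
from collections import deque


class SlidingWindowVectors:
    """Hypothetical synopsis: one bit vector per item over the last `size` transactions."""

    def __init__(self, size):
        self.size = size
        self.window = deque()  # transactions currently inside the window

    def add_batch(self, transactions):
        """Append a batch of transactions and evict the oldest ones beyond `size`."""
        for t in transactions:
            self.window.append(frozenset(t))
            if len(self.window) > self.size:
                self.window.popleft()

    def vectors(self):
        """Return item -> int, where bit i is set if window transaction i contains the item."""
        vecs = {}
        for i, t in enumerate(self.window):
            for item in t:
                vecs[item] = vecs.get(item, 0) | (1 << i)
        return vecs


def popcount(v):
    """Number of set bits, i.e. the number of supporting transactions."""
    return bin(v).count("1")


def support(vecs, itemset):
    """Support count of an itemset: AND the item vectors, then count the set bits."""
    items = list(itemset)
    vec = vecs.get(items[0], 0)
    for item in items[1:]:
        vec &= vecs.get(item, 0)
    return popcount(vec)


def maximal_frequent_itemsets(vecs, n_transactions, min_count):
    """Depth-first search for maximal frequent itemsets using bitwise AND."""
    items = sorted(i for i, v in vecs.items() if popcount(v) >= min_count)
    maximal = []  # stand-in for the paper's indexed linked list

    def is_subsumed(itemset):
        # Superset check against the maximal itemsets found so far.
        return any(itemset <= m for m in maximal)

    def dfs(prefix, prefix_vec, candidates):
        # Assumed pruning: if everything reachable from this node is already
        # covered by a known maximal itemset, the subtree yields nothing new.
        if maximal and is_subsumed(prefix | frozenset(candidates)):
            return
        extended = False
        for k, item in enumerate(candidates):
            vec = prefix_vec & vecs[item]
            if popcount(vec) >= min_count:
                extended = True
                dfs(prefix | {item}, vec, candidates[k + 1:])
        # A node with no frequent extension is maximal unless a superset exists.
        if not extended and prefix and not is_subsumed(prefix):
            maximal.append(frozenset(prefix))

    all_ones = (1 << n_transactions) - 1
    dfs(frozenset(), all_ones, items)
    return maximal
```

For example, with a window of four transactions fed five in total (`ad`, `abc`, `abc`, `bd`, `cd`), the oldest transaction is evicted, and with `min_count=2` the sketch reports `{a, b, c}` and `{d}` as the maximal frequent itemsets.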