数据流最大频繁项挖掘方法 (cited by: 2)

Mining Method of Data Stream Maximum Frequent Itemsets
Abstract: This paper proposes AFMI, a method for mining maximal (maximum) frequent itemsets based on a transaction matrix. AFMI iteratively condenses the transaction matrix to find the maximal frequent itemsets across all transactions. It restricts the search to subsets of the intersections of the condensed transaction vectors, and it uses logical operations and pruning to improve mining efficiency. Building on AFMI, the paper presents AFMI+, an algorithm for mining maximal frequent itemsets in a sliding window over a data stream. AFMI+ lets users periodically mine the maximal frequent itemsets of the current window. Experimental results show that both AFMI and AFMI+ perform well.
Authors: Zhang Yueqin (张月琴), Chen Dong (陈东)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 22, pp. 86-87, 90 (3 pages)
Funding: Young Teachers' Academic Fund of Nanjing University of Technology (南京工业大学), project 39709013
Keywords: data mining; data stream; sliding window; maximum frequent itemsets; matrix
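
The abstract describes the approach only in outline, so the following is a minimal Python sketch of the general idea and not the authors' AFMI or AFMI+ algorithms. It assumes each row of the transaction matrix is stored as an integer bitmask, drops the columns of infrequent items, collects candidates by intersecting row vectors with bitwise AND, and keeps the frequent candidates that have no frequent superset. It relies on the fact that every maximal frequent itemset equals the intersection of the transactions that contain it. All names here (`encode`, `support`, `maximal_frequent`, `SlidingWindowMiner`) are hypothetical, and the pruning mentioned in the abstract is not reproduced.

```python
from collections import deque


def encode(transactions):
    """Assign each item a bit position; return the item list and row bitmasks."""
    items = sorted({item for t in transactions for item in t})
    bit = {item: 1 << i for i, item in enumerate(items)}
    rows = [sum(bit[item] for item in set(t)) for t in transactions]
    return items, rows


def support(mask, rows):
    """Count the rows that contain every item in mask (one AND per row)."""
    return sum(1 for r in rows if (r & mask) == mask)


def maximal_frequent(transactions, min_count):
    """Return the maximal frequent itemsets as sorted lists of items."""
    items, rows = encode(transactions)

    # Condense the matrix: clear columns of items that are infrequent alone,
    # then drop rows that became empty.
    keep = 0
    for i in range(len(items)):
        if support(1 << i, rows) >= min_count:
            keep |= 1 << i
    rows = [r & keep for r in rows if r & keep]

    # Close the row vectors under intersection. Every maximal frequent
    # itemset is the intersection of the rows that contain it, so it
    # appears in this closure.
    closed = set(rows)
    frontier = list(closed)
    while frontier:
        a = frontier.pop()
        for r in rows:
            c = a & r
            if c and c not in closed:
                closed.add(c)
                frontier.append(c)

    frequent = [c for c in closed if support(c, rows) >= min_count]
    # Keep only candidates with no frequent proper superset.
    maximal = [c for c in frequent
               if not any(c != d and (c & d) == c for d in frequent)]
    return sorted(
        sorted(items[i] for i in range(len(items)) if (m >> i) & 1)
        for m in maximal
    )


class SlidingWindowMiner:
    """Hypothetical wrapper: keep the last window_size transactions and
    mine the current window whenever the caller asks."""

    def __init__(self, window_size, min_count):
        self.window = deque(maxlen=window_size)  # oldest entry drops out
        self.min_count = min_count

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def mine(self):
        return maximal_frequent(list(self.window), self.min_count)
```

For example, with the transactions `{a,b,c}`, `{a,b}`, `{a,c}`, `{b,c}`, `{a,b,c}` and a minimum count of 3, the three pairs are frequent but `{a,b,c}` is not, so the pairs are the maximal frequent itemsets. The sketch recomputes the result from scratch on each `mine()` call, which keeps it short. An incremental update of the matrix as the window slides would be the natural refinement, but the abstract does not say how AFMI+ handles this.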