
An interestingness-based hash mining algorithm for frequent patterns over data streams (基于兴趣度的数据流频繁模式散列挖掘算法). Cited by: 4

Mining approximate frequency itemsets over data streams based on hash and interesting degree
Abstract: Frequent pattern mining underlies much of the work in data stream mining. Existing algorithms can mine approximate frequent patterns from data streams, but because stream data is uncertain, continuous, and massive, they have not been able to keep time and space costs within an acceptable range. This paper uses a hash table as the storage structure for synopsis data and introduces the concept of association-rule interestingness to propose MIFS-HT (mining interesting frequent itemsets with hash table), a frequent pattern mining algorithm for data streams that reduces the time and space complexity of existing algorithms and increases their practical value. Experimental results show that MIFS-HT is an efficient frequent pattern mining algorithm for data streams, that it outperforms algorithms such as FPStream and LossyCounting, and that its mining results are more meaningful in practice.

Frequent itemset mining, which is fundamental to data stream mining, has received increasing attention from researchers. Because data streams are uncertain, continuous, and large in volume, many mining algorithms have difficulty handling such dynamic data. In this paper, a hash table and the degree of interest of association rules are introduced: the former serves as the synopsis data structure, and the latter incorporates the attention of customers. On this basis, a new frequent itemset mining algorithm named MIFS-HT (mining interesting frequent itemsets with hash table) is proposed. Compared with Lossy Counting and a similar algorithm, mining frequent itemsets over data streams by matrix (MISM for short), the results show that MIFS-HT is more efficient in both time and space.
Source: Systems Engineering-Theory & Practice (《系统工程理论与实践》), indexed in EI, CSSCI, CSCD, and the Peking University Core Journals list; 2012, No. 12, pp. 2764-2773 (10 pages)
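Funding: National Natural Science Foundation of China (71071141); Specialized Research Fund for the Doctoral Program of Higher Education (20103326110001); Key Project of the Zhejiang Provincial Natural Science Foundation (Z1091224); Modern Business Research Center, Zhejiang Gongshang University (11JDSM02Z)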
Keywords: data stream; frequent itemset; degree of interest; MIFS-HT
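
The abstract names two ingredients, a hash table as the synopsis structure and an interestingness measure on association rules, without giving the details of MIFS-HT. The sketch below is not MIFS-HT. It is a minimal illustration of the general idea under stated assumptions: it applies the classic Lossy Counting pruning rule to itemsets of size 1 and 2 held in a Python `dict` (a hash table), and it uses lift as a stand-in interestingness measure. The class name `LossyItemsetCounter`, its methods, and the choice of lift are all hypothetical and do not come from the paper.

```python
from itertools import combinations
from math import ceil


class LossyItemsetCounter:
    """Lossy Counting over itemsets of size 1 and 2, kept in a dict (hash table).

    Illustrative only: this is NOT the MIFS-HT algorithm from the paper.
    """

    def __init__(self, epsilon):
        self.epsilon = epsilon            # maximum relative undercount
        self.width = ceil(1 / epsilon)    # transactions per bucket
        self.n = 0                        # transactions seen so far
        self.table = {}                   # frozenset -> [count, max_undercount]

    def add(self, transaction):
        """Process one transaction (an iterable of hashable items)."""
        self.n += 1
        bucket = ceil(self.n / self.width)
        items = sorted(set(transaction))
        for size in (1, 2):
            for combo in combinations(items, size):
                key = frozenset(combo)
                entry = self.table.get(key)
                if entry is None:
                    # A new entry may have been missed in earlier buckets.
                    self.table[key] = [1, bucket - 1]
                else:
                    entry[0] += 1
        if self.n % self.width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            self.table = {k: v for k, v in self.table.items()
                          if v[0] + v[1] > bucket}

    def frequent(self, min_support):
        """Return itemsets whose stored count is at least (s - epsilon) * n."""
        threshold = (min_support - self.epsilon) * self.n
        return {k: v[0] for k, v in self.table.items() if v[0] >= threshold}

    def interesting_pairs(self, min_support, min_lift):
        """Return frequent pairs whose lift (assumed measure) is >= min_lift."""
        freq = self.frequent(min_support)
        result = {}
        for key, count in freq.items():
            if len(key) != 2:
                continue
            a, b = (frozenset([x]) for x in key)
            if a not in freq or b not in freq:
                continue
            # lift = P(A and B) / (P(A) * P(B)), estimated from stored counts
            lift = count * self.n / (freq[a] * freq[b])
            if lift >= min_lift:
                result[key] = lift
        return result
```

A caller would feed transactions one at a time with `add` and query with `frequent` or `interesting_pairs` at any point. Restricting itemsets to size 2 keeps the per-transaction cost quadratic in the transaction length, and a real stream miner would need a strategy for longer itemsets.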