
基于静态IS-树的频繁模式挖掘

Mining Frequent Patterns Based on Static IS-Tree
Abstract: This paper presents IS-mine, an efficient algorithm for mining frequent patterns on a static IS-tree, and compares it experimentally with the classic Apriori and FP-growth algorithms. IS-mine builds frequent itemsets directly, avoiding the costly candidate generation-and-test step used by Apriori. It follows a depth-first, pattern-growth strategy, and all mining is carried out on a single static IS-tree, which avoids the costly construction of dynamic trees required by FP-growth. To shrink the search space, the algorithm applies different filtering techniques according to the characteristics of the dataset. Experiments and theoretical analysis show that the algorithm achieves good time and space efficiency on both dense and sparse datasets.
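Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the PKU Core Journals list; 2005, No. 6, pp. 664-669 (6 pages)
Funding: National Natural Science Foundation of China (No. 60173027); National 863 High-Tech Research and Development Program of China (No. 2001AA115020)
Keywords: Data Mining, Frequent Patterns, IS-Tree, FP-Tree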
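
The depth-first, pattern-growth strategy described in the abstract can be illustrated with a minimal sketch. This is not the IS-mine algorithm: the abstract does not describe the IS-tree's layout, so the sketch substitutes a plain read-only tuple of sorted transactions (`static_db`, a hypothetical name) for the static tree. It only shows the general idea of extending a frequent prefix one item at a time over a structure that is built once and never rebuilt, with no candidate generation. Dropping infrequent items up front is a simple stand-in for the dataset-specific filtering the abstract mentions.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple


def mine_frequent_itemsets(
    transactions: Sequence[Sequence[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Depth-first pattern growth over a static, read-only transaction index.

    Illustrative only: `static_db` stands in for the paper's IS-tree,
    whose actual structure is not given in the abstract.
    """
    # Pass 1: count single items and keep only the frequent ones.
    counts: Dict[str, int] = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    frequent = {i for i, c in counts.items() if c >= min_support}

    # Pass 2: build the static structure once; it is never modified afterwards.
    static_db: Tuple[Tuple[str, ...], ...] = tuple(
        tuple(sorted(i for i in set(t) if i in frequent)) for t in transactions
    )

    results: Dict[FrozenSet[str], int] = {}

    def grow(prefix: Tuple[str, ...], occurrences: List[Tuple[int, int]]) -> None:
        # Each occurrence is (transaction index, position just past the
        # last prefix item), i.e. a pointer into static_db, not a copy.
        extensions: Dict[str, List[Tuple[int, int]]] = {}
        for tid, pos in occurrences:
            row = static_db[tid]
            for p in range(pos, len(row)):
                extensions.setdefault(row[p], []).append((tid, p + 1))
        for item in sorted(extensions):
            occ = extensions[item]
            if len(occ) >= min_support:  # extend only frequent patterns
                pattern = prefix + (item,)
                results[frozenset(pattern)] = len(occ)
                grow(pattern, occ)  # depth-first recursion

    grow((), [(tid, 0) for tid in range(len(static_db))])
    return results
```

Calling `mine_frequent_itemsets(transactions, min_support)` with a list of item lists returns a dictionary mapping each frequent itemset to its support count. Because the recursion passes only index pairs into `static_db`, no per-prefix conditional structure is built, which is the property the abstract contrasts with FP-growth.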

References (10)

  • 1. Agrawal R, Imielinski T, Swami A N. Mining Association Rules between Sets of Items in Large Databases. In: Proc of the ACM SIGMOD International Conference on Management of Data. Washington, USA, 1993, 207-216.
  • 2. Agrawal R, Srikant R. Fast Algorithms for Mining Association Rules. In: Proc of the 20th International Conference on Very Large Data Bases. Santiago, Chile, 1994, 487-499.
  • 3. Park J S, Chen M S, Yu P S. An Effective Hash-Based Algorithm for Mining Association Rules. In: Proc of the ACM SIGMOD International Conference on Management of Data. San Jose, USA, 1995, 175-186.
  • 4. Toivonen H. Sampling Large Databases for Association Rules. In: Proc of the 22nd International Conference on Very Large Data Bases. Mumbai, India, 1996, 134-145.
  • 5. Han J W, Pei J Y, Yin Y W. Mining Frequent Patterns without Candidate Generation. In: Proc of the ACM SIGMOD International Conference on Management of Data. Dallas, USA, 2000, 1-12.
  • 6. Han J, Lu H, Nishio S, Tang S, Yang D. H-Mine: Hyper-Structure Mining of Frequent Patterns in Large Databases. In: Proc of the International Conference on Data Mining. San Jose, USA, 2001, 441-448.
  • 7. Hu Yunfa. Inter-Relevant Successive Trees: A New Mathematical Model for Full-Text Databases (in Chinese). Technical Report TR02-031 [R]. Department of Computing and Information Technology, Fudan University, Shanghai, 2002.
  • 8. Fan Ming, Wang Bingzheng. A New Algorithm for Mining Frequent Patterns Directly in the Trans-Tree (in Chinese) [J]. Computer Science, 2003, 30(8): 117-120.
  • 9. Zeng Haiquan, Hu Qinyou, Zhou Shuigeng, Hu Yunfa. Time-Series Pattern Mining Based on Inter-Relevant Successive Trees (in Chinese) [J]. Pattern Recognition and Artificial Intelligence, 2003, 16(3): 299-305.
  • 10. Bai Shilei, Mao Xuemin, Wang Rujing, Xiong Fanlun. A Fast Algorithm for Mining Frequent Itemsets (in Chinese) [J]. Pattern Recognition and Artificial Intelligence, 2003, 16(4): 465-469.

