Mining Frequent Patterns Based on Static IS-Tree


Abstract: This paper presents IS-mine, an efficient algorithm for mining frequent patterns based on a static IS-tree, and compares it experimentally with the classic Apriori and FP-growth algorithms. IS-mine constructs frequent itemsets directly, without the costly candidate generation-and-test step used by Apriori. It follows a depth-first, pattern-growth strategy, and all mining is performed on a single static IS-tree, which avoids the costly construction of dynamic trees required by FP-growth. To shrink the search space, the algorithm applies different filtering techniques depending on the characteristics of the dataset. Experiments and theoretical analysis show that the algorithm achieves good time and space efficiency on both dense and sparse datasets.
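The depth-first, pattern-growth strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not IS-mine: the record does not describe the IS-tree structure, so the sketch swaps in plain projected transaction lists, and the names `mine_frequent`, `grow`, and `min_support` are hypothetical. It only shows how frequent itemsets can be extended one item at a time without generating and testing candidate sets.

```python
from collections import Counter


def mine_frequent(transactions, min_support):
    """Depth-first pattern growth over projected transaction lists.

    Assumption: a generic illustration only. It uses Python lists in
    place of the paper's static IS-tree, which is not described here.
    """
    results = {}

    def grow(prefix, projected):
        # Count the support of each item within the projected transactions.
        counts = Counter(item for t in projected for item in t)
        for item in sorted(counts):
            support = counts[item]
            if support < min_support:
                continue  # infrequent extension: prune this branch
            pattern = prefix + (item,)
            results[pattern] = support
            # Keep only transactions containing `item`, restricted to the
            # items ordered after it, so each itemset is built exactly once.
            narrowed = [
                tuple(x for x in t if x > item)
                for t in projected
                if item in t
            ]
            grow(pattern, narrowed)

    # Deduplicate and sort items inside each transaction before mining.
    grow((), [tuple(sorted(set(t))) for t in transactions])
    return results
```

Calling `mine_frequent` on a list of transactions with an absolute support threshold returns a dictionary that maps each frequent itemset (as a sorted tuple) to its support count. Unlike the approach described in the abstract, this sketch rebuilds a projected list at every recursion step, which is the kind of dynamic structure the paper reports avoiding.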

Authors: Ma Haibing, Zhang Jin, Fan Yingjie, Hu Yunfa

Affiliation: Department of Computer and Information Technology, School of Information Science and Engineering, Fudan University

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2005, Issue 6, pp. 664-669 (6 pages). Indexed in: EI, CSCD, Peking University Core Journals.

Funding: National Natural Science Foundation of China (No.60173027); National 863 High-Tech Research and Development Program of China (No.2001AA115020)

Keywords: Data Mining, Frequent Patterns, IS-Tree, FP-Tree

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
