FVTreeMiner: An Efficient Frequent Unordered Trees Mining Algorithm


Abstract: When mining frequent patterns in unordered trees, most algorithms first generate candidates and then use pattern matching to decide whether each candidate is a frequent subtree. Storing the generated candidates consumes a large amount of space, and matching within complex tree structures is difficult, so both steps lower the efficiency of the whole mining process. To avoid generating unnecessary candidates as far as possible and to find frequent patterns more efficiently, this paper draws on a study of related algorithms and introduces the concept of a tree projected database. Building on the RootedTreeMiner algorithm, it adopts that algorithm's pattern extension method and breadth-first canonical form, and proposes the concepts of subtree frequency and frequent extensible node strings (rendered in the original English abstract as "Subtree Frequent Set" and "Extended Nodes List"), so that all frequent pattern trees can be enumerated more effectively and systematically. The resulting algorithm for mining frequent unordered subtrees is FVTreeMiner. A series of experiments indicates that the algorithm is sound and efficient and that it uses less memory and running time than RootedTreeMiner.
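The abstract relies on canonical forms so that an unordered tree is counted once regardless of how its siblings happen to be stored. The sketch below illustrates that general idea only. It uses a simple depth-first encoding with sorted child strings, which is not the breadth-first canonical form that the abstract attributes to RootedTreeMiner, and it is not taken from the paper. The tuple representation `(label, children)` and the function name `canonical` are assumptions made for this example, and labels are assumed not to contain parentheses.

```python
from typing import List, Tuple

# Hypothetical representation: a rooted labeled tree is (label, [child trees]).
Tree = Tuple[str, List["Tree"]]


def canonical(tree: Tree) -> str:
    """Return a string that is the same for every sibling ordering of a tree.

    This is a depth-first encoding with sorted children, used here only to
    illustrate why canonical forms help; it is not the breadth-first
    canonical form mentioned in the abstract.
    """
    label, children = tree
    # Encode each child subtree, then sort the encodings so that the
    # stored order of siblings no longer affects the result.
    encoded = sorted(canonical(child) for child in children)
    return label + "(" + "".join(encoded) + ")"


# t1 and t2 are the same unordered tree with siblings stored in a different order.
t1: Tree = ("A", [("B", []), ("C", [("D", [])])])
t2: Tree = ("A", [("C", [("D", [])]), ("B", [])])
# t3 has the same labels but a different shape (D hangs under B, not C).
t3: Tree = ("A", [("B", [("D", [])]), ("C", [])])

print(canonical(t1) == canonical(t2))  # same unordered tree
print(canonical(t1) == canonical(t3))  # different unordered tree
```

With an encoding of this kind, a miner can key its pattern counts on the canonical string, so two candidates that differ only in sibling order are recognized as one pattern instead of being enumerated and matched twice.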

Authors: Chen Dongju (陈冬菊), Zhang Dongzhan (张东站), Duan Jiangjiao (段江娇)

Affiliation: Department of Computer Science, School of Information Science and Technology, Xiamen University

Source: Computer Technology and Development (《计算机技术与发展》), 2010, No. 5, pp. 9-12 (4 pages)

Funding: National Natural Science Foundation of China (50604012)

Keywords: unordered tree; canonical form; frequent subtree

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]



