An Algorithm of Mining Frequent Subtrees Based on Projection and Encoding (基于投影编码的频繁子树挖掘算法)

Cited by: 2

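Abstract: Frequent subtree mining is widely applied in Web mining, bioinformatics, XML data mining, and other fields. This paper proposes a new algorithm, PETreeMiner. The algorithm mines frequent subtrees using prefix projection, a technique from sequence mining that generates no candidates. A scope attribute for each node is added to the pre-order traversal sequence of the tree, and encoding is performed during projection, so that each mined frequent subsequence corresponds directly to a frequent subtree. Experimental results show that the algorithm performs better than other algorithms.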
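The abstract does not define the scope attribute, so the following is only a minimal sketch of one common way to attach a scope to a pre-order traversal sequence. It assumes the scope of a node is the pair (its own pre-order index, the pre-order index of its last descendant), as in scope-list approaches to tree mining. The names `Node`, `encode_preorder`, and `is_ancestor` are hypothetical and are not taken from the paper; the projection and encoding steps of PETreeMiner itself are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical labeled ordered tree node used only for this illustration.
    label: str
    children: list = field(default_factory=list)


def encode_preorder(root):
    """Return [(label, start, end)] in pre-order.

    start is the node's pre-order index; end is the pre-order index of its
    last descendant (assumed scope definition, not confirmed by the paper).
    """
    seq = []

    def visit(node):
        start = len(seq)
        seq.append(None)  # reserve the slot so children get later indices
        for child in node.children:
            visit(child)
        seq[start] = (node.label, start, len(seq) - 1)

    visit(root)
    return seq


def is_ancestor(a, b):
    """True if entry a is a proper ancestor of entry b (scope containment)."""
    return a[1] < b[1] and b[2] <= a[2]


# Tree A(B(C), D) encoded as a sequence with scopes.
tree = Node("A", [Node("B", [Node("C")]), Node("D")])
encoded = encode_preorder(tree)
```

With scopes attached, an ancestor test between two sequence entries becomes a constant-time interval check, which is what lets a subsequence of the encoded traversal be read back as a tree rather than as a flat list of labels.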

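Authors: Chen Zijun (陈子军), Li Wei (李伟), Li Xia (李霞), Wang Xinyu (王鑫昱)

Affiliation: Department of Computer Science and Engineering, School of Information, Yanshan University

Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the PKU Core list; 2006, Issue z3, pp. 389-394 (6 pages)

Funding: Yanshan University Doctoral Fund Project (B83)

Keywords: data mining; frequent subtree; prefix projection; encoding

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]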

References (10)

1. M J Zaki. Efficiently mining frequent trees in a forest. The 8th Int'l Conf on Knowledge Discovery and Data Mining (SIGKDD), Edmonton, Canada, 2002
2. M J Zaki. Efficiently mining frequent embedded unordered trees. Fundamenta Informaticae, 2005, 66(1-2): 33-52
3. T Asai, K Abe, S Kawasoe, et al. Efficient substructure discovery from large semi-structured data. The 2nd SIAM Int'l Conf on Data Mining, Arlington, USA, 2002
4. T Asai, H Arimura, T Uno, et al. Discovering frequent substructures in large unordered trees. The 6th Int'l Conf on Discovery Science, Sapporo, Japan, 2003
5. J Han, et al. Data Mining: Concepts and Techniques (Chinese edition). Beijing: China Machine Press, 2001
6. J Han, J Pei. FreeSpan: Frequent pattern-projected sequential mining. The 6th Int'l Conf on Knowledge Discovery and Data Mining (SIGKDD), Boston, USA, 2000
7. J Pei, J Han. PrefixSpan: Mining sequential patterns by prefix projected growth. The 17th Int'l Conf on Data Engineering, Heidelberg, Germany, 2001
8. Zhu Yongtai, Wang Chen, Hong Mingsheng, Wang Wei, Shi Bole. ESPM: a frequent subtree mining algorithm [J]. Journal of Computer Research and Development, 2004, 41(10): 1720-1727. Cited by: 18
9. Y Chi. Frequent subtree mining: An overview. Fundamenta Informaticae, 2005, 66(1-2): 161-198
10. Christie I Ezeife, Yi Lu. Mining Web log sequential patterns with position coded pre-order linked WAP-tree. Data Mining and Knowledge Discovery, 2005, 10(1): 5-38

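Citing documents (2)

1. Li Juan, Yang Jun. Research on a partition-based frequent subtree mining algorithm [J]. Computer Engineering and Design, 2011, 32(6): 2054-2057.
2. Yin Siqing, Kong Pengcheng, Zhang Sulan. An encoding-based algorithm for mining frequent induced subtrees [J]. Computer Engineering and Applications, 2011, 47(24): 121-124.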