NPSP: An Efficient Algorithm with Incremental Data Mining for Mining Sequential Patterns (NPSP:一种高效的序列模式增量挖掘算法)

Cited by: 4


Abstract: GSP and PSP are the two main algorithms for mining sequential patterns, but neither supports incremental mining and both are relatively inefficient. This paper presents a data structure called the Heterogeneity Tree ("异构树") together with a set of rules for numbering its branches. The rules ensure that branches with the same number represent the same candidate sequence and branches with different numbers represent different candidate sequences, which greatly simplifies counting the support of candidates. On this basis the paper proposes NPSP, an efficient sequential pattern mining algorithm with incremental mining capability, and demonstrates the completeness of its mined result set and the efficiency of the algorithm through both theoretical analysis and experiments.
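The abstract does not give the layout of the Heterogeneity Tree or its numbering rules, so the following Python sketch does not reproduce NPSP. It only illustrates the two ideas the abstract names: when identical candidates share one identifier, support counting reduces to a keyed lookup, and when counts are retained, newly arrived sequences can be folded in without rescanning the old data. The function names (`candidate_key`, `count_support`, `frequent`), the use of a plain tuple as the identifier, and the restriction to single-item sequence elements are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def candidate_key(items):
    """Hypothetical stand-in for a branch number: equal candidates map to
    equal keys, different candidates map to different keys."""
    return tuple(items)


def subsequences(sequence, length):
    """Distinct order-preserving subsequences of the given length."""
    return {candidate_key(c) for c in combinations(sequence, length)}


def count_support(database, length, counts=None):
    """Add one to a candidate's count for every sequence that contains it.

    Passing an existing Counter updates it in place, so only the new
    sequences are scanned (a simplified view of incremental mining).
    """
    counts = Counter() if counts is None else counts
    for sequence in database:
        for key in subsequences(sequence, length):
            counts[key] += 1
    return counts


def frequent(counts, min_count):
    """Candidates whose support count reaches the threshold."""
    return {key: n for key, n in counts.items() if n >= min_count}


# Initial database, then an increment that is counted on its own.
old_db = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
new_db = [["a", "b"], ["a", "b", "c"]]

counts = count_support(old_db, 2)
count_support(new_db, 2, counts)  # only new_db is scanned here
patterns = frequent(counts, min_count=3)
```

This sketch keeps counts for every candidate it has seen, not only the frequent ones, which trades memory for a simple update step; the paper's own structure may handle this differently.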

Authors: 张兵 (Zhang Bing), 聂永红 (Nie Yonghong), 林士敏 (Lin Shimin)

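Affiliations: Department of Modern Science and Technology, Jiangsu Administration Institute (江苏行政学院现代科技部); Department of Computer Engineering, Guangxi University of Technology (广西工学院计算机工程系); College of Mathematics and Computer Science, Guangxi Normal University (广西师范大学数学与计算机科学学院)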

Source: Journal of Guangxi Normal University: Natural Science Edition (《广西师范大学学报（自然科学版）》), indexed in CAS, 2004, No. 4, pp. 22-26 (5 pages)

Funding: Supported by an Australian ARC grant (DP0343109)

Keywords: data mining; sequential patterns; NPSP algorithm; incremental data mining

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]


