
NPSP: An Efficient Algorithm with Incremental Data Mining for Mining Sequential Patterns (cited by: 4)
Abstract: GSP and PSP are the two main algorithms for mining sequential patterns, but neither supports incremental mining and both are comparatively inefficient. This paper presents a data structure called the heterogeneity tree, together with a set of rules for numbering its branches. The rules ensure that branches with the same number represent the same candidate sequence and branches with different numbers represent different candidate sequences, which greatly simplifies counting the support of candidates. Building on this, the paper proposes NPSP, an efficient sequential pattern mining algorithm with incremental mining capability, and demonstrates the completeness of its mined result set and the efficiency of the algorithm through both theoretical analysis and experiments.
Source: Journal of Guangxi Normal University: Natural Science Edition (CAS), 2004, No. 4, pp. 22-26 (5 pages)
Funding: Supported by an Australian ARC grant (DP0343109)
Keywords: data mining; sequential patterns; NPSP algorithm; incremental data mining
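
The abstract refers to counting the support of candidate sequences and to incremental mining, but it does not give the heterogeneity tree layout or its numbering rules. The following is therefore only a minimal baseline sketch, not the NPSP algorithm: it counts support by a direct subsequence test and, for a fixed candidate set, updates the counts by scanning only newly added sequences. The names `Seq`, `make_seq`, `contains`, and `update_support` are hypothetical and chosen for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# A sequence is an ordered tuple of itemsets (assumed representation).
Seq = Tuple[FrozenSet[str], ...]


def make_seq(*elements: Iterable[str]) -> Seq:
    """Build a sequence from iterables of items, e.g. make_seq(["a"], ["b", "c"])."""
    return tuple(frozenset(e) for e in elements)


def contains(data_seq: Seq, candidate: Seq) -> bool:
    """Return True if every itemset of candidate is a subset of a distinct
    itemset of data_seq, in the same order (greedy left-to-right match)."""
    pos = 0
    for element in data_seq:
        if pos < len(candidate) and candidate[pos] <= element:
            pos += 1
    return pos == len(candidate)


def update_support(counts: Dict[Seq, int],
                   candidates: List[Seq],
                   new_sequences: List[Seq]) -> Dict[Seq, int]:
    """Add the support contributed by new_sequences to counts.

    Only the new sequences are scanned; earlier counts are reused.
    This does not discover candidates that first become frequent
    after the increment, which a full incremental miner must handle."""
    for cand in candidates:
        counts.setdefault(cand, 0)
    for data_seq in new_sequences:
        for cand in candidates:
            if contains(data_seq, cand):
                counts[cand] += 1
    return counts


# Illustrative data (invented for this sketch, not taken from the paper).
cand_a_then_b = make_seq(["a"], ["b"])
cand_ab = make_seq(["a", "b"])
cand_b_then_a = make_seq(["b"], ["a"])
candidates = [cand_a_then_b, cand_ab, cand_b_then_a]

original_db = [make_seq(["a"], ["b", "c"]), make_seq(["a", "b"], ["c"])]
increment_db = [make_seq(["b"], ["a"]), make_seq(["a"], ["c"], ["b"])]

counts = update_support({}, candidates, original_db)
counts_before = dict(counts)
counts = update_support(counts, candidates, increment_db)
```

A tree-based scheme such as the one the abstract describes would replace the inner loop over `candidates` with a traversal that shares common prefixes; the brute-force loop here only makes the counting step concrete.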

References (6)

  • 1. Agrawal Rakesh, Srikant Ramakrishnan. Mining sequential patterns [A]. Proceedings of the 11th International Conference on Data Engineering [C]. Los Alamitos, CA: IEEE Computer Society Press, 1995. 3-14.
  • 2. Srikant Ramakrishnan, Agrawal Rakesh. Mining sequential patterns: generalizations and performance improvements [A]. Proceedings of the 5th International Conference on Extending Database Technology [C]. Berlin: Springer-Verlag, 1996. 3-17.
  • 3. Masseglia F, Cathala F, Poncelet P. The PSP approach for mining sequential patterns [A]. Proceedings of the 2nd European Symposium on Principles of Data Mining and Knowledge Discovery [C]. Berlin: Springer-Verlag, 1998. 176-184.
  • 4. Mueller A. Fast sequential and parallel algorithms for association rule mining: a comparison (technical report CS-TR-3515) [R]. College Park: University of Maryland, 1995.
  • 5. Su Yijuan, Yan Xiaowei. An improved method for mining frequent itemsets [J]. Journal of Guangxi Normal University (Natural Science Edition), 2001, 19(3): 22-26. Cited by: 10.
  • 6. Agrawal R, Srikant R. Fast algorithms for mining association rules in large databases [A]. Proceedings of the 20th International Conference on Very Large Databases [C]. San Mateo: Morgan Kaufmann Publishers, 1994. 487-499.


