
Research on Sequential Pattern Discovery in a Distributed Environment (cited by: 1)
Abstract: An algorithm called DMSP (Distributed Mining of Sequential Patterns) is proposed for mining sequential patterns in a distributed environment. It rests on three ideas. First, each site uses the prefix-projection technique to partition the pattern search space and shrink the database while generating local sequential patterns. Second, a polling site is designated for each pattern prefix, which reduces communication overhead. Third, multiple threads run asynchronously at each site, which increases the parallelism of the algorithm. Experiments show that, in a LAN environment with massive data, DMSP outperforms the approach of centralizing the data and then applying the GSP algorithm by more than 65%, and that it is scalable.
Source: Journal of Fudan University (Natural Science) (复旦学报(自然科学版)), 2004, No. 5, pp. 737-741 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Natural Science Foundation of China (70171052, 60075015)
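Keywords: sequential pattern mining; distributed environment; algorithm; multithreading; massive data; local area network; parallelism; communication overhead; projection technique; data mining; sequential pattern; distributed algorithm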
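The abstract names two techniques that a small example may make more concrete: prefix projection and the assignment of a polling site per pattern prefix. The sketch below is not the DMSP algorithm from the paper, whose details are not given in this record. It assumes sequences of single items (no itemsets), a single process standing in for one site, and a CRC32 hash to map a prefix to a site. The function names `project`, `frequent_items`, `mine`, and `polling_site` are hypothetical.

```python
import zlib
from collections import Counter


def project(db, item):
    """Keep, for each sequence containing item, the suffix after its first occurrence."""
    projected = []
    for seq in db:
        if item in seq:
            suffix = seq[seq.index(item) + 1:]
            if suffix:  # empty suffixes cannot extend the prefix
                projected.append(suffix)
    return projected


def frequent_items(db, min_support):
    """Count each item once per sequence and keep those meeting min_support."""
    counts = Counter()
    for seq in db:
        counts.update(set(seq))
    return {item: n for item, n in counts.items() if n >= min_support}


def mine(db, min_support, prefix=()):
    """Grow patterns prefix by prefix; each recursion sees a smaller projected database."""
    patterns = {}
    for item, support in sorted(frequent_items(db, min_support).items()):
        new_prefix = prefix + (item,)
        patterns[new_prefix] = support
        patterns.update(mine(project(db, item), min_support, new_prefix))
    return patterns


def polling_site(prefix, num_sites):
    """Assumed rule: hash the prefix so every site agrees on who aggregates its counts."""
    key = "\x1f".join(prefix).encode("utf-8")
    return zlib.crc32(key) % num_sites


# Toy local database at one site (hypothetical data).
db = [("a", "b", "c"), ("a", "c"), ("b", "c"), ("a", "b")]
patterns = mine(db, min_support=2)
for pattern, support in sorted(patterns.items()):
    print(pattern, support, "-> polling site", polling_site(pattern, num_sites=4))
```

Because every site computes the same hash, the local counts for a given prefix can be sent to one agreed site instead of being broadcast to all sites, which is the kind of communication saving the abstract describes. In a multi-threaded variant, each first-level prefix could be mined by its own worker, since the projected databases are independent of one another.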

