Maximum Frequent Tree Mining and Its Applications (极大频繁子树挖掘及其应用)

Cited by: 4
Abstract: Maximal frequent subtree mining has important applications in Web mining, HTML/XML document analysis, and biomedical information processing (bioinformatics), where it can be used to solve isomorphism problems. This paper presents an algorithm called Maximum Frequent Tree Mining (MFTM) for discovering maximal frequent subtrees in a forest. MFTM is based on the rightmost path expansion technique. During the search, it prunes candidates using the proposed Overlay Theorem, which reduces the search space and speeds up convergence. Detailed experiments on performance and scalability show that MFTM is effective and scalable.
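Authors: Yang Pei (杨沛), Tan Qi (谭琦)
Source: Computer Science (《计算机科学》), CSCD, PKU Core Journal; 2008, No. 2, pp. 150-153 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60003019)
Keywords: frequent tree mining, Web mining, data extraction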
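
The rightmost path expansion named in the abstract can be illustrated with a minimal sketch. It is not the authors' MFTM implementation: the preorder `(depth, label)` encoding, the function name `rightmost_extensions`, and the label alphabet are assumptions made for illustration, and the sketch covers only candidate generation, leaving out support counting and the Overlay Theorem pruning, which this record does not specify.

```python
from typing import List, Tuple

# Assumed encoding: an ordered, labeled, rooted tree is the preorder
# sequence of (depth, label) pairs, with the root at depth 0.
Tree = Tuple[Tuple[int, str], ...]


def rightmost_extensions(tree: Tree, labels: List[str]) -> List[Tree]:
    """Return every tree obtained by attaching one new node as the
    rightmost child of some node on the rightmost path of `tree`."""
    # In preorder, the last entry is the rightmost leaf, so the rightmost
    # path consists of one node at each depth 0..last_depth.
    last_depth = tree[-1][0]
    candidates: List[Tree] = []
    # A new node at depth d becomes the last child of the rightmost-path
    # node at depth d - 1; appending it keeps the sequence a valid preorder.
    for d in range(1, last_depth + 2):
        for label in labels:
            candidates.append(tree + ((d, label),))
    return candidates


def enumerate_trees(labels: List[str], max_size: int) -> List[Tree]:
    """Grow all trees with up to `max_size` nodes from single-node seeds."""
    level: List[Tree] = [((0, label),) for label in labels]
    result: List[Tree] = list(level)
    for _ in range(max_size - 1):
        level = [ext for t in level for ext in rightmost_extensions(t, labels)]
        result.extend(level)
    return result


if __name__ == "__main__":
    chain = ((0, "a"), (1, "b"), (2, "c"))
    for candidate in rightmost_extensions(chain, ["x"]):
        print(candidate)
```

Because a new node is only ever appended at the end of the preorder sequence, each ordered tree is generated exactly once, which is why this style of extension is attractive for pattern enumeration. A miner built on it would additionally count the support of each candidate and discard infrequent ones before extending further.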