
Efficiently Mining Unordered Frequent Trees (cited by: 6)
Abstract: Frequent pattern mining is an important problem in data mining; its scope covers transactions, sequences, trees, and graphs. Frequent subtree mining is widely used in bioinformatics, web mining, and the analysis and mining of chemical compound structures. This paper presents an efficient pattern-growth algorithm for mining frequent induced subtrees in a forest of rooted, labeled, unordered trees. The algorithm uses a breadth-first canonical form to give each unordered tree a unique representation, and uses rightmost-path extension to construct the complete pattern-growth space. It then determines the growth points of each pattern to be extended from the pattern's topology and builds a projected database for each growth point, which transforms the problem of mining frequent subtrees into one of finding frequent nodes in the projected databases. An experimental comparison with HybridTreeMiner, one of the fastest previously proposed methods, shows that the algorithm is more efficient.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2006, Issue 11, pp. 2104-2108 (5 pages)
Keywords: knowledge discovery; data mining; frequent patterns; frequent subtrees
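
The abstract's first step, giving every unordered tree a unique representation, can be illustrated with a minimal sketch. Note the assumptions: the paper uses a breadth-first canonical form, whereas the sketch below swaps in a simpler depth-first encoding that sorts each node's child encodings; the `(label, children)` tuple representation and the function name `canonical_string` are hypothetical and not taken from the paper. Labels are assumed not to contain `(`, `)`, or `,`.

```python
from typing import List, Tuple

# Hypothetical representation (an assumption, not from the paper):
# a rooted, labeled tree is a pair (label, list_of_child_trees).
Tree = Tuple[str, List["Tree"]]


def canonical_string(tree: Tree) -> str:
    """Encode an unordered tree so that sibling order does not matter.

    Each child is encoded recursively, and the child encodings are sorted
    before being joined, so any two trees that differ only in the order of
    siblings produce the same string. This is a depth-first stand-in for
    the breadth-first canonical form named in the abstract.
    """
    label, children = tree
    if not children:
        return label
    parts = sorted(canonical_string(child) for child in children)
    return label + "(" + ",".join(parts) + ")"


# Two drawings of the same unordered tree, with siblings in different order.
t1: Tree = ("A", [("B", [("D", []), ("C", [])]), ("E", [])])
t2: Tree = ("A", [("E", []), ("B", [("C", []), ("D", [])])])
# A structurally different tree: C and D hang under E instead of B.
t3: Tree = ("A", [("B", []), ("E", [("C", []), ("D", [])])])

print(canonical_string(t1))  # A(B(C,D),E)
print(canonical_string(t1) == canonical_string(t2))  # True
print(canonical_string(t1) == canonical_string(t3))  # False
```

With such an encoding, occurrences of a pattern across the forest can be grouped by their canonical string, so each unordered pattern is counted once regardless of how its siblings happen to be ordered in the data.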

References (14)

  • 1 Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases[C]. In: Proceedings of SIGMOD 1993, Washington, D.C., USA, 1993.
  • 2 Wang K, Liu H. Schema discovery for semistructured data[C]. In: Proceedings of KDD 1997, Newport Beach, CA, USA, 1997.
  • 3 Asai T, et al. Efficient substructure discovery from large semi-structured data[C]. In: Proceedings of SIAM SDM 2002, Arlington, VA, USA, 2002.
  • 4 Chi Y, Yang Y, Muntz R R. Indexing and mining free trees[C]. In: Proceedings of ICDM 2003, Melbourne, FL, USA, 2003.
  • 5 Zaki M J. Efficiently mining frequent trees in a forest[C]. In: Proceedings of KDD 2002, Edmonton, Alberta, Canada, 2002.
  • 6 Wang C, Hong M S, Pei J, et al. Efficient pattern-growth methods for frequent tree pattern mining[C]. In: Proceedings of PAKDD 2004, Sydney, Australia, 2004.
  • 7 Chi Y, Yang Y, Muntz R R. HybridTreeMiner: An efficient algorithm for mining frequent rooted trees and free trees using canonical forms[C]. In: Proceedings of SSDBM 2004, Santorini Island, Greece, 2004.
  • 8 Han J, et al. Mining frequent patterns without candidate generation[C]. In: Proceedings of SIGMOD 2000, Dallas, TX, USA, 2000.
  • 9 Pei J, et al. H-Mine: Hyper-structure mining of frequent patterns in large databases[C]. In: Proceedings of ICDM 2001, San Jose, CA, USA, 2001.
  • 10 Pei J, et al. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth[C]. In: Proceedings of ICDE 2001, Heidelberg, Germany, 2001.
