Abstract
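First, the transaction database is compressed and stored on the nodes of a transaction thread tree (TT-tree), and an index table of these nodes is built. Next, all paths from the last nodes in the node index table to the root node are found; these paths and their intersections contain the frequent itemsets used for mining association rules. The algorithm needs to scan the transaction database only once, and because it searches the TT-tree in reverse, the search time overhead is very small. The algorithm can mine massive data with short and medium-length patterns, scales well, and also supports incremental mining. In comparisons on a large amount of experimental data, the algorithm is about 10 times as fast as the Apriori algorithm.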
This paper presents a novel incremental algorithm for mining association rules. First, the transaction database is compressed and stored in a transaction thread tree (TT-tree). Then an index table of the tree's nodes is built. Finally, all paths from the leaf nodes to the root node are obtained by reverse search; these paths contain the frequent itemsets. The algorithm is efficient because it scans the transaction database only once. It is also scalable and supports incremental mining. Experimental results show that it is about 10 times faster than the Apriori algorithm.
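The abstract does not give the TT-tree's node layout or the exact search procedure, so the following is only a minimal Python sketch of the general idea it describes: one pass over the transactions to build a compressed tree plus a node index table, followed by a reverse (node-to-root) walk to recover paths and count itemsets. It uses an ordinary prefix tree as a stand-in for the TT-tree and enumerates subsets of each path by brute force rather than intersecting paths. All names (`Node`, `insert`, `build_tree`, `path_to_root`, `frequent_itemsets`, `max_len`) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


class Node:
    """One tree node: an item, a pass-through count, and a parent link."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def insert(root, index, transaction):
    """Add one transaction to the tree; new nodes are registered in the index table."""
    node = root
    for item in sorted(set(transaction)):  # fixed item order (an assumption)
        child = node.children.get(item)
        if child is None:
            child = Node(item, node)
            node.children[item] = child
            index[item].append(child)
        child.count += 1  # number of transactions passing through this node
        node = child


def build_tree(transactions):
    """Single scan of the data: returns the root and the item -> nodes index table."""
    root = Node(None, None)
    index = defaultdict(list)
    for transaction in transactions:
        insert(root, index, transaction)
    return root, index


def path_to_root(node):
    """Reverse search: collect the items on the path strictly above `node`."""
    items = []
    node = node.parent
    while node is not None and node.item is not None:
        items.append(node.item)
        node = node.parent
    return items


def frequent_itemsets(index, min_support, max_len=3):
    """Count itemsets of size <= max_len from the node-to-root paths."""
    support = defaultdict(int)
    for item, nodes in index.items():
        for node in nodes:
            prefix = path_to_root(node)
            # Every itemset is counted at the nodes of its last item (in sort order).
            for k in range(max_len):
                for subset in combinations(prefix, k):
                    support[frozenset(subset) | {item}] += node.count
    return {s: c for s, c in support.items() if c >= min_support}


if __name__ == "__main__":
    root, index = build_tree([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]])
    for itemset, count in frequent_itemsets(index, min_support=2).items():
        print(sorted(itemset), count)
```

Because `insert` only extends the existing tree and index table, new transactions can be added later without rescanning the old ones, which is one plausible reading of the incremental property the abstract claims. The `max_len` cap reflects the abstract's focus on short and medium-length patterns; without it, the subset enumeration grows exponentially with path length.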
Source
《应用科学学报》
CAS
CSCD
2004, Issue 2, pp. 200-204 (5 pages)
Journal of Applied Sciences
Funding
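National Natural Science Foundation of China (30271048)
Jiangsu Province Ninth Five-Year Plan Key Research Project (BJ98017-1)
Jiangsu Province Tenth Five-Year Plan High-Technology Project (BJ2001013)
University Research Fund Key Project (X02-070-1(Z))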