Abstract
To further reduce the high CPU time and frequent I/O operations incurred when mining association rules from large databases, this paper presents an improved association rule mining algorithm (ARMAC). The algorithm introduces a directed acyclic graph (DAG) and a tidlist structure to speed up the computation of frequent itemsets, and it partitions the database into parts that each fit in main memory. This avoids the frequent disk operations that arise when mining large databases and makes the algorithm well suited to them. Drawing on the strengths of the continuous association rule mining algorithm (CARMA), ARMAC completes the mining process in only two scans of the database. Experimental results show that the algorithm achieves higher execution efficiency on large transaction databases.
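The abstract does not give ARMAC's details, so the following is only a minimal sketch of the two ideas it names: tidlists (for each itemset, the set of ids of the transactions that contain it) and partitioning so that each part fits in memory, with two scans overall. It follows the generic partition-based scheme, not the paper's actual method; the DAG structure and the CARMA-derived logic are omitted, and the function names `build_tidlists`, `local_frequent_itemsets`, and `mine` are hypothetical.

```python
from itertools import combinations


def build_tidlists(partition):
    """Map each item to the set of transaction ids (tids) that contain it."""
    tidlists = {}
    for tid, items in partition:
        for item in items:
            tidlists.setdefault(item, set()).add(tid)
    return tidlists


def local_frequent_itemsets(partition, min_support):
    """Itemsets frequent within one in-memory partition, via tidlist intersection."""
    min_count = min_support * len(partition)
    tidlists = build_tidlists(partition)
    # Level 1: frequent single items with their tidlists.
    level = {frozenset([i]): t for i, t in tidlists.items() if len(t) >= min_count}
    frequent = set(level)
    while level:
        next_level = {}
        for (a, ta), (b, tb) in combinations(level.items(), 2):
            candidate = a | b
            # Join two k-itemsets that differ in exactly one item.
            if len(candidate) == len(a) + 1 and candidate not in next_level:
                tids = ta & tb  # tidlist of the union is the intersection
                if len(tids) >= min_count:
                    next_level[candidate] = tids
        frequent |= set(next_level)
        level = next_level
    return frequent


def mine(transactions, num_partitions, min_support):
    """Two-scan mining: local candidates per partition, then global counting."""
    n = len(transactions)
    size = max(1, -(-n // num_partitions))  # ceiling division
    partitions = [transactions[i:i + size] for i in range(0, n, size)]

    # Scan 1: an itemset that is globally frequent must be locally frequent
    # in at least one partition, so the union is a complete candidate set.
    candidates = set()
    for part in partitions:
        candidates |= local_frequent_itemsets(part, min_support)

    # Scan 2: count the exact global support of every candidate.
    counts = dict.fromkeys(candidates, 0)
    for _, items in transactions:
        items = set(items)
        for c in candidates:
            if c <= items:
                counts[c] += 1
    return {c: k for c, k in counts.items() if k >= min_support * n}
```

Here `transactions` is assumed to be a list of `(tid, items)` pairs and `min_support` a fraction between 0 and 1. In a real setting each partition would be read from disk once per scan rather than sliced from an in-memory list.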
Source
Journal of Air Force Radar Academy (《空军雷达学院学报》)
2011, No. 3, pp. 205-208 (4 pages)
Keywords
data mining
frequent item sets
large database
directed acyclic graph (DAG)
association rules