Abstract
Association rule mining is an important topic in data mining research, and generating the frequent itemsets is the key problem affecting it. Among existing algorithms for finding frequent itemsets, DLG is an efficient one: by reducing the number of passes over the transaction database, it reduces the I/O cost of the mining process. This paper describes the principle and execution process of DLG and then, to further reduce the number of candidate itemsets, presents a revised algorithm based on DLG. Its main idea is to count the in-degree of each frequent item during the association graph construction phase and to use that count as the basis for pruning. Finally, performance analysis and comparison experiments are carried out, and the results indicate that the revised algorithm performs well.
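The abstract does not spell out the pruning rule, so the following is only a minimal sketch of how a DLG-style miner with an in-degree check might look, not the authors' implementation. It assumes the usual DLG layout: one bit vector per item (bit t set when transaction t contains the item), an association graph with an edge a -> b (a before b in a fixed item order) whenever {a, b} is frequent, and extension of a frequent k-itemset along the edges leaving its last item. The pruning condition used here is an assumption: an item u can be the last item of a frequent (k+1)-itemset only if at least k edges point to it, because every earlier item of that itemset must form a frequent pair with u. All function and variable names are hypothetical.

```python
from itertools import combinations


def build_bit_vectors(transactions):
    """Single database pass: bit t of vectors[item] is set when transaction t contains item."""
    vectors = {}
    for t, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << t)
    return vectors


def support(bits):
    """Support of an itemset is the number of 1 bits in its bit vector."""
    return bin(bits).count("1")


def mine_frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support} for all frequent itemsets (hypothetical helper)."""
    vectors = build_bit_vectors(transactions)
    # Frequent items, kept in a fixed order so that edges always point "forward".
    items = sorted(i for i, v in vectors.items() if support(v) >= min_support)

    # Association graph construction; the in-degree is counted in the same phase.
    edges = {a: [] for a in items}
    in_degree = {a: 0 for a in items}
    for a, b in combinations(items, 2):
        if support(vectors[a] & vectors[b]) >= min_support:
            edges[a].append(b)
            in_degree[b] += 1

    frequent = {}
    level = {(a,): vectors[a] for a in items}  # frequent k-itemsets with bit vectors
    k = 1
    while level:
        frequent.update({s: support(v) for s, v in level.items()})
        next_level = {}
        for itemset, bits in level.items():
            for u in edges[itemset[-1]]:
                # Assumed pruning rule: u needs at least k incoming edges
                # to close a frequent (k+1)-itemset, so skip it otherwise.
                if in_degree[u] < k:
                    continue
                joined = bits & vectors[u]  # AND of bit vectors, no database rescan
                if support(joined) >= min_support:
                    next_level[itemset + (u,)] = joined
        level = next_level
        k += 1
    return frequent


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    result = mine_frequent_itemsets(db, min_support=2)
    print(result[("a", "b", "c")])  # 2 (transactions 0 and 4)
```

The in-degree test is evaluated before any bit-vector AND, so a rejected candidate costs only one dictionary lookup. The database itself is read once, when the bit vectors are built.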
Source
《山东大学学报(工学版)》
CAS
2004, No. 1, pp. 99-103 (5 pages)
Journal of Shandong University (Engineering Science)
Keywords
association rules
association graph
bit vector