Abstract
Building on a detailed analysis of the graph-based association rule mining algorithm introduced in reference [1], this paper proposes a new association rule mining algorithm based on complete sub-graphs. The algorithm exploits the correspondence between complete sub-graphs and frequent (large) itemsets: it uses the degree of the nodes in a complete sub-graph as the criterion for skipping comparisons between items that need not be compared. In addition, by assigning an order value to each node of the relation graph, it avoids repeated comparisons of the same itemset. As a result, when searching for frequent k-itemsets (k ≥ 3), the new algorithm takes far less than 1/(k-1) of the time required by the original algorithm. The algorithm therefore needs less storage space and mines faster, which improves mining efficiency.
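The abstract gives no implementation details, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that nodes are frequent items, that an edge joins two items whose pair is frequent, and that a frequent k-itemset must therefore form a complete sub-graph (a k-clique) whose nodes each have degree of at least k-1. The names `support`, `build_graph`, and `frequent_k_itemsets`, the edge rule, and the way order values are assigned are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def build_graph(transactions, min_support):
    """Link two items when their pair is frequent (an assumed edge rule)."""
    items = sorted({i for t in transactions for i in t})
    frequent = [i for i in items if support({i}, transactions) >= min_support]
    # Assumed order values: the position of each item in sorted order.
    order = {item: n for n, item in enumerate(frequent)}
    adj = {item: set() for item in frequent}
    for a, b in combinations(frequent, 2):
        if support({a, b}, transactions) >= min_support:
            adj[a].add(b)
            adj[b].add(a)
    return adj, order


def frequent_k_itemsets(transactions, min_support, k):
    """Enumerate k-cliques with degree pruning, then confirm their support."""
    adj, order = build_graph(transactions, min_support)
    # Degree test: every node of a k-clique has at least k - 1 neighbours.
    nodes = {n for n in adj if len(adj[n]) >= k - 1}
    found = []

    def extend(clique, candidates):
        if len(clique) == k:
            # A clique is only a necessary condition, so check real support.
            if support(set(clique), transactions) >= min_support:
                found.append(tuple(clique))
            return
        for node in sorted(candidates, key=order.get):
            # Keep only later-ordered common neighbours, so each
            # itemset is generated exactly once.
            later = {c for c in candidates
                     if order[c] > order[node] and c in adj[node]}
            extend(clique + [node], later)

    extend([], nodes)
    return found
```

For example, with the transactions `{a,b,c}`, `{a,b,c}`, `{a,b,d}`, `{b,c}`, `{a,c,d}` and a minimum support of 2, item `d` has degree 1 in this graph and is discarded before any 3-itemset is examined, which is the kind of saving the degree criterion is meant to provide.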
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
PKU Core Journals (北大核心)
2006, No. 23, pp. 4475-4478, 4493 (5 pages)
Keywords
association rules
data mining
relation graph
complete sub-graph
large itemsets
degree