Abstract
Mining temporal data is a common problem in data mining, and the discovery of temporal association rules is an important topic in association rule research. Building on an in-depth study of cycle association rules, this paper formally defines the basic concepts of temporal association rules and proposes CCAR, an Apriori-based algorithm for discovering cycle association rules. The core idea of CCAR is to first cluster each item according to its distribution over the cycle time, and then use the clustering result to split each item into several dynamic valid time intervals. When Apriori is applied, the item set I is extended with each item's time intervals. The candidate set Ck is then generated from Lk-1 using a time-aware JOIN operation proposed by the authors, and a reduction operation removes candidate frequent itemsets from Ck that do not satisfy the conditions, which improves the efficiency of the algorithm. Both theoretical analysis and experiments indicate that CCAR is effective.
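The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CCAR implementation: the abstract does not specify the clustering method, the exact JOIN condition, or the reduction rule, so every function name here (`cluster_positions`, `overlap`, `support`, `covered`, `ccar_sketch`), the gap-based one-dimensional clustering, the "intervals must overlap" join condition, and the subset-coverage pruning check are assumptions chosen only to make the idea concrete. Transactions are assumed to be pairs of (position within the cycle, set of items).

```python
from itertools import combinations

def cluster_positions(positions, max_gap=1):
    """Assumed clustering: merge sorted cycle positions whose gap is <= max_gap."""
    intervals = []
    for p in sorted(set(positions)):
        if intervals and p - intervals[-1][1] <= max_gap:
            intervals[-1][1] = p          # extend the current interval
        else:
            intervals.append([p, p])      # start a new interval
    return [tuple(iv) for iv in intervals]

def overlap(a, b):
    """Intersection of two closed intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def support(itemset, interval, transactions):
    """Fraction of transactions inside `interval` that contain `itemset`."""
    inside = [items for pos, items in transactions
              if interval[0] <= pos <= interval[1]]
    if not inside:
        return 0.0
    return sum(itemset <= items for items in inside) / len(inside)

def covered(subset, interval, level):
    """True if `subset` is frequent on some interval that contains `interval`."""
    return any(s == subset and iv[0] <= interval[0] and interval[1] <= iv[1]
               for s, iv in level)

def ccar_sketch(transactions, min_sup=0.6, max_gap=1):
    # Step 1: cluster each item's cycle positions into time intervals.
    positions = {}
    for pos, items in transactions:
        for item in items:
            positions.setdefault(item, []).append(pos)
    # Step 2: extend the item set with (item, interval) pairs and build L1.
    level = {}
    for item, ps in positions.items():
        for iv in cluster_positions(ps, max_gap):
            s = support(frozenset([item]), iv, transactions)
            if s >= min_sup:
                level[(frozenset([item]), iv)] = s
    frequent, k = dict(level), 2
    while level:
        # Step 3: time-aware join (assumed rule: intervals must overlap).
        candidates = set()
        for (a, iv_a), (b, iv_b) in combinations(level, 2):
            union, iv = a | b, overlap(iv_a, iv_b)
            if len(union) != k or iv is None:
                continue
            # Step 4: reduction (assumed heuristic: every (k-1)-subset must
            # be frequent on an interval covering the candidate interval).
            if all(covered(frozenset(sub), iv, level)
                   for sub in combinations(union, k - 1)):
                candidates.add((union, iv))
        level = {}
        for union, iv in candidates:
            s = support(union, iv, transactions)
            if s >= min_sup:
                level[(union, iv)] = s
        frequent.update(level)
        k += 1
    return frequent

# Hypothetical weekly data: positions 0-6 stand for days of the week.
transactions = [
    (0, {"milk", "bread"}), (1, {"milk", "bread"}), (1, {"milk"}),
    (5, {"beer", "chips"}), (6, {"beer", "chips"}), (6, {"beer"}),
]
result = ccar_sketch(transactions)
```

In this toy data, items that only occur early in the cycle are never joined with items that only occur late in the cycle, because their intervals do not overlap. That is the kind of candidate reduction the abstract attributes to the time-aware JOIN, although the paper's actual conditions may differ.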
Source
Computer Simulation (《计算机仿真》)
CSCD
2005, Issue 7, pp. 36-39, 70 (5 pages)
Keywords
Association rule
Clustering
Cycle association rule