Abstract
The Apriori association rule mining algorithm mines only a single class of correlated datasets. Real-world datasets, however, are varied and very large, and mining across uncorrelated datasets to enlarge the set of rules remains challenging. Current research on Apriori association rule algorithms concentrates mainly on optimizing performance and on handling different data forms; it has not broken through the boundary of uncorrelated datasets. To address this problem, the paper first defines correlated datasets, uncorrelated datasets, and compatible datasets. It then proposes an Apriori-based deductive algorithm for association rules among compatible datasets within uncorrelated datasets, states the deduction rules of the algorithm, and proves its correctness by construction. Examples demonstrate how the method is applied. The algorithm deduces new rules from association rules mined with Apriori on compatible datasets, which ordinary data mining algorithms cannot do, and thus extends the application field of association rule algorithms. In addition, because the association rules are mined independently on each compatible dataset and no raw data is exchanged, the approach provides privacy protection to a certain extent.
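As background for the abstract, the following is a minimal sketch of standard Apriori rule mining applied separately to two datasets, so that only the mined rules, not the raw transactions, would need to be shared. It is not the paper's deductive algorithm: the abstract does not state the deduction rules, so no deduction step is shown. The function names, toy datasets, and thresholds are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    # Level 1 candidates: every single item that occurs.
    candidates = {frozenset([item]) for t in tx for item in t}
    frequent = {}
    k = 1
    while candidates:
        level = {}
        for c in candidates:
            support = sum(1 for t in tx if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(transactions, min_support, min_confidence):
    """Returns (antecedent, consequent, support, confidence) tuples."""
    freq = frequent_itemsets(transactions, min_support)
    rules = []
    for itemset, support in freq.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself in freq.
                confidence = support / freq[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical toy datasets held by two different parties.
dataset_a = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread"}, {"milk", "eggs"}]
dataset_b = [{"milk", "cereal"}, {"milk", "cereal"}, {"cereal"}, {"milk"}]

# Each party mines its own data; only the resulting rules leave the site.
rules_a = association_rules(dataset_a, min_support=0.5, min_confidence=0.6)
rules_b = association_rules(dataset_b, min_support=0.5, min_confidence=0.6)
```

Any cross-dataset deduction in the sense of the paper would then operate on `rules_a` and `rules_b` alone, which is what makes the privacy argument in the abstract plausible.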
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2013, No. 10, pp. 2796-2800 (5 pages)
Funding
National Natural Science Foundation of China (61261025)
Natural Science Foundation of Inner Mongolia (2012MS0913)