Funding: This research was supported by the National Natural Science Foundation of China under Grant No. 61772280, by the China Special Fund for Meteorological Research in the Public Interest under Grant GYHY201306070, and by the Jiangsu Province Innovation and Entrepreneurship Training Program for College Students under Grant No. 201810300079X.
Abstract: The Apriori algorithm is widely used in traditional association rule mining to search for high-frequency patterns. Correlation rules are then obtained by testing the correlation of the item sets, but this approach tends to ignore association rules with low support and high correlation. To address this problem, some scholars have proposed a positive correlation coefficient based on the Phi correlation, which avoids this shortcoming of the Apriori algorithm and can mine item sets with low support but high correlation. Although that algorithm prunes the search space, it does not noticeably reduce the running time on large data sets, and the correlation pairs it returns can be meaningless. This paper presents an improved algorithm for mining new association rules based on the interestingness of correlation pairs, using an upper bound on the interestingness of supersets to prune the search space. It greatly reduces the running time and filters out meaningless correlation pairs according to redundancy constraints. Compared with the algorithm based on the Phi correlation coefficient, the new algorithm significantly reduces the running time and prunes redundant correlation pairs from the result, thereby improving mining efficiency and accuracy.
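To make the "low support, high correlation" case concrete, the sketch below computes the standard Phi correlation coefficient for item pairs from their supports. It illustrates only the Phi-based baseline that the abstract mentions. The paper's interestingness measure, its upper bound, and its redundancy constraints are not specified in the abstract and are not reproduced here. The function names, the brute-force pair enumeration, and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def phi(supp_a, supp_b, supp_ab):
    """Phi correlation of two binary items, computed from their supports."""
    denom = sqrt(supp_a * supp_b * (1 - supp_a) * (1 - supp_b))
    if denom == 0:
        return 0.0  # an item in all (or no) transactions has no variance
    return (supp_ab - supp_a * supp_b) / denom


def correlated_pairs(transactions, min_phi):
    """Return {(a, b): phi} for pairs with phi >= min_phi (brute force, no pruning)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    supp = {i: sum(i in t for t in transactions) / n for i in items}
    result = {}
    for a, b in combinations(items, 2):
        supp_ab = sum(a in t and b in t for t in transactions) / n
        value = phi(supp[a], supp[b], supp_ab)
        if value >= min_phi:
            result[(a, b)] = value
    return result


# Toy data (made up): "caviar" and "vodka" are rare but always occur together.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"caviar", "vodka"},
    {"bread", "milk", "eggs"},
]
pairs = correlated_pairs(transactions, min_phi=0.5)
```

In this toy data the pair (caviar, vodka) has a support of only 0.1, so a typical minimum-support threshold would discard it, yet it is the only pair that passes the correlation threshold. The frequent pair (bread, milk) has a support of 0.5 but is almost uncorrelated. The enumeration here checks every pair, which is the cost that a pruning bound such as the one described in the abstract is meant to avoid.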