Abstract
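Association rules can uncover hidden relationships among transactions in large data sets, and the Apriori algorithm is one of the more effective methods for association rule analysis. However, Apriori generates candidate itemsets inefficiently and scans the data too frequently, so the computation takes a long time. In addition, the initially defined minimum support and minimum confidence are not sufficient to filter out useless association rules. To address these problems, an improved association rule algorithm based on probabilistic transaction compression is proposed on top of the original Apriori algorithm, using probability theory and effective parameter settings. Numerical results show that, after the second iteration, the new algorithm substantially reduces the number of inefficient candidate itemsets and thereby improves on the efficiency of the classical Apriori algorithm.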
Association rule mining finds interesting hidden associations in large volumes of raw data, and the Apriori algorithm is an effective way to find these associations. However, the original Apriori algorithm has drawbacks, such as inefficient candidate itemset generation and frequent data scans. The minimum support and confidence thresholds alone are not enough to filter out useless association rules. To remedy these deficiencies, this paper proposes an improved algorithm based on marked transaction compression, using probability and parameter settings. Experiments show that the improved algorithm performs much better than the original Apriori algorithm: after the second iteration, the candidate sets are reduced by 50%, which makes it more efficient than the original.
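For readers unfamiliar with the baseline, the following is a minimal sketch of the classical Apriori procedure that the abstract refers to, not the improved algorithm proposed in the paper. The function names `apriori` and `rules`, the thresholds, and the toy transactions are illustrative assumptions. The sketch shows where the two costs named above arise: candidates are built level by level, and every candidate triggers another pass over the transactions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classical Apriori sketch: return {frequent itemset: support}."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Each call scans every transaction; this repeated scanning is
        # the cost the abstract attributes to the original algorithm.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        s = support(single)
        if s >= min_support:
            current[single] = s

    frequent = dict(current)
    k = 2
    while current:
        prev = list(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_confidence):
    """Derive rules (lhs, rhs, support, confidence) from frequent itemsets."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Subsets of a frequent itemset are frequent, so the
                # lookup of the antecedent's support always succeeds.
                confidence = s / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, s, confidence))
    return found


if __name__ == "__main__":
    # Toy data, assumed for illustration only.
    demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    freq = apriori(demo, min_support=0.5)
    for lhs, rhs, s, conf in rules(freq, min_confidence=0.7):
        print(sorted(lhs), "->", sorted(rhs), round(s, 2), round(conf, 2))
```

In this baseline, the only filters on rules are the two thresholds `min_support` and `min_confidence`, which is the limitation the abstract describes.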
Authors
孙帅
刘子龙
SUN Shuai; LIU Zi-long (School of Optical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source
《软件导刊》
2019, No. 9, pp. 85-87, 92 (4 pages in total)
Software Guide
Funding
National Natural Science Foundation of China (61074087)