Abstract
Efficient mining of frequent itemsets is key to generating association rules. To address the bottlenecks of the classic Apriori algorithm, an improved algorithm is proposed. It stores itemset information in an array structure, so the database needs to be scanned only once, which reduces time overhead. Items are counted before the self-join step, which reduces the number of itemsets taking part in the join and therefore the number of candidate itemsets. A worked example shows that the improved algorithm is more efficient.
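The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract does not give the details. It assumes that the "array structure" can be modelled as one bitmask of transaction IDs per item, built in a single pass over the database, and that "item counting before the self-join" means dropping any frequent (k-1)-itemset that contains an item occurring in fewer than k-1 of the frequent (k-1)-itemsets. The function name `mine_frequent_itemsets` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations


def support(mask):
    """Number of transactions whose bit is set in the mask."""
    return bin(mask).count("1")


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Assumption: items are mutually comparable (e.g. all strings).
    """
    # Single database scan: record, per item, a bitmask of the
    # transactions that contain it (bit `tid` is set for transaction tid).
    tid_masks = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tid_masks[item] = tid_masks.get(item, 0) | (1 << tid)

    # Frequent 1-itemsets, stored as sorted tuples mapped to their masks.
    current = {(item,): mask for item, mask in tid_masks.items()
               if support(mask) >= min_support}
    frequent = {}
    k = 1
    while current:
        for itemset, mask in current.items():
            frequent[frozenset(itemset)] = support(mask)
        k += 1

        # Item counting before the join: an item in a frequent k-itemset
        # must appear in at least k-1 frequent (k-1)-itemsets, so itemsets
        # containing a rarer item cannot contribute to any candidate.
        item_counts = Counter(i for itemset in current for i in itemset)
        survivors = sorted(s for s in current
                           if all(item_counts[i] >= k - 1 for i in s))

        next_level = {}
        for a, b in combinations(survivors, 2):
            if a[:-1] != b[:-1]:
                continue  # self-join requires a shared (k-2)-prefix
            candidate = a + (b[-1],)
            # Standard Apriori check: every (k-1)-subset must be frequent.
            if all(sub in current for sub in combinations(candidate, k - 1)):
                # Support comes from the stored masks, not a new scan.
                mask = current[a] & current[b]
                if support(mask) >= min_support:
                    next_level[candidate] = mask
        current = next_level
    return frequent
```

For example, calling `mine_frequent_itemsets([["A", "C", "D"], ["B", "C", "E"], ["A", "B", "C", "E"], ["B", "E"]], 2)` reads the transactions once. When building 3-itemsets, item `A` occurs in only one frequent 2-itemset, so `("A", "C")` is removed before the join.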
Source
Journal of Harbin University of Commerce: Natural Sciences Edition (《哈尔滨商业大学学报(自然科学版)》)
CAS
2011, No. 5, pp. 705-708 (4 pages)
Funding
Heilongjiang Provincial Department of Education project (11541083)