Abstract
FP-growth is the most widely used algorithm in association rule mining. Its main difference from the classical Apriori algorithm is that it does not need to generate candidate itemsets, which greatly improves mining efficiency. However, its pattern tree, the FP-tree, is constructed from the entire transaction database, so on large databases or under strict mining constraints the algorithm occupies a large amount of memory. It also relies on recursive calls, which lowers its execution efficiency. Building on a study of FP-growth, this paper proposes an improved algorithm that changes the FP-tree structure and divides a single FP-tree into multiple subtrees for mining frequent patterns, which reduces memory usage, improves execution efficiency, and makes mining easier.
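For orientation, the standard two-pass FP-tree construction that the abstract refers to can be sketched roughly as follows. This is a minimal illustration of the textbook structure, not the improved algorithm proposed in the paper, whose details the abstract does not give. The names `FPNode`, `build_fp_tree`, and `count_nodes` are hypothetical and chosen only for this example.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and its children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a textbook FP-tree (hypothetical helper, not the paper's variant)."""
    # Pass 1: count the support of every item across all transactions.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode()
    # Pass 2: insert each transaction, keeping only frequent items,
    # ordered by descending support (ties broken by item name).
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Count all nodes in the tree, including the root."""
    return 1 + sum(count_nodes(c) for c in node.children.values())


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["d"]]
root, frequent = build_fp_tree(transactions, min_support=2)
```

Because the whole tree is built before any mining starts, its size grows with the database, which is the memory cost the abstract describes. In this sketch the children of `root` are the natural top-level subtrees of the tree. Whether the paper partitions the tree along these lines is not stated in the abstract.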
Source
Computer & Network (《计算机与网络》)
2016, No. 24, pp. 58-61 (4 pages)