Abstract
Building an FP-tree with the classic FP-Growth algorithm requires two scans of the transaction set when mining frequent patterns, which both lowers the algorithm's efficiency and places a burden on the database server. This paper proposes an improvement to the classic FP-Growth algorithm based on a two-dimensional table. By recording item frequencies in a two-dimensional vector, the improved algorithm needs to traverse the transaction set only once, and it can therefore skip the first traversal of the conditional pattern base that FP-Growth performs when generating a new conditional FP-tree, greatly reducing the time needed to build the FP-tree. Experimental results show that the improved algorithm outperforms the classic algorithm.
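The abstract does not give the layout of the two-dimensional table, so the following is only a minimal sketch of the general idea, under the assumption that the table stores pairwise co-occurrence counts of items. The function names build_pair_table and conditional_frequent_items are hypothetical and are not taken from the paper. Under this assumption, the support of an item inside another item's first-level conditional pattern base can be read from the table instead of being counted in a separate pass.

```python
from collections import defaultdict
from itertools import combinations


def build_pair_table(transactions):
    """Single pass over the transactions: count items and item pairs.

    Assumption: the "two-dimensional table" is modeled here as a
    symmetric co-occurrence count, pair_count[a][b].
    """
    item_count = defaultdict(int)
    pair_count = defaultdict(lambda: defaultdict(int))
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates within a transaction
        for a in items:
            item_count[a] += 1
        for a, b in combinations(items, 2):
            pair_count[a][b] += 1
            pair_count[b][a] += 1
    return item_count, pair_count


def conditional_frequent_items(item_count, pair_count, item, min_support):
    """Frequent items in the first-level conditional pattern base of `item`.

    Only items ranked before `item` in descending-frequency order can
    appear on its prefix paths, so only those are considered. The counts
    come from the table; no extra scan of the pattern base is made.
    """
    def rank(x):
        # Descending frequency, ties broken by item name.
        return (-item_count[x], x)

    return {
        other: count
        for other, count in pair_count[item].items()
        if count >= min_support and rank(other) < rank(item)
    }


if __name__ == "__main__":
    transactions = [
        ["a", "b"],
        ["b", "c", "d"],
        ["a", "b", "d"],
        ["a", "b", "c"],
        ["b", "d"],
    ]
    item_count, pair_count = build_pair_table(transactions)
    print(conditional_frequent_items(item_count, pair_count, "d", 2))
```

This sketch covers only conditional pattern bases derived directly from the full FP-tree. Deeper recursion levels would need a corresponding table per conditional tree, and the table costs memory quadratic in the number of distinct items; how the paper handles either point is not stated in the abstract.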
Source
《计算机工程与设计》
CSCD
PKU Core Journal
2010, Issue 7, pp. 1506-1509 (4 pages)
Computer Engineering and Design
Keywords
data mining
association rule
frequent patterns
frequent item set
FP tree