Frequent Pattern mining plays an essential role in data mining. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especia...Frequent Pattern mining plays an essential role in data mining. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns.In this study, we introduce a novel frequent pattern growth (FP-growth)method, which is efficient and scalable for mining both long and short frequent patterns without candidate generation. And build a new project frequent pattern growth (PFP-tree)algorithm on this study, which not only heirs all the advantages in the FP-growth method, but also avoids it's bottleneck in database size dependence. So increase algorithm's scalability efficiently.展开更多
Frequent pattern mining plays an essential role in many important data mining tasks. FP-growth is a veryefficient algorithm for frequent pattern mining. However, it still suffers from creating conditional FP-tree sepa...Frequent pattern mining plays an essential role in many important data mining tasks. FP-growth is a veryefficient algorithm for frequent pattern mining. However, it still suffers from creating conditional FP-tree separatelyand recursively during the mining process. In this paper, we propose a new algorithm, called Least-Item-First Pat-tern Growth (LIFPG), for mining frequent patterns. LIFPG mines frequent patterns directly in Trans-tree withoutusing any additional data structures. The key idea is that least items are always considered first when the current pat-tern growth. By this way, conditional sub-tree can be created directly in Trans-tree by adjusting node-links and re-counting counts of some nodes. Experiments show that, in comparison with FP-Growth, our algorithm is about fourtimes faster and saves half of memory; it also has good time and space scalability with the number of transactions,and has an excellent performance in dense dataset mining as well.展开更多
文摘Frequent Pattern mining plays an essential role in data mining. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns.In this study, we introduce a novel frequent pattern growth (FP-growth)method, which is efficient and scalable for mining both long and short frequent patterns without candidate generation. And build a new project frequent pattern growth (PFP-tree)algorithm on this study, which not only heirs all the advantages in the FP-growth method, but also avoids it's bottleneck in database size dependence. So increase algorithm's scalability efficiently.
文摘Frequent pattern mining plays an essential role in many important data mining tasks. FP-growth is a veryefficient algorithm for frequent pattern mining. However, it still suffers from creating conditional FP-tree separatelyand recursively during the mining process. In this paper, we propose a new algorithm, called Least-Item-First Pat-tern Growth (LIFPG), for mining frequent patterns. LIFPG mines frequent patterns directly in Trans-tree withoutusing any additional data structures. The key idea is that least items are always considered first when the current pat-tern growth. By this way, conditional sub-tree can be created directly in Trans-tree by adjusting node-links and re-counting counts of some nodes. Experiments show that, in comparison with FP-Growth, our algorithm is about fourtimes faster and saves half of memory; it also has good time and space scalability with the number of transactions,and has an excellent performance in dense dataset mining as well.