Abstract
Frequent itemset mining is an important part of data mining, and traditional approaches commonly rely on the Apriori and FP-growth algorithms. In practice, these traditional algorithms often cannot be applied to databases that are updated frequently. The IMBT data structure makes it possible to mine frequent itemsets from a continuously updated database, but it leads to insufficient storage space and low running efficiency. Incremental data mining based on MapReduce can effectively address these problems. A comparison of the running times of MapReduce-based incremental data mining and traditional incremental data mining shows that the MapReduce-based approach is more efficient.
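The idea of counting itemsets in a map step, summing them in a reduce step, and folding each new batch into previously stored counts can be illustrated with a small single-process sketch. This is not the paper's method and does not use the IMBT structure or a real MapReduce framework; the function names (`map_phase`, `reduce_phase`, `update_counts`, `frequent_itemsets`) and the `max_size` and `min_support` parameters are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(transactions, max_size=2):
    """Map step (illustrative): emit (itemset, 1) for every itemset
    of up to max_size items found in each transaction."""
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                yield itemset, 1


def reduce_phase(pairs):
    """Reduce step (illustrative): sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def update_counts(stored_counts, new_transactions, max_size=2):
    """Incremental step (assumption): scan only the new batch and merge
    its counts into the stored ones, without rescanning old data."""
    batch_counts = reduce_phase(map_phase(new_transactions, max_size))
    return stored_counts + batch_counts


def frequent_itemsets(counts, total_transactions, min_support):
    """Keep itemsets whose count reaches min_support * total_transactions."""
    threshold = min_support * total_transactions
    return {s: c for s, c in counts.items() if c >= threshold}


# Usage: an initial load followed by one incremental update.
initial = [["a", "b"], ["a", "c"], ["a", "b", "c"]]
update = [["b", "c"], ["a", "b"]]
counts = update_counts(Counter(), initial)
counts = update_counts(counts, update)
result = frequent_itemsets(counts, len(initial) + len(update), min_support=0.5)
```

In this sketch the stored counts cover every candidate itemset up to `max_size`, so an itemset that only becomes frequent after an update is still found; the cost is memory that grows with the number of candidates, which is the kind of storage pressure the abstract attributes to incremental approaches.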
Source
Microcomputer & Its Applications (《微型机与应用》)
2014, No. 1, pp. 67-70 (4 pages)