Abstract
This paper analyzes the relationship between maximal frequent itemsets and the complete set of frequent itemsets, and proposes DFMFI-Miner (the Miner Based on Depth-First Searching for Mining Maximal Frequent Itemsets), an efficient algorithm for mining maximal frequent itemsets. The algorithm searches the itemset space depth-first, represents the transaction database as vertical bitmaps with a compression method that also reduces the database, and applies several pruning and optimization strategies to shrink the search space and the number of candidate itemsets. Experiments on multiple datasets show that the algorithm is especially well suited to datasets with long frequent itemsets.
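To make the ingredients named in the abstract concrete, the following is a minimal generic sketch of depth-first maximal frequent itemset mining over vertical bitmaps. It is not the DFMFI-Miner algorithm itself: the abstract does not specify its compression scheme or its pruning rules, so the sketch assumes uncompressed bitmaps stored as Python integers and a single superset-based pruning check. All names (`build_vertical_bitmaps`, `mine_maximal`, `popcount`) are hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence


def popcount(bits: int) -> int:
    """Count the set bits, i.e. the number of supporting transactions."""
    return bin(bits).count("1")


def build_vertical_bitmaps(transactions: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Map each item to an integer whose bit t is set when transaction t contains it."""
    bitmaps: Dict[str, int] = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def mine_maximal(transactions: Sequence[Sequence[str]],
                 min_support: int) -> List[FrozenSet[str]]:
    """Return the maximal frequent itemsets (assumed illustrative procedure)."""
    bitmaps = build_vertical_bitmaps(transactions)
    # Keep frequent items only, ordered by increasing support (a common heuristic).
    items = sorted(
        (i for i, b in bitmaps.items() if popcount(b) >= min_support),
        key=lambda i: (popcount(bitmaps[i]), i),
    )
    maximal: List[FrozenSet[str]] = []

    def is_subsumed(candidate: FrozenSet[str]) -> bool:
        return any(candidate <= known for known in maximal)

    def dfs(head: FrozenSet[str], head_bits: int, tail: List[str]) -> None:
        # Support of head + item is the popcount of the AND of the bitmaps.
        frequent_tail = []
        for item in tail:
            bits = head_bits & bitmaps[item]
            if popcount(bits) >= min_support:
                frequent_tail.append((item, bits))
        # Pruning: if head plus every remaining extension is already inside a
        # known maximal itemset, nothing new can be found in this subtree.
        if is_subsumed(head | frozenset(i for i, _ in frequent_tail)):
            return
        if not frequent_tail:
            if head:  # skip the empty itemset at the root
                maximal.append(head)
            return
        for idx, (item, bits) in enumerate(frequent_tail):
            rest = [i for i, _ in frequent_tail[idx + 1:]]
            dfs(head | {item}, bits, rest)

    all_transactions = (1 << len(transactions)) - 1
    dfs(frozenset(), all_transactions, items)
    return maximal
```

For example, calling `mine_maximal` on the five transactions `abc`, `abc`, `ab`, `cd`, `acd` with `min_support=2` yields the two itemsets `{a, b, c}` and `{c, d}`. Intersecting bitmaps with a bitwise AND is what makes the vertical layout attractive for long itemsets: each extension costs one AND and one bit count instead of a database scan.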
Source
Computer Simulation (计算机仿真)
CSCD
2006, Issue 7, pp. 79-83 (5 pages)
Funding
National Basic Research Development Fund (973 Program, G1999032701)
Natural Science Foundation of Jiangsu Province (BK2002091)
Keywords
Data mining
Depth-first search
Frequent itemsets
Maximal frequent itemsets