Abstract
To address the shortcomings of existing algorithms for mining frequent closed itemsets, this paper proposes MFCIS, an algorithm for mining frequent closed itemsets based on bit operations. The algorithm first converts the dataset into a Boolean matrix in which each item is represented as a bit vector, so the dataset is scanned only once. It then computes itemset support with bit operations and stores auxiliary information in a matrix and an array, reducing time and memory consumption. While generating frequent closed itemsets by depth-first search, it applies pruning strategies to further reduce mining time. Finally, it uses a property of syngenetic itemsets to test closure, so no superset or subset checks are needed. Theoretical analysis and experimental results verify the effectiveness of the algorithm.
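The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's MFCIS implementation: the function names (`build_bit_vectors`, `support`, `mine_closed_itemsets`) are hypothetical, Python integers stand in for the rows of the Boolean matrix, and the closure test is a standard prefix-preserving check swapped in because the abstract does not define the syngenetic-itemset property. It shares only the stated goals: one scan of the data, support counting by bitwise AND, depth-first search with support pruning, and no superset or subset lookups.

```python
from typing import Dict, FrozenSet, List, Set


def build_bit_vectors(transactions: List[Set[str]]) -> Dict[str, int]:
    """Single scan: bit t of vectors[item] is set when transaction t contains item."""
    vectors: Dict[str, int] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(tids: int) -> int:
    """Support is the number of set bits in the transaction-id bit vector."""
    return bin(tids).count("1")


def mine_closed_itemsets(
    transactions: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    vectors = build_bit_vectors(transactions)
    # Keep only frequent items, in a fixed order used by the search.
    items = sorted(i for i, bits in vectors.items() if support(bits) >= min_support)
    closed: Dict[FrozenSet[str], int] = {}

    def closure(tids: int) -> FrozenSet[str]:
        # Every frequent item that occurs in all transactions of `tids`.
        return frozenset(i for i in items if (vectors[i] & tids) == tids)

    def dfs(current: FrozenSet[str], tids: int, core: int) -> None:
        if current:
            closed[current] = support(tids)
        for pos in range(core + 1, len(items)):
            item = items[pos]
            if item in current:
                continue
            new_tids = tids & vectors[item]  # bitwise AND intersects tid sets
            if support(new_tids) < min_support:
                continue  # prune: no extension of this branch can be frequent
            candidate = closure(new_tids)
            # Assumed stand-in for the paper's closure test: skip the candidate
            # if its closure pulls in an earlier item, because that closed set
            # is generated on another branch. No superset/subset lookup needed.
            if any(e in candidate and e not in current for e in items[:pos]):
                continue
            dfs(candidate, new_tids, pos)

    all_tids = (1 << len(transactions)) - 1
    if len(transactions) >= min_support:
        dfs(closure(all_tids), all_tids, -1)
    return closed


example = [{"a", "c", "d"}, {"b", "c", "e"}, {"a", "b", "c", "e"}, {"b", "e"}]
result = mine_closed_itemsets(example, min_support=2)
```

On this four-transaction example with a minimum support of 2, `result` maps each frequent closed itemset to its support. Items `b` and `e` always appear together, so neither `{b}` nor `{e}` is reported on its own; only `{b, e}` is.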
Source
《计算机应用研究》
CSCD
Peking University Core Journal
2013, Issue 11, pp. 3280-3282, 3286 (4 pages)
Application Research of Computers
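Funding
Sichuan Provincial Department of Science and Technology project (2011JY0141)
Sichuan Provincial Department of Education project (12ZB171)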
Keywords
data mining
frequent closed itemsets
matrix
bit operation
syngenetic itemsets