Abstract
Association analysis in data mining aims to discover interesting associations among large collections of data items, and its core problem is finding frequent itemsets. In traditional matrix-based association mining algorithms, the matrix size depends on the size of the transaction database, so memory bottlenecks persist when processing very large transaction databases. To address this problem, this paper proposes a frequent itemset discovery algorithm whose matrix size is independent of the transaction database size and that performs matrix-constrained pre-mining followed by verification. Experimental results show that the algorithm improves the speed of frequent itemset mining.
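The abstract does not give the algorithm's details, so the following is only a minimal sketch of one way a "pre-mine under matrix constraints, then verify" scheme could look, not the authors' method. It assumes an item-by-item co-occurrence matrix (whose size depends only on the number of distinct items, not on the number of transactions), uses that matrix to prune candidates, and verifies the surviving candidates with a database scan. The function name `frequent_itemsets` and the parameter `min_support` (an absolute count) are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Illustrative sketch only: the matrix layout and pruning rule are
    assumptions, not the algorithm from the paper.
    """
    items = sorted({item for t in transactions for item in t})
    index = {item: k for k, item in enumerate(items)}
    n = len(items)

    # n x n co-occurrence matrix; its size is independent of len(transactions).
    # matrix[a][a] holds the support of item a, matrix[a][b] (a < b) the
    # support of the pair {a, b}.
    matrix = [[0] * n for _ in range(n)]
    for t in transactions:
        ids = sorted({index[item] for item in t})
        for a in ids:
            matrix[a][a] += 1
        for a, b in combinations(ids, 2):
            matrix[a][b] += 1

    result = {}
    for a in range(n):
        if matrix[a][a] >= min_support:
            result[frozenset([items[a]])] = matrix[a][a]

    # Frequent pairs are read directly from the matrix.
    level = []
    for a in range(n):
        for b in range(a + 1, n):
            if matrix[a][b] >= min_support:
                result[frozenset([items[a], items[b]])] = matrix[a][b]
                level.append((a, b))

    while level:
        # Pre-mining: extend each frequent k-itemset with a larger item id,
        # keeping only extensions whose every pair is frequent in the matrix.
        candidates = set()
        for combo in level:
            for c in range(combo[-1] + 1, n):
                if all(matrix[x][c] >= min_support for x in combo):
                    candidates.add(combo + (c,))

        # Verification: one scan of the database counts the true support.
        counts = dict.fromkeys(candidates, 0)
        for t in transactions:
            ids = {index[item] for item in t}
            for cand in candidates:
                if ids.issuperset(cand):
                    counts[cand] += 1

        level = sorted(c for c, v in counts.items() if v >= min_support)
        for combo in level:
            result[frozenset(items[x] for x in combo)] = counts[combo]

    return result
```

In this sketch the pairwise constraint can only discard itemsets that cannot be frequent, so the verification scan is what decides the final result; the matrix merely reduces how many candidates the scan has to count.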
Source
《计算机工程与应用》
CSCD
Peking University Core Journals list
2011, Issue 21, pp. 133-136 (4 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (No. 60873104)
Henan Province Science and Technology Research Program (No. 092102210316)
Keywords
data mining
association analysis
frequent itemsets