Abstract
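Many fast algorithms for incrementally updating mined association rules have been proposed, but they often lose important rules when the problem is sensitive to newly added transactions. This paper therefore proposes EUFIA (Entirety Update Frequent Itemsets Algorithm), a new algorithm for mining frequent itemsets in an incrementally updated database. The algorithm first partitions the newly added transaction data and then quickly scans each partition, so that it mines the frequent itemsets they contain comprehensively and effectively without losing important rules. It also obtains the global frequent itemsets of the updated transaction database while scanning the original database at most once. The study shows that the algorithm has good scalability.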
Incremental association rule mining is an important topic in data mining. This study proposes a new algorithm, the Entirety Update Frequent Itemsets Algorithm (EUFIA), for efficient incremental mining of association rules from large transaction databases. Rather than rescanning the original database for each newly generated frequent itemset, EUFIA logically partitions the incremental database by unit time interval, then accumulates the occurrence counts of newly generated frequent itemsets and prunes clearly infrequent itemsets using a backward method. EUFIA can thus discover newly generated frequent itemsets more efficiently, and it needs to rescan the original database at most once, and only if necessary, to obtain the overall frequent itemsets of the final database. EUFIA showed good scalability in our simulation.
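The abstract gives only an outline of the method, so the following is a minimal sketch of the general idea it describes, not the authors' EUFIA procedure. It assumes a generic FUP-style update: each partition of the increment is scanned once, counts are accumulated, and the original database is rescanned at most once, and only for itemsets that were not frequent before. The function names, the `max_size` cap on itemset length, and the data layout are all hypothetical. The paper's classification into powerful, inferior, and weak frequent itemsets and its backward pruning step are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items (hypothetical helper)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def update_frequent_itemsets(original_db, old_frequent, partitions,
                             min_support, max_size=2):
    """Return {itemset: count} for itemsets frequent in original + increment.

    old_frequent must map every itemset that is frequent in original_db
    (up to max_size items) to its count there. partitions is a list of
    transaction lists, e.g. one list per unit time interval.
    """
    # Scan each partition of the increment once and accumulate the counts.
    inc_counts = Counter()
    inc_size = 0
    for partition in partitions:
        inc_counts.update(count_itemsets(partition, max_size))
        inc_size += len(partition)

    threshold = min_support * (len(original_db) + inc_size)
    result = {}

    # Previously frequent itemsets: the old count is known, so no rescan.
    for itemset, old_count in old_frequent.items():
        total = old_count + inc_counts.get(itemset, 0)
        if total >= threshold:
            result[itemset] = total

    # An itemset infrequent in both parts cannot be frequent overall, so
    # only itemsets frequent within the increment remain as candidates.
    candidates = [s for s, c in inc_counts.items()
                  if s not in old_frequent and c >= min_support * inc_size]

    # At most one rescan of the original database, and only if needed.
    if candidates:
        orig_counts = Counter()
        for transaction in original_db:
            items = set(transaction)
            for itemset in candidates:
                if items.issuperset(itemset):
                    orig_counts[itemset] += 1
        for itemset in candidates:
            total = orig_counts[itemset] + inc_counts[itemset]
            if total >= threshold:
                result[itemset] = total
    return result
```

In this sketch the rescan is skipped entirely when the increment produces no new candidates, which is one reading of the abstract's "at most once".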
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2007, No. 2, pp. 220-222, 233 (4 pages in total)
Computer Science
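Funding
National Natural Science Foundation of China (50474033)
Natural Science Foundation of Fujian Province (A0310008)
Key Project of the Fujian Province High-Tech Research Open Program (2003H043)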
Keywords
Association rules
Incremental updating
Powerful frequent itemsets
Inferior frequent itemsets
Weak frequent itemsets