Abstract
Discovering association rules in transaction databases is an important research area in data mining. Association rule mining usually proceeds in two steps: first find all frequent itemsets, then generate strong association rules from them. Apriori is the basic algorithm for finding frequent itemsets. It is simple and easy to implement, but it must scan the transaction database multiple times and generates a large number of candidate itemsets, which makes it inefficient. To address these shortcomings, the paper designs Apriori_T, an algorithm based on an itemset information table. By recording itemset information in a table, Apriori_T avoids repeated scans of the transaction database, reduces the system's I/O overhead, and improves the efficiency of finding frequent itemsets.
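The two-step process and the table-based idea can be illustrated with a minimal Python sketch. The abstract does not describe the layout of the paper's information table, so this sketch assumes a table that maps each itemset to the set of transaction IDs containing it. The function names `frequent_itemsets` and `strong_rules` are hypothetical, and the code should not be read as the paper's Apriori_T implementation.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support_count} for all frequent itemsets.

    Assumption: the "itemset information table" is modelled as a dict mapping
    each itemset to the set of transaction IDs that contain it. The paper's
    actual table layout is not given in the abstract.
    """
    # Single pass over the database: build the table for 1-itemsets.
    table = {}
    for tid, items in enumerate(transactions):
        for item in items:
            table.setdefault(frozenset([item]), set()).add(tid)

    # Keep only the frequent 1-itemsets.
    current = {s: tids for s, tids in table.items() if len(tids) >= min_support}
    result = {s: len(tids) for s, tids in current.items()}

    k = 2
    while current:
        next_level = {}
        for a, b in combinations(list(current), 2):
            candidate = a | b
            if len(candidate) != k or candidate in next_level:
                continue
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if any(frozenset(sub) not in current
                   for sub in combinations(candidate, k - 1)):
                continue
            # Support is read from the table instead of rescanning transactions.
            tids = current[a] & current[b]
            if len(tids) >= min_support:
                next_level[candidate] = tids
        result.update({s: len(t) for s, t in next_level.items()})
        current = next_level
        k += 1
    return result


def strong_rules(freq, min_confidence):
    """Derive rules (lhs, rhs, confidence) from the frequent itemsets."""
    rules = []
    for itemset, count in freq.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is frequent, so freq[lhs] exists.
                confidence = count / freq[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    freq = frequent_itemsets(db, min_support=3)
    for lhs, rhs, conf in strong_rules(freq, min_confidence=0.75):
        print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

In this sketch the transaction list is read once, and every later support count is obtained by intersecting transaction-ID sets already stored in the table. That is one plausible way to avoid repeated database scans. It trades I/O for the memory needed to hold the table.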
Source
《微计算机信息》 (Control & Automation)
2010, Issue 21, pp. 131-133 (3 pages)