Abstract
Data mining is an emerging discipline formed at the intersection of several fields; it uses a variety of analytical tools to discover models and relationships in massive data sets. Mining association rules in large transaction databases is an important research topic in data mining. This paper surveys the state of research on association rule mining, describes an implementation of the classical Apriori algorithm, analyzes and evaluates the algorithm, and points out its shortcomings and their causes. It then describes an algorithm that mines maximal frequent itemsets using an FP-tree and evaluates its performance on an example, concluding that the more potential maximal frequent patterns a database contains, the longer the running time.
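As a rough illustration of the two notions the abstract relies on (level-wise Apriori search and maximal frequent itemsets), here is a minimal Python sketch. It is not the paper's implementation and it does not build an FP-tree; it uses plain Apriori and then filters for maximal itemsets. The function names `apriori` and `maximal_itemsets`, the absolute-count `min_support` parameter, and the toy transactions are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Hypothetical helper for illustration: min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over all transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


if __name__ == "__main__":
    # Toy database, invented for this sketch.
    tx = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    freq = apriori(tx, min_support=3)
    for itemset in maximal_itemsets(freq):
        print(sorted(itemset), freq[itemset])
```

The count step rescans every transaction at each level, which is one commonly cited cost of Apriori; FP-tree based methods compress the database into a prefix tree to avoid those repeated scans.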
Source
Computer Technology and Development (《计算机技术与发展》)
2006, No. 5, pp. 21-25 (5 pages)
Keywords
data mining
association rules
frequent item sets
FP tree