Abstract
Association rule mining with a single minimum support (minsup) cannot reflect the differing frequencies and natures of individual data items. To address this problem, a new algorithm based on the frequent pattern tree (FP-Tree) is proposed, called MSDMFIA (Multiple minimum Supports for Discover Maximum Frequent Item sets Algorithm). The algorithm defines multiple minimum supports according to the characteristics of different data items. It mines the maximum frequent item sets in the database and discovers association rules by computing the support, in the database, of the maximum candidate frequent item sets. The algorithm addresses the rare item problem that often arises in association rule mining, and it removes two performance bottlenecks of traditional association rule mining algorithms: the generation of candidate frequent item sets and repeated scans of the database. Experimental results show that the proposed algorithm outperforms existing algorithms in both functionality and performance.
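The rare item problem and the effect of per-item minimum supports can be illustrated with a small sketch. This is not MSDMFIA itself: the abstract does not give the algorithm's definitions, so the sketch assumes a rule common in the multiple-minsup literature (an item set is frequent when its support reaches the smallest threshold among its members), and it enumerates item sets by brute force instead of building an FP-Tree. The data set, the thresholds, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: two common items and two rare ones.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
    frozenset({"milk"}),
    frozenset({"bread", "processor", "pan"}),
    frozenset({"milk", "processor", "pan"}),
]

# Hypothetical per-item minimum supports: lower thresholds for rare items.
MIS = {"bread": 0.4, "milk": 0.4, "processor": 0.15, "pan": 0.15}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, mis):
    """Brute-force enumeration of frequent item sets.

    Assumed rule (not taken from the paper): an item set is frequent when
    its support is at least the smallest per-item threshold of its members.
    """
    items = sorted(mis)
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            sup = support(candidate, transactions)
            if sup >= min(mis[i] for i in candidate):
                result[candidate] = sup
    return result


def maximal(frequent):
    """Keep only item sets that have no frequent proper superset."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other for other in frequent)
    }


# Multiple minimum supports versus one uniform threshold of 0.4.
multi = maximal(frequent_itemsets(TRANSACTIONS, MIS))
single = maximal(frequent_itemsets(TRANSACTIONS, {i: 0.4 for i in MIS}))
```

With the per-item thresholds, the rare pair of "processor" and "pan" survives as a maximum frequent item set alongside "bread" and "milk". With the single threshold of 0.4, only the "bread" and "milk" pair remains. Lowering a single threshold far enough to keep the rare items would also admit many uninteresting combinations of common items, which is the trade-off that multiple minimum supports are meant to avoid.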
Source
《哈尔滨工业大学学报》
EI
CAS
CSCD
Peking University Core Journals
2008, Issue 9, pp. 1447-1451 (5 pages)
Journal of Harbin Institute of Technology
Funding
National Natural Science Foundation of China (60871042)
National High Technology Research and Development Program of China (2003AA118010, 2007AA01Z179)