Abstract
Recent research on frequent-pattern mining has focused on finding representative patterns that compress a very large mining result set within a reasonable error tolerance. A novel heuristic algorithm, AMSA (Approximating Mining based Simulated Annealing), is proposed; it applies simulated annealing to ensure both the efficiency of the mining and the quality of the compression. Experiments on a public dataset provided by the Frequent Itemset Mining Implementations Repository (FIMI) support this claim. In performance comparisons with FPclose and RPglobal, AMSA produces a smaller result set than either algorithm. In particular, when the minimum support threshold (min_sup) is very low, RPglobal cannot produce a result set within a reasonable time, whereas AMSA still produces a concise and fairly accurate result set within a reasonable time.
Source
《北京航空航天大学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2009, Issue 5, pp. 640-643 (4 pages)
Journal of Beijing University of Aeronautics and Astronautics
Funding
National Basic Research Program of China (973 Program), grant 2005CB321902
Keywords
data mining
simulated annealing
heuristic method