Abstract
A new upper-bound pruning method based on local utility quality values is proposed, and a pseudo-projection technique is introduced to avoid physically constructing projected databases. Building on these two ideas, an improved algorithm, FHIMA-P, is proposed. Transaction merging and projected-transaction merging are then added to FHIMA-P, yielding the final FHIMA-MP algorithm. Experiments were conducted on the mushroom and accident datasets. The results show that FHIMA-P runs faster than FHIMA-ALL, and that FHIMA-MP is substantially more efficient than both. Moreover, under different parameter settings, the large number of mergeable transactions (and projected transactions) in the mushroom and accident datasets further demonstrates the effectiveness of transaction (and projected-transaction) merging.
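The two database-reduction ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `merge_transactions` and `pseudo_project`, the representation of a transaction as a sorted list of `(item, utility)` pairs, and the toy data are all assumptions made for illustration, following the general way these techniques are used in high-utility itemset mining.

```python
# Hypothetical sketch: transaction merging and pseudo projection.
# A transaction is assumed to be a list of (item, utility) pairs sorted by item.

def merge_transactions(transactions):
    """Merge transactions with identical item sequences by summing
    the utility of each item position (assumed merging rule)."""
    merged = {}
    for t in transactions:
        key = tuple(item for item, _ in t)
        utils = [u for _, u in t]
        if key in merged:
            merged[key] = [a + b for a, b in zip(merged[key], utils)]
        else:
            merged[key] = utils
    return [list(zip(key, utils)) for key, utils in merged.items()]


def pseudo_project(transactions, item):
    """Return (transaction index, offset) pointers to the suffix that
    follows `item`, instead of copying the suffix into a new database."""
    pointers = []
    for idx, t in enumerate(transactions):
        for pos, (it, _) in enumerate(t):
            if it == item:
                pointers.append((idx, pos + 1))
                break
    return pointers


# Toy database (made-up utilities, for illustration only).
db = [
    [("a", 5), ("b", 2), ("c", 1)],
    [("a", 3), ("b", 4), ("c", 2)],
    [("b", 1), ("c", 6)],
]
merged_db = merge_transactions(db)
projection_on_b = pseudo_project(merged_db, "b")
```

Here the first two transactions share the item sequence `a, b, c`, so they collapse into one transaction with summed utilities, and the projection on `b` is stored as index and offset pairs into `merged_db` rather than as copied suffixes.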
Authors
ZHANG Jian; LIU Shaotao (College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China)
Source
Journal of Huaqiao University (Natural Science)
Peking University Core Journal (北大核心)
2017, No. 6, pp. 880-885 (6 pages)
Funding
Major Project of the Fujian Provincial Science and Technology Program (2011H6016)
Keywords
frequent itemsets
high utility itemsets
pseudo projection
transaction merging