Abstract
This paper presents BF-Apriori, an improved Apriori algorithm based on binary numbers and the support distribution of items. The algorithm analyzes the probability distribution of the items, sorts the items in each itemset in descending order of probability, and encodes them by dimension as binary numbers, which reduces the cost of reading and storing the transaction database. It also uses slice operations and pruning to reduce the time complexity of rule mining. Experimental results show that BF-Apriori reduces storage overhead by about 50% and improves execution speed by more than 400%, raising both the storage efficiency and the computational speed of data mining.
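The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: items are ranked by descending support, each transaction is stored as one binary number, support is counted with bitwise AND, and candidates are pruned with the Apriori property. All function names here (encode_transactions, count_support, set_bits, frequent_itemsets, decode) are hypothetical, and the sketch does not attempt to reproduce the paper's slice operation or its RTR and TC transformations.

```python
from collections import Counter
from itertools import combinations


def encode_transactions(transactions):
    """Rank items by descending support and encode each transaction as an int bitmask.

    Assumption: the most frequent item gets the lowest bit; ties are broken by name.
    """
    support = Counter(item for t in transactions for item in set(t))
    order = sorted(support, key=lambda item: (-support[item], item))
    bit_of = {item: 1 << i for i, item in enumerate(order)}
    masks = [sum(bit_of[item] for item in set(t)) for t in transactions]
    return order, bit_of, masks


def count_support(mask, masks):
    """Count the transactions that contain every item encoded in `mask`."""
    return sum(1 for m in masks if m & mask == mask)


def set_bits(mask):
    """Yield each set bit of `mask` as a single-bit integer."""
    while mask:
        low = mask & -mask
        yield low
        mask ^= low


def frequent_itemsets(transactions, min_count):
    """Level-wise Apriori search over bitmask-encoded itemsets."""
    order, bit_of, masks = encode_transactions(transactions)
    current = {}
    for item in order:
        n = count_support(bit_of[item], masks)
        if n >= min_count:
            current[bit_of[item]] = n
    result = dict(current)
    k = 1
    while current:
        k += 1
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if bin(union).count("1") != k:
                continue  # the two itemsets do not share k-2 items
            # Apriori pruning: every (k-1)-subset must already be frequent.
            if all((union & ~bit) in current for bit in set_bits(union)):
                candidates.add(union)
        current = {}
        for cand in candidates:
            n = count_support(cand, masks)
            if n >= min_count:
                current[cand] = n
        result.update(current)
    return order, result


def decode(mask, order):
    """Translate a bitmask back into a frozenset of item names."""
    return frozenset(item for i, item in enumerate(order) if mask >> i & 1)
```

Calling frequent_itemsets(transactions, min_count) on a list of item lists returns the item ranking and a dictionary that maps each frequent itemset bitmask to its support count; decode turns a bitmask back into item names. Storing one integer per transaction instead of a list of item identifiers is what makes the containment test a single AND and comparison.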
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2011, No. 5, pp. 65-67, 70 (4 pages)
Funding
Supported by the Science and Technology Program of Zhejiang Province (Grant Nos. 2009C31066, 2008C21093)
Keywords
item support distribution
Reverse Transform on Row (RTR)
Transform on Column (TC)
slice operation
reverse coding