Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules (cited by: 2)
Authors: Cai-Yan Jia, Xie-Ping Gao. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2005, Issue 3, pp. 309-318 (10 pages)
One obstacle to efficient association rule mining is the explosive growth of data sets, since scanning a large database is costly or even impossible, especially when multiple passes are required. A popular way to improve the speed and scalability of association rule mining is to run the algorithm on a random sample instead of the entire database. However, how to define and efficiently estimate the error in the algorithm's output, and how to determine the required sample size, remain open problems. This paper gives an effective and efficient algorithm, based on PAC (Probably Approximately Correct) learning theory, to measure and estimate sample error. It then presents a new adaptive, on-line, fast sampling strategy, multi-scaling sampling, inspired by MRA (Multi-Resolution Analysis) and the Shannon sampling theorem, for quickly obtaining acceptably approximate association rules at an appropriate sample size. Both theoretical analysis and empirical study show that the sampling strategy achieves a very good speed-accuracy trade-off.
Keywords: data mining; association rule; frequent itemset; sample error; multi-scaling sampling
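The abstract does not spell out the paper's error measure or its multi-scaling schedule, so the following is only a generic sketch of the underlying idea: estimating an itemset's support from a random sample, with a PAC-style sample size taken from the standard two-sided Hoeffding bound. It is not the authors' algorithm. The function names (`hoeffding_sample_size`, `support`, `sampled_support`) and the toy transactions are assumptions made for illustration.

```python
import math
import random


def hoeffding_sample_size(epsilon, delta):
    """Sample size n such that, for ONE fixed itemset, the sample support
    differs from the true support by more than epsilon with probability
    at most delta (two-sided Hoeffding bound: n >= ln(2/delta) / (2 eps^2)).
    Assumption: this generic bound stands in for the paper's own measure."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def sampled_support(transactions, itemset, n, seed=0):
    """Estimate support from a uniform random sample of n transactions.
    Falls back to the exact support when n covers the whole database."""
    if n >= len(transactions):
        return support(transactions, itemset)
    rng = random.Random(seed)
    return support(rng.sample(transactions, n), itemset)


# Hypothetical toy database: item "a" occurs in every 4th transaction.
database = [frozenset({"a", "b"}) if i % 4 == 0 else frozenset({"b"})
            for i in range(20000)]

n = hoeffding_sample_size(epsilon=0.05, delta=0.05)
estimate = sampled_support(database, {"a"}, n)
print(n, round(estimate, 3))
```

The bound depends only on the tolerance `epsilon` and the failure probability `delta`, not on the database size, which is why sampling pays off on very large databases. Covering many itemsets at once would need a further correction (for example a union bound), which this sketch leaves out.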