
A New Two-Phase Sampling Algorithm
Abstract: Two-phase sampling algorithms extract sample data from massive data sets for use in data mining. Their efficiency drops when the data set is very large, and their sampling accuracy drops when the data set is both very large and sparse. This paper improves on the traditional two-phase sampling algorithm and proposes a new sampling algorithm, EAFAST, which tunes its parameters adaptively and makes full use of historical information to guide a heuristic search. Experiments show that EAFAST improves efficiency and sampling accuracy at the same time, remedying the shortcomings of the traditional algorithm.
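The abstract does not describe EAFAST's internals, so the following is only a minimal sketch of the generic two-phase idea it builds on, assuming a trimming-style scheme: phase 1 draws a large random sample and estimates item supports from it, and phase 2 greedily discards transactions until a smaller final sample remains whose item supports stay close to those estimates. The function names, the L1 distance, and the brute-force trimming loop are illustrative assumptions, not the paper's method; the adaptive parameter tuning and history-guided heuristic search of EAFAST are not reproduced here.

```python
import random
from collections import Counter


def item_supports(transactions):
    """Return the fraction of transactions that contain each item."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in set(t))
    return {item: c / n for item, c in counts.items()}


def distance(sample, target):
    """L1 distance between the sample's item supports and the target supports."""
    s = item_supports(sample) if sample else {}
    items = set(s) | set(target)
    return sum(abs(s.get(i, 0.0) - target.get(i, 0.0)) for i in items)


def two_phase_sample(transactions, initial_size, final_size, seed=0):
    """Hypothetical two-phase sampler (illustrative, not EAFAST)."""
    rng = random.Random(seed)

    # Phase 1: draw a large simple random sample and estimate item supports.
    initial = rng.sample(transactions, initial_size)
    target = item_supports(initial)

    # Phase 2: repeatedly drop the transaction whose removal keeps the
    # remaining sample closest to the phase-1 support estimates.
    sample = list(initial)
    while len(sample) > final_size:
        best = min(
            range(len(sample)),
            key=lambda k: distance(sample[:k] + sample[k + 1:], target),
        )
        sample.pop(best)
    return sample


# Toy data: each transaction is a list of item labels.
data = [["a", "b"], ["a"], ["b", "c"], ["a", "c"], ["a", "b", "c"], ["b"]] * 5
final = two_phase_sample(data, initial_size=20, final_size=8)
```

The trimming loop recomputes supports from scratch for every candidate removal, which is quadratic in the sample size. That cost is acceptable for a toy example but is exactly the kind of inefficiency on very large data sets that the abstract says motivates an improved algorithm.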
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, 2007, No. 7, pp. 64-66, 70 (4 pages)
Funding: Natural Science Foundation of Hubei Province (Grant No. 2006ABA082)
Keywords: sampling; two-phase; frequent itemset; pruning; accuracy

