
Research on Algorithms for Mining Frequent Items in Distributed Data Streams (cited by: 2)

Mining Frequent Item in Distributed Data Stream
Abstract In data stream mining, mining frequent itemsets is a fundamental and critical problem. Most existing algorithms, however, mine frequent itemsets from a single data stream; Apriori is the traditional algorithm for that setting. Mining frequent itemsets across multiple data streams introduces data and pattern redundancy, which places heavy demands on an algorithm's time and space efficiency. Building on Apriori and multi-threaded concurrency, this paper improves Apriori to obtain A-Apriori, an algorithm for mining frequent items in distributed data streams. It uses level-wise iteration and concurrent programming to mine frequent items when multiple data streams arrive simultaneously. Experimental results show that, while preserving mining accuracy, the proposed algorithm is more efficient than other algorithms for mining frequent items in distributed data streams.
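Authors Xiao Ying (肖颖), Mao Guojun (毛国君)
Source 《微计算机信息》 (Control & Automation), 2010, No. 30, pp. 144-145, 164 (3 pages in total)
Funding Applicant: Mao Guojun. Project: Research on integrated pattern mining models and concept drift detection algorithms for distributed data streams. Funding body: National Natural Science Foundation of China (60496322)
Keywords distributed data stream; frequent item; concurrent multi-threading technology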
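
The abstract describes combining Apriori's level-wise iteration with concurrent processing of several streams. The record does not give the details of A-Apriori, so the following is only a minimal sketch of that general idea, not the authors' algorithm. It assumes each stream is available as a finite batch of transactions, and the names `count_candidates`, `next_candidates`, and `mine_streams` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations, repeat


def count_candidates(transactions, candidates):
    """Count how many transactions in one stream batch contain each candidate."""
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate itemset is a subset of the transaction
                counts[c] += 1
    return counts


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({i for s in frequent for i in s})
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    ]


def mine_streams(batches, min_support):
    """Level-wise mining over several stream batches, counted concurrently.

    Assumption: support is measured over all transactions from all batches.
    """
    batches = [[frozenset(t) for t in b] for b in batches]
    total = sum(len(b) for b in batches)
    candidates = list({frozenset([i]) for b in batches for t in b for i in t})
    result, k = {}, 1
    with ThreadPoolExecutor(max_workers=max(1, len(batches))) as pool:
        while candidates:
            merged = Counter()
            # One counting task per stream batch; partial counts are merged.
            for partial in pool.map(count_candidates, batches, repeat(candidates)):
                merged.update(partial)
            frequent = {c: n for c, n in merged.items() if n / total >= min_support}
            result.update(frequent)
            k += 1
            candidates = next_candidates(set(frequent), k)
    return result


if __name__ == "__main__":
    stream_a = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}]
    stream_b = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in mine_streams([stream_a, stream_b], 0.5).items():
        print(sorted(itemset), count)
```

Merging per-stream counts before applying the support threshold means each candidate is evaluated once per level rather than once per stream. In CPython, threads mainly help when the streams are I/O-bound, so a process pool may be a better fit for CPU-heavy counting.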
