
Efficient Algorithm for Mining Frequent Patterns over Offline Data Streams (cited by: 2)
Abstract: Mining frequent patterns over data streams is an active research topic in data mining. Data streams are continuous, unordered, unbounded, and real-time, which places higher demands on the time and space performance of mining algorithms. The oscillation (vibration) of pattern frequency in a data stream forces existing algorithms to maintain their synopsis data structures frequently, which degrades both time and space efficiency. This paper constructs SP-tree, a synopsis data structure with better space performance, and defines a vibration factor χ to quantify oscillation information. On this basis it proposes SPDS, an efficient algorithm for mining frequent patterns over offline data streams, which effectively reduces the impact of data oscillation on algorithm performance. When processing a newly arrived dataset, the algorithm applies a divide-and-conquer separate-mapping strategy that further improves time efficiency. It also improves the count accuracy of some patterns in query results.
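The oscillation of pattern frequency described in the abstract can be illustrated with a minimal sketch. It is not the SPDS algorithm, the SP-tree, or the paper's factor χ, none of which are defined on this page. The helper names `batch_supports` and `threshold_crossings`, the toy batches, and the 0.5 support threshold are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def batch_supports(batch, max_len=2):
    """Relative support of every itemset of up to max_len items in one batch."""
    counts = Counter()
    for transaction in batch:
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            counts.update(combinations(items, k))
    n = len(batch)
    return {pattern: c / n for pattern, c in counts.items()}


def threshold_crossings(supports_per_batch, pattern, min_support):
    """Count how often a pattern flips between frequent and infrequent.

    This is a hypothetical stand-in for an oscillation measure,
    not the vibration factor defined in the paper.
    """
    states = [s.get(pattern, 0.0) >= min_support for s in supports_per_batch]
    return sum(a != b for a, b in zip(states, states[1:]))


# Toy offline stream, split into three batches of transactions (assumed data).
batches = [
    [["a", "b"], ["a", "c"], ["a", "b"]],
    [["b", "c"], ["c"], ["a", "c"]],
    [["a", "b"], ["a"], ["a", "b", "c"]],
]
supports = [batch_supports(b) for b in batches]
crossings = threshold_crossings(supports, ("a",), min_support=0.5)
print(crossings)
```

In this toy stream the item `a` is frequent in the first batch, infrequent in the second, and frequent again in the third. An algorithm that inserts and prunes a pattern each time it crosses the threshold would therefore touch its synopsis structure repeatedly, which is the maintenance cost the abstract attributes to oscillation.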
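Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 7, pp. 247-251, 291 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (60675030)
Keywords: Data mining, Data stream, Frequent pattern (FP), Vibration factor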
