Study of a Discovery Method for Useful Data Items in Distributed Data Streams
Abstract: Distributed data stream applications must identify important data items within high-speed, massive input. To address this problem, the concept of useful data items (effective data) is defined and an algorithm for identifying them is proposed. Built on the data sketch technique, the algorithm outputs the useful data items within a user-specified error bound with probability close to 1, while using little memory. Experiments and algorithm analysis verify the effectiveness of the algorithm.
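The abstract does not give the algorithm itself or its exact definition of useful data items. As a rough illustration of the general sketch-based idea only, the following Python sketch uses a Count-Min sketch, a well-known stand-in and not necessarily the authors' structure. The class name `CountMinSketch`, the helper `frequent_items`, the parameters `epsilon`, `delta`, and `phi`, and the relative-frequency threshold are all assumptions made for this example. With ideal hash functions, each estimate never falls below the true count and exceeds it by more than `epsilon` times the stream length with probability at most `delta`.

```python
import hashlib
import math


class CountMinSketch:
    """Illustrative Count-Min sketch (an assumed stand-in, not the paper's algorithm)."""

    def __init__(self, epsilon, delta):
        # Width and depth follow the usual Count-Min sizing rules.
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        self.table = [[0] * self.width for _ in range(self.depth)]
        self.total = 0  # number of items seen so far

    def _index(self, item, row):
        # One salted hash per row; deterministic, so every node hashes identically.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        self.total += count
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # The minimum over rows never underestimates the true count.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )

    def merge(self, other):
        # Sketches built at different nodes combine by cell-wise addition.
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must share the same dimensions")
        for row in range(self.depth):
            for col in range(self.width):
                self.table[row][col] += other.table[row][col]
        self.total += other.total


def frequent_items(sketch, candidates, phi):
    """Return candidates whose estimated share of the stream is at least phi."""
    threshold = phi * sketch.total
    return [item for item in candidates if sketch.estimate(item) >= threshold]


if __name__ == "__main__":
    node_a = CountMinSketch(epsilon=0.01, delta=0.01)
    node_b = CountMinSketch(epsilon=0.01, delta=0.01)
    for _ in range(90):
        node_a.add("x")
    for _ in range(50):
        node_b.add("x")
        node_b.add("y")
    node_a.merge(node_b)
    print(frequent_items(node_a, ["x", "y"], phi=0.5))
```

Memory use depends only on `epsilon` and `delta`, not on the stream length, which matches the low-memory goal stated in the abstract. Merging by addition is one plausible way to combine per-node summaries in a distributed setting.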
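Source: Periodical of Ocean University of China (Natural Science Edition), 2006, No. 6, pp. 885-888, 1012 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the Major National Defense Basic Pre-Research Project (S0500A001).
Keywords: data stream; distributed data stream management system; frequent data items; useful data items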