期刊文献+

多数据流上的连续分布式Top-k监测

Continuous Distributed Top-k Monitoring over Data Streams
下载PDF
导出
摘要 近年来,分布式系统中的数据流监测是一个十分活跃的领域。研究了如何实现通用并且高效的分布式top-k监测,即在分布的多数据流中根据用户给定的排序函数连续监测最大的k个值。在实际应用中,用户给定的排序函数可能是任意的排序函数,然而,目前的分布式top-k监测技术只支持加法作为排序函数。提出了一种通用的支持任意的连续的严格单调的聚集函数的分布式top-k监测算法GMR。GMR的通讯代价和k无关。通过真实世界数据和模拟数据验证了GMR的效率。实验表明,GMR的网络通讯量比同类方法低一个数量级以上。 Monitoring data streams in a distributed system is the focus of much research in recent years. This paper addresses the generic and efficient processing of distributed top-k monitoring, which is continuously reporting the k largest values according to a user-speclfied ranking function over distributed multiple data streams. In practice, the user-specified ranking function would be arbitrary ranking function. Unfortunately, state-of-art distributed top-k monitoring approaches only support the sum function as the ranking function. In this paper, we present a general algorithm GMR for distributed top-k monitoring, which supports arbitrary continuous and strict monotone aggregation functions. The communication cost of GMR is independent of k. We verify the effectiveness of GMR empirically using both real-world and synthetic data sets. We show that GMR reduces overall communication cost by an order of magnitude compared with alternatives.
出处 《计算机科学》 CSCD 北大核心 2007年第2期125-128,共4页 Computer Science
基金 国家"九七三"重点基础研究发展规划基金项目(2005CB321804) 国家"八六三"高技术研究发展计划基金项目(2004A112020) 国家"八六三"高技术研究发展计划基金项目(2005AA112030)
关键词 GMR 分布式 TOP-K 数据流 监测 GMR, Distributed, Top-k, Data streams, Monitoring
  • 相关文献

参考文献22

  • 1Arlitt M, Jin T. 1998 world cup web site access logs. August 1998. Available at:http://www. acre. org/sigcomm/ITA/
  • 2Bruno N, Gravano L,Marian A. Evaluating top-k queries over web-aceessible databases. In: ICDE, 2002
  • 3Balke W-T, Nejdl W, Siberski W, et al. Progressive Distributed Top-k Retrieval in Peer-to-Peer Networks. In: ICDE, 2005
  • 4Babcock B, Olston C. Distributed Top-K Monitoring. In: SIG-MOD, 2003
  • 5Babcock B, Olston C. Distributed top-k monitoring: [Technical report]. Stanford University Computer Science Department, 2002.http://dbpubs.stanford.edu/pub/2002-61
  • 6Carney D,Cetinternel U,Chernlack M, et al. Monitoring streamsa new class of data management applications. In:VLDB, 2002
  • 7Chen J,DeWitt D J,Tian F, et al. NiagaraCQ: A scalable continuous query system for internet databases. In:SIGMOD, 2000
  • 8Chang K C-C, Hwang S-W. Minimal probing: supporting expensive predicates for top-k queries. In:SIGMQD, 2002
  • 9Carey M J,Kossmann D. On saying "Enough already!" in SQL, In SIGMOD, 1997
  • 10Cao P, Wang Z. Efficient top-k query calculation in distributed networks. In: PODC, 2004

二级参考文献18

  • 1Napster Home Page. http://www.napster.com/
  • 2Gnutella Home Page. http://www.gnutella.com/
  • 3Stoica I, Morris R, Karger D, Kaashoek MF, Balakrishnan H. Chord: A scalable peer-to-peer lookup service for internet applications. ACM SIGCOMM Computer Communication Review, 2001,31 (4): 149-160.
  • 4Baeza-Yates R, Ribeiro-Neto B. Modem Information Retrieval. Boston: Addison Wesley, 1999.27-30.
  • 5Palmer CR, Steffan JG. Generating network topologies that obey power law. In: Proc. of the GLOBECOM. San Francisco: IEEE,2000. 434-438. http://citeseer. ist.psu.edu/palmer00generating.html
  • 6Ripeanu M. Peer-to-Peer architecture case study: Gnutella network. Technical Report, TR-2001-26, University of Chicago, 2001.
  • 7Buckley C. Implementation of the SMART information retrieval system. Technical Report, TR35-686, Cornell University, 1985.
  • 8Yu C, Philip G, Meng WY. Distributed top-n query processing with possibly uncooperative local systems. In: Freytag JC,Lockemann PC, Abiteboul S, Carey MJ, Selinger PG, Heuer A, eds. Proc. of the 29th Int'l Conf. on Very Large Data Bases. Berlin:Morgan Kaufmann Publishers, 2003.117-128.
  • 9Gravano L, Garcia-Molina H, Tomasic A. The effectiveness of GLOSS for the text database discovery problem. In: Snodgrass RT,Winslett M, eds. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 1994. 126-137.
  • 10Callan JP, Lu ZH, Croft WB. Searching distributed collections with inference networks. In: Proc. of the 18th Annual Int'l ACM SIGIR Conf. on Research and Development in Information Retrieval. Seattle: ACM Press, 1995.21-28.

共引文献22

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部