期刊文献+

限制树宽图上的有界聚类

Bounded Clustering on Bounded Tree-width Graphs
下载PDF
导出
摘要 有界聚类问题源于IBM研究院开发的一个分布式流处理系统,即S系统。问题的输入是一个点赋权和边赋权的无向图,并指定若干个称为终端的顶点。称顶点集合的一个子集为一个子类。子类中所有顶点的权和加上该子类边界上所有边的权和称为该子类的费用。有界聚类问题是要得到所有顶点的一个聚类,要求每个子类的费用不超过给定预算B,每个子类至多包含一个终端,并使得所有子类的总费用最小。对于限制树宽图上的有界聚类问题,给出了拟多项式时间精确算法。利用取整的技巧对该算法进行修正,可在多项式时间之内得到(1+ε)-近似解,其中每个子类的费用不超过(1+ε)B,ε是任意小的正数。如果进一步要求每个子类恰好包含一个终端,则所给算法可在多项式时间之内得到(1+ε)-近似解,其中每个子类的费用不超过(2+ε)B。 The bounded clustering problem is motivated by system S,a distributed stream processing system developed at IBM Research.Input to the problem is an undirected graph with vertex and edge weights and a subset of vertices called terminals.A cluster is a subset of the vertices.The cost of a cluster is defined as the total vertex weight in the cluster plus the total edge weight at the boundary of the cluster.The goal of the problem is to partition the vertices into clusters of cost at most a given budget B such that each cluster contains at most one terminal and the total cost of the clusters is minimized.For the problem on graphs of bounded tree-width,a pseudo-polynomial time exact algorithm was presented,which can be modified via rounding to yield a(1+ε)-approximation in polynomial time violating the given budget by a 1+ε factor,where ε0 can be made arbitrarily small.For a variant of the problem on graphs of bounded treewidth where each cluster contains exactly one terminal,a polynomial time algorithm was presented that yields a(1+ε)-approximation violating the budget by a 2+ε factor.
作者 李曙光 周彤
出处 《计算机科学》 CSCD 北大核心 2011年第11期241-244,共4页 Computer Science
基金 国家自然科学基金(60970105) 国家自然科学基金青年基金(11161035) 山东省信息产业发展专项资金项目(2008X00039) 山东省软科学研究计划(2010RKGA1057 2011RKGB5040)资助
关键词 近似算法 流处理 有界聚类 限制树宽图 动态规划 Approximation algorithms Stream processing Bounded clustering Bounded tree-width graphs Dynamic programming
  • 相关文献

参考文献10

  • 1Babeock B, Babu S, Datar M, et al. Models and issues in data stream systems[C3//Proceedings of the 21st ACM SIGMODSIGACT-SIGART Symposium on Principles of Database Systems(PODS'02). New York.. ACM Press,2002: 1-16.
  • 2金澈清,钱卫宁,周傲英.流数据分析与管理综述[J].软件学报,2004,15(8):1172-1181. 被引量:161
  • 3Khandekar R, Hildrum K, Parekh S, et al. Bounded size graph clustering with applications to stream processing[C]//Procee dings of the Foundations of Software Technology and Theoreti cal Computer Science ( FSTTCS' 09 ). Kanpur, India, 2009 : 275 -286.
  • 4System S stream computing system[OL], http: //www-01, ibm, com/software/data/infosphere/streams.
  • 5Bodlaender H L, Koster A M C A. Combinatorial optimization on graphs of bounded treewidth[J]. The Computer Journal, 2008,51(3) :255-269.
  • 6Kleinberg J, Tardos E. Algorithm design[M]. Boston: Addison Wesley, 2005.
  • 7Li Shu-guang, Zhu Da-ming, Xin Xiao. Bounded size graph clus tering on trees[C]//Proceedings of 2010 International Forum on Information Technology and Applications (IFITA' 10). Kun ruing, China, 2010 :95-98.
  • 8Bodlaender H L. A linear time algorithm for finding tree-deeom positions of small treewidth[J].SIAM Journal on Computing, 1996,25(6):1305 -1317.
  • 9Kloks T. Treewidth: Computations and approximations [C]// Lecture Notes in Computer Science, vol. 842. Berlin: Springer Verlag, 1994.
  • 10ZHANG Liang,Byeong-Seob You,GE Jun-wei,LIU Zhao-hong,Hae-Young Bae.A graph-based sliding window multi-join over data stream[J].重庆邮电大学学报(自然科学版),2007,19(3):362-366. 被引量:1

二级参考文献62

  • 1[1]BABCOCK B,BABU S,DATAR M,et al.Models and Issues in Data Streams[C]//Proc.ACM SIGMOD-SIGACT-SIGART Symp.on Princ.of Database Sys.(PODS),[s.l.]:[s.n.]2002:1-6.
  • 2[2]VIGALAS S,NAUGHTON J,BURGER J,Maximizing the output rate of multi-way join queries over streaming information sources[C]//Proc.Int.Conf.on Very Large Data Bases (VLDB).San Fransisco:Morgan Kaufmann,2003:285-296.
  • 3[3]LEE Chiang,SHIH Chi-Sheng,CHEN Yaw-Huei.Optimizing large join queries using a graph-based approach[J].IEEE Transactions on Knowledge and Data Engineering,2001,13(2):298-315.
  • 4[4]WILSCHUT A N,P M G.Apers.Dataflow query execution in a parallel main-memory environment[J].Distributed and Parallel Databases,1993,1(1):103-128.
  • 5[5]WILSCHUT A N,APERS P M G.Pipelining in query execution[EB/OL].[2006-12-06].http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=77227.
  • 6[6]URHAN T,FRANKLIN M J.XJoin:A Reactively-scheduled Pipelined Join Operator[J].IEEE Data Engineering Bulletin,2000,23(2):27-33.
  • 7[7]GOLAB L,OZSU T.Processing sliding window multi-joins in continuous queries over data streams[EB/OL].[2006-12-28].http://se.uwaterloo.ca/~tozsu/courses/cs856/F05/Presentations/Week6/TaoWk6.pdf.
  • 8[8]RAMAN V,DESHPANDE A,HELLERSTEIN J M.Using state modules for adaptive query processing[EB/OL].[2006-12-06].http://db.cs.berkeley.edu/papers/icde03-stems.pdf.
  • 9[9]VIGLAS S,NANGHTON J F.Rate-based Query Optimization for Streaming Information Sources[C]//Proceedings of the ACM SIGMOD International Conference on Management of Data.Madison,Wisconsin:[s.n.],2002.
  • 10[10]KANG J,NAUGHTON J,VIGLAS S.Evaluating window joins over unbounded streams[EB/OL].(2006-12-19).http://www.cs.wisc.edu/niagara/papers/knv02-windowjoin.pdf.

共引文献160

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部