A Multi-policy Parallel Query Middleware for Off-line Analysis of Data Streams (cited by: 1)
Abstract: Data streams are becoming an increasingly important class of data-intensive application. Off-line analysis runs ad hoc statistical queries over the massive log data that data streams produce. A single query may process hundreds of gigabytes, so the requirements for timely response and scalability pose a great challenge to traditional databases. Taking network monitoring as the background, this paper analyzes the application characteristics of off-line analysis and proposes a shared-nothing parallel query middleware that uses multiple policies and a DBMS to merge local results. Finally, it analyzes the scalability of different types of queries by walking through their concrete execution process.
Source: Journal of Lanzhou Jiaotong University (《兰州交通大学学报》), CAS, 2008, No. 4, pp. 102-105 (4 pages)
Keywords: data stream off-line analysis; shared-nothing; parallel query middleware; scalability
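The abstract describes merging local results from shared-nothing nodes through a DBMS. The sketch below illustrates that general idea only; it is not the paper's implementation. The table `flow_log`, its columns, the helper names, and the use of in-memory SQLite databases as stand-in nodes are all assumptions made for illustration, and the nodes are visited sequentially rather than in parallel to keep the example short.

```python
import sqlite3

def make_node(rows):
    """Create one hypothetical shared-nothing node holding its own slice of the log."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE flow_log (src_ip TEXT, bytes INTEGER)")
    db.executemany("INSERT INTO flow_log VALUES (?, ?)", rows)
    return db

def merge_local_aggregates(nodes):
    """Run the same aggregate on every node, then merge the partial results
    in a coordinator database (the 'summarizing of local results' step)."""
    coord = sqlite3.connect(":memory:")
    coord.execute("CREATE TABLE partial (src_ip TEXT, bytes INTEGER, n INTEGER)")
    for db in nodes:  # a real middleware would dispatch these concurrently
        local = db.execute(
            "SELECT src_ip, SUM(bytes), COUNT(*) FROM flow_log GROUP BY src_ip"
        ).fetchall()
        coord.executemany("INSERT INTO partial VALUES (?, ?, ?)", local)
    # SUM and COUNT are distributive: global value = sum of the local values.
    return coord.execute(
        "SELECT src_ip, SUM(bytes), SUM(n) FROM partial "
        "GROUP BY src_ip ORDER BY src_ip"
    ).fetchall()

# Made-up log records, split round-robin across two nodes.
log = [("10.0.0.1", 100), ("10.0.0.2", 50), ("10.0.0.1", 25), ("10.0.0.3", 10)]
nodes = [make_node(log[i::2]) for i in range(2)]
result = merge_local_aggregates(nodes)
print(result)
```

Aggregates such as SUM and COUNT merge this way because each node only ships one row per group to the coordinator. Queries that are not decomposable like this (for example, exact medians) would need a different policy.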

References (6)

  • 1. BABCOCK B, BABU S, DATAR M, et al. Models and issues in data stream systems[C]//Proc. of the 2002 ACM Symp. on Principles of Database Systems, 2002: 1-16.
  • 2. MATTOSO M. A Middleware for Executing OLAP Queries in Parallel[J]. Technical Report ES-690, 2005(5): 56-60.
  • 3. YUAN Zhijian, YANG Shuqiang, JIA Yan. Design and implementation of a CORBA-based parallel massive-query middleware[C]//Proceedings of the 20th National Database Conference of China. Changsha: National University of Defense Technology, 2003. (in Chinese)
  • 4. AHN I. Database issues in telecommunications network management[J]. ACM SIGMOD Record, 1994, 23(2): 37-43.
  • 5. DEWITT D J, GRAY J. Parallel database systems: the future of high performance database processing[J]. Communications of the ACM, 1992, 36(6): 417-434.
  • 6. WANG Yong, JIAO Limei. Parallel query processing for network data management[J]. Computer Engineering and Applications, 2007, 43(30): 5-10. (in Chinese) Cited by: 2
