Algorithms for Storing and Aggregating Historical Streaming Data
(Original title: 数据流历史数据的存储与聚集查询处理算法)

Cited by: 17
Abstract: Current research on data streams focuses mainly on analyzing the most recent stream data held in memory, and neglects the analysis, processing, and storage management of historical stream data. This paper proposes a method for storing historical stream data and processing aggregate queries over it. Historical data are stored using multi-layer recursive sampling, and an in-memory index, the HDS-Tree, holds aggregate values of the historical stream data. Together these make it possible to manage the historical data of an unbounded stream and to support a variety of aggregate queries effectively. The time complexity and the query error of the HDS-Tree-based aggregate query algorithms are also analyzed. Theoretical analysis and experimental results show that the method can be used effectively to store and analyze historical stream data.
Source: Journal of Software (《软件学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2005, No. 12, pp. 2089-2098 (10 pages)
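Funding: National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program); National Key Basic Research Development Program of China (973 Program); Natural Science Foundation of Heilongjiang Province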
Keywords: data streams; historical data; aggregation algorithm; HDS-Tree
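
The abstract names multi-layer recursive sampling but gives no algorithmic detail, so the following is only a rough illustration of the general idea, not the paper's method. It assumes a fixed per-layer capacity, deterministic "keep every k-th item" thinning when data are pushed down a layer, and a weighted estimate for count and sum queries. The class `LayeredSampleStore` and its parameters `num_layers`, `capacity`, and `keep_every` are hypothetical names introduced here, and the HDS-Tree index itself is not reproduced.

```python
from typing import List, Tuple

Item = Tuple[int, float]  # (timestamp, value)


class LayeredSampleStore:
    """Illustrative layered store: older data survive at coarser sampling rates.

    Assumption (not from the paper): each layer holds at most `capacity`
    items; on overflow the oldest half is evicted and every `keep_every`-th
    evicted item is pushed into the next (older, coarser) layer.
    """

    def __init__(self, num_layers: int = 3, capacity: int = 8, keep_every: int = 2) -> None:
        self.capacity = capacity
        self.keep_every = keep_every
        self.layers: List[List[Item]] = [[] for _ in range(num_layers)]

    def append(self, ts: int, value: float) -> None:
        """Insert a new stream element at full resolution (layer 0)."""
        self._push(0, (ts, value))

    def _push(self, level: int, item: Item) -> None:
        layer = self.layers[level]
        layer.append(item)
        if len(layer) <= self.capacity:
            return
        half = self.capacity // 2
        evicted = layer[:half]
        del layer[:half]
        # Items evicted from the last layer are simply dropped.
        if level + 1 < len(self.layers):
            for old in evicted[::self.keep_every]:
                self._push(level + 1, old)

    def estimate(self, t_start: int, t_end: int) -> Tuple[float, float]:
        """Estimate (count, sum) over [t_start, t_end], inclusive.

        An item in layer i stands in for keep_every**i original items.
        """
        count = 0.0
        total = 0.0
        for level, layer in enumerate(self.layers):
            weight = self.keep_every ** level
            for ts, value in layer:
                if t_start <= ts <= t_end:
                    count += weight
                    total += weight * value
        return count, total


# Small usage example: ten elements of value 1.0 at timestamps 0..9.
store = LayeredSampleStore(num_layers=2, capacity=4, keep_every=2)
for ts in range(10):
    store.append(ts, 1.0)
print(store.estimate(0, 9))
```

In this toy setting the newest elements stay at full resolution while older ones are thinned, so estimates over old ranges are approximate in general. Data evicted from the last layer are lost entirely, which is one reason a real system would also keep precomputed aggregate values, as the abstract says the HDS-Tree does.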