期刊文献+

在Chebyshev多项式概要上近似属性分布 被引量:1

Selectivity Estimation on Synopses Based on Chebyshev Polynomials
下载PDF
导出
摘要 在属性分布估计中,给定关系属性的类型为N个,使用B个(BN)数值近似其频度分布.基于直方图和小波的概要数据结构得到深入的研究,然而Chebyshev多项式也适合于近似算法.首先构造基于Chebyshev多项式的概要,再在其上估计原始属性分布,和以前的方法相比,算法的优势在于:L1、L2、L∞等误差度量下更高的精度;构造概要的时间复杂度只有O(NB);更易于动态维护.Chebyshev概要的有效性在模拟数据序列和实际数据序列上得到验证. Suppose we are given N values of an attribute in a relation. B (B〈〈N) numerical values are used to approximate frequency distributions of attribute. To date, two major high-accuracy synopses construction techniques have been proposed in the database literature, namely histograms and Haar wavelets. Still, these approximation techniques have not been previously compared to Chebyshev polynomials. In this paper, we conduct such a comparison the accuracy achieved with synopses based on Chebyshev polynomials to that of other techniques, when estimating frequency distributions. We demonstrate that Chebyshev synopses have three main strong points in comparision to other methods : they aehieve higher accuracy as measured by the L1,L2 and L∞ error metrics; they can be built in only O(NB) time complexity; and Chebyshev synopses can be dynamically maintained. Experimental results on synthetic and real-life data clearly demonstrate the effectiveness of our method.
作者 何海江
出处 《小型微型计算机系统》 CSCD 北大核心 2009年第1期31-36,共6页 Journal of Chinese Computer Systems
关键词 CHEBYSHEV多项式 近似算法 小波 属性分布估计 Chebyshev概要 Chebyshev polynomials approximate algorithm wavelet selectivity estimation Chehyshev synopsis
  • 相关文献

参考文献2

二级参考文献62

  • 1Babcock B, Babu S, Datar M, Motwani R, Widom J. Models and issues in data streams. In: Popa L, ed. Proc. of the 21st ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems. Madison: ACM Press, 2002. 1~16.
  • 2Terry D, Goldberg D, Nichols D, Oki B. Continuous queries over append-only databases. SIGMOD Record, 1992,21(2):321-330.
  • 3Avnur R, Hellerstein J. Eddies: Continuously adaptive query processing. In: Chen W, Naughton JF, Bernstein PA, eds. Proc. of the 2000 ACM SIGMOD Int'l Conf. on Management of Data. Dallas: ACM Press, 2000. 261~272.
  • 4Hellerstein J, Franklin M, Chandrasekaran S, Deshpande A, Hildrum K, Madden S, Raman V, Shah MA. Adaptive query processing: Technology in evolution. IEEE Data Engineering Bulletin, 2000,23(2):7-18.
  • 5Carney D, Cetinternel U, Cherniack M, Convey C, Lee S, Seidman G, Stonebraker M, Tatbul N, Zdonik S. Monitoring streams?A new class of DBMS applications. Technical Report, CS-02-01, Providence: Department of Computer Science, Brown University, 2002.
  • 6Guha S, Mishra N, Motwani R, O'Callaghan L. Clustering data streams. In: Blum A, ed. The 41st Annual Symp. on Foundations of Computer Science, FOCS 2000. Redondo Beach: IEEE Computer Society, 2000. 359-366.
  • 7Domingos P, Hulten G. Mining high-speed data streams. In: Ramakrishnan R, Stolfo S, Pregibon D, eds. Proc. of the 6th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. Boston: ACM Press, 2000. 71-80.
  • 8Domingos P, Hulten G, Spencer L. Mining time-changing data streams. In: Provost F, Srikant R, eds. Proc. of the 7th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. San Francisco: ACM Press, 2001. 97~106.
  • 9Zhou A, Cai Z, Wei L, Qian W. M-Kernel merging: Towards density estimation over data streams. In: Cha SK, Yoshikawa M, eds. The 8th Int'l Conf. on Database Systems for Advanced Applications (DASFAA 2003). Kyoto: IEEE Computer Society, 2003. 285~292.
  • 10Gibbons PB, Matias Y. Synopsis data structures for massive data sets. In: Tarjan RE, Warnow T, eds. Proc. of the 10th Annual ACM-SIAM Symp. on Discrete Algorithms. Baltimore: ACM/SIAM, 1999. 909-910.

共引文献160

同被引文献1

引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部