期刊文献+

数据流上具有数据遗忘特性的小波概要 被引量:3

Wavelet-Based Amnesic Synopses for Data Streams
下载PDF
导出
摘要 动态地维护数据流的概要结构是数据流查询和挖掘等处理工作的基础.在许多数据流应用场合,数据的影响随时间衰减,流中数据被逐步遗忘,称为数据流的遗忘特性.在数据流概要的构造中,应体现这种特性.离散小波变换是一种应用得较多的数据流概要构造方法.将数据流的遗忘特性引入小波概要的构造中,提出了一种能反映数据流遗忘特性的小波概要结构:基于小波的分层遗忘概要,分别讨论了误差平方和及最大绝对误差两种误差度量标准下这种概要的构造方法.所进行的实验验证了该概要的有效性. Maintaining a synopsis data structure dynamically from data stream is vital for a variety of streaming data applications, such as approximate query or data mining. In many cases, the significance of data item in streams decays with age: this item perhaps conveys critical information first, but, as time goes by, it gets less and less important until it eventually becomes useless. This feature is termed amnesic. Discrete wavelet transform is often used in construction of synopses for streaming data. Proposed in this paper is a wavelet-based hierarchical amnesic synopsis (W-HAS), which includes the amnesic feature of data stream in the generation of wavelet synopses. W-HAS can provide a better approximate representation for data streams with amnesic feature than conventional wavelet synopses. To maintain W-HAS online for evolving data streams, the authors first explore the merger process of two wavelet decompositions, and then implement the addition of data nodes in WHAS structure based on the merger process. Using the addition of data nodes, W-HAS grows dynamically and hierarchically. The construction methods of W-HAS under sum of squared error (sse) and maximum absolute error metrics are discussed. Further, W-HAS with error control is also explore. Finally, experiments on real and synthetic datasets validated the proposed methods.
出处 《计算机研究与发展》 EI CSCD 北大核心 2009年第2期268-279,共12页 Journal of Computer Research and Development
基金 国家自然科学基金项目(60773072) 浙江省自然科学基金项目(Y104144) 浙江省教育厅科研项目(20051737)~~
关键词 概要结构 遗忘特性 离散小波变换 数据流 近似表示 synopses amnesic feature discrete wavelet transform data stream approximation representation
  • 相关文献

参考文献20

  • 1Guha S, Koudas N, Shim K. Data streams and histograms [C] //Proc of the 33rd Annual ACM Symp on Theory of Computing. New York: ACM, 2001:471-475
  • 2Gilbert A, Guha S, Indyk P, et al. Fast, small-space algorithms for approximate histogram maintenance [C]//Proc of the 34th Annual ACM Symp on Theory of Computing. New York: ACM, 2002:389-398
  • 3Gibbons P B, Matias Y. New sampling-based summary statistics for improving approximate query answers [C]// Proc of the ACM SIGMOD Int Conf on Management of Data. New York: ACM, 1998:331-342
  • 4Garofalakis M, Gibbons P B. Wavelet synopses with error guarantees [C] //Proc of the 2002 ACM SIGMOD Int Conf on Management of Data. New York: ACM, 2002:476-487
  • 5Gilbert A, Kotidis Y, Muthukrishnan S, et al. One-pass wavelet decompositions of data streams [J]. IEEE Trans on Knowledge and Data Engineering, 2003, 15(3): 541-554
  • 6陈安龙,唐常杰,元昌安,朱明放,段磊.基于小波和偶合特征的多数据流压缩算法[J].软件学报,2007,18(2):177-184. 被引量:6
  • 7刘兵,汪卫,施伯乐.基于小波变换的序列间距离严格估算[J].计算机研究与发展,2006,43(10):1732-1737. 被引量:3
  • 8Cormode G, Muthukrishnan S. An improved data stream summary: The count-min sketch and its applications [J]. Journal of Algorithms, 2005, 55(1): 58-75
  • 9Guha S, Kim C, Shim K. XWAVE: Approximate extended wavelets for streaming data [C]//Proc of the 30th Int Conf on Very Large Data Bases. San Francisco: Morgan Kaufmann, 2004:288-299
  • 10Garofalakis M, K umar A. Deterministic wavelet thresholding for maximum-error metrics [C] //Proe of the 23rd ACM SIGMOD-SIGACT SIGART Symp on Principles of Database Systems. New York: ACM, 2004:166-176

二级参考文献19

  • 1赵慧,侯建荣,施伯乐.随机非平稳时间序列数据的相似性研究(英文)[J].软件学报,2004,15(5):633-640. 被引量:4
  • 2C Faloutsos,M Ranganathan,Y Manolopoulos.Fast subsequence matching in time-series databases[C].ACM SIGMOD 1994,Minneapolis,Minnesota,1994
  • 3R Agrawal,C Faloutsos,A Swami.Efficient similarity search in sequence databases[C].The 4th FODO,Chicago,Illinois,1993
  • 4C Siney Burrus,R A Gopinath,H Guo.Introduction to Wavelets and Wavelet Transforms,A Primer[M].Englewood Cliffs,NJ:Prentice Hall,1997
  • 5Eric J Stollnitz,Tony D Derose,David H Salesin.Wavelets for Computer Graphics[M].San Francisco,CA:Morgan Kaufmann,1996
  • 6Kinpong Chan,Ada Waichee Fu.Efficient time series matching by wavelets[C].ICDE 1999,Sydney,Australia,1999
  • 7Ivan Popivanov,Renee J Miller.Similarity search over time series data using wavelets[C].ICDE 2002,San Jose,CA,2002
  • 8Kaushik Chakrabarti,Minos Garofalakis,Rajeev Rastogi,et al.Approximate query processing using wavelets[J].The VLDB Journal,2001,10(3):199-223
  • 9Dula S,Kim C,Shim K.XWAVE:Optimal and approximate extended wavelets for streaming data.In:Nascimento MA,Kossmann D,eds.Proc.of the 30th Int'l Conf.on Very Large Data Bases.Toronto:Morgan Kaufmann Publishers,2004.288-299.
  • 10Gilbert AC,Kotidis Y,Muthukrishnan S,Strauss MJ.One-Pass wavelet decompositions of data streams.IEEE Trans.on Knowledge and Data Engineering,2003,15(3):541-554.

共引文献6

同被引文献21

  • 1陈隽,徐幼麟.经验模分解在信号趋势项提取中的应用[J].振动.测试与诊断,2005,25(2):101-104. 被引量:57
  • 2蒋鹏,黄清波,尚群立,王智,孙优贤.基于小波网络的数据压缩方法研究[J].仪器仪表学报,2005,26(12):1244-1247. 被引量:5
  • 3刘兵,汪卫,施伯乐.基于小波变换的序列间距离严格估算[J].计算机研究与发展,2006,43(10):1732-1737. 被引量:3
  • 4孙玉芬,卢炎生.流数据挖掘综述[J].计算机科学,2007,34(1):1-5. 被引量:36
  • 5陈安龙,唐常杰,元昌安,朱明放,段磊.基于小波和偶合特征的多数据流压缩算法[J].软件学报,2007,18(2):177-184. 被引量:6
  • 6Gibbons P B, Matias Y.New sampling-based summary statistics for improving approximate query answers[C]//Proc of the ACM SIGMOD Int'l Conf on Management of Data Seattle. [S.l.] :ACM Press, 1998 : 331-342.
  • 7Marcus D, Torsten S.Performance and limitations of the Hilbert-Huang Transformation(HHT) with an application to irregular water waves[J].Ocean Engineering, 2004,31 ( 14/15 ) : 1783 - 1834.
  • 8Peng Z K,Tse Peter W,Chu F L.An improved Hilbert-Huang transform and its application in vibration signal analysis[J]. Journal of Sound and Vibration,2005,286(1/2):187-205.
  • 9Cormode G,Muthukrishnan S.An improved data stream sum-mary:The count-min sketch and its applications[J].Journalof Algorithms,2005,55(1):58-75.
  • 10Guha S,Harb B.Wavelet synopsis for data streams:Minimi-zing non-Euclidean error[C].Proc of the 7th ACM SIGKDDInt Conf on Knowledge Discovery in Data Mining.New York:ACM,2005:88-97.

引证文献3

二级引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部