Funding: Supported by the National Natural Science Foundation of China (60403027)
Abstract: This paper presents two one-pass algorithms for dynamically computing frequency counts in a sliding window over a data stream, that is, for finding the elements whose frequency exceeds a user-specified threshold ε. The first algorithm divides the sliding window into sub-windows, each of which maintains a summary data structure, and periodically deletes expired sub-windows. It outputs at most 1/ε + 1 elements for frequency queries over the most recent N elements. The second algorithm adapts a multi-level method to the data stream. Once the sketch of the most recent N elements has been constructed, the second algorithm can answer frequency queries over the most recent n (n ≤ N) elements. It outputs at most 1/ε + 2 elements. Analytical and experimental results show that the algorithms are accurate and effective.
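The sub-window idea in the first algorithm can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the summary data structure, so this sketch assumes an exact `Counter` per sub-window, and the class name `SubWindowCounter` and its parameters `window` and `sub_size` are hypothetical. Because whole sub-windows are expired at once, the sketch covers between `window - sub_size + 1` and `window` of the most recent items, and it makes no claim about the 1/ε + 1 output bound.

```python
from collections import Counter, deque


class SubWindowCounter:
    """Frequency counts over roughly the last `window` items.

    Illustrative assumption: each sub-window keeps an exact Counter as its
    summary, and a whole sub-window is dropped once it has expired.
    """

    def __init__(self, window, sub_size):
        if sub_size <= 0 or window % sub_size != 0:
            raise ValueError("window must be a positive multiple of sub_size")
        self.sub_size = sub_size
        self.max_subs = window // sub_size
        self.subs = deque()  # entries are [Counter, item_count], oldest first

    def add(self, item):
        # Open a new sub-window when there is none or the newest one is full.
        if not self.subs or self.subs[-1][1] == self.sub_size:
            self.subs.append([Counter(), 0])
            # Delete the expired (oldest) sub-window, if there is one.
            if len(self.subs) > self.max_subs:
                self.subs.popleft()
        newest = self.subs[-1]
        newest[0][item] += 1
        newest[1] += 1

    def frequent(self, eps):
        """Return items whose count exceeds eps * (items currently covered)."""
        total = Counter()
        covered = 0
        for counts, n in self.subs:
            total.update(counts)
            covered += n
        return {x: c for x, c in total.items() if c > eps * covered}


sw = SubWindowCounter(window=8, sub_size=4)
for x in "aaaabbbbc":
    sw.add(x)
print(sw.frequent(0.5))  # the four 'a' items have expired with their sub-window
```

A smaller `sub_size` tracks the window boundary more closely at the cost of more sub-window summaries to merge per query.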
Abstract: The join operation is a critical problem when dealing with sliding windows over data streams. Many optimization strategies for sliding window joins have been proposed in the literature, but the join sequence of multiple sliding windows is usually selected with a simple heuristic, which is ineffective. A graph-based approach is proposed to address this problem. First, a sliding window join model is introduced. In this model, vertices represent join operators and edges indicate the join relationships among sliding windows. Vertex weights and edge weights represent the cost of a join and the reciprocity of join operators, respectively. A good query plan with minimal cost can then be found in the model. A complete join algorithm is presented that combines building the model, finding the optimal query plan, and executing the query plan. Experiments show that the graph-based approach is feasible and performs better in this setting.
Funding: Supported by the National Natural Science Foundation of China (60573089) and the National 985 Project Foundation (985-2-DB-Y01)
Abstract: How to process aggregate queries over data streams efficiently and effectively has become a hot research topic in both the academic and industrial communities. To address this issue, this paper proposes a novel Linked-tree algorithm based on sliding windows. By introducing the concept of an area, the Linked-tree algorithm reuses many of the primary results from the previous window and thus avoids many unnecessary repeated comparisons between two successive windows. As a result, the execution efficiency of MAX queries improves dramatically. In addition, because memory usage depends on the number of areas and not on the size of the sliding window, memory is saved considerably. Extensive experimental results show that the Linked-tree algorithm performs significantly better than the traditional SC (Simple Compared) algorithm and the Ranked-tree algorithm.
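The abstract does not describe the Linked-tree structure or its areas in enough detail to reproduce them. As a stand-in, the following Python sketch uses a different, standard technique (a monotonic deque) to show the same general idea of reusing work from the previous window when answering a MAX query. The function name `sliding_max` and its parameters are hypothetical, and the window is assumed to be count-based.

```python
from collections import deque


def sliding_max(stream, window):
    """Yield the maximum of the last `window` items after each arrival."""
    if window <= 0:
        raise ValueError("window must be positive")
    candidates = deque()  # (index, value) pairs with strictly decreasing values
    for i, value in enumerate(stream):
        # Older items that are <= the new value can never be the maximum again,
        # so they are discarded instead of being compared in later windows.
        while candidates and candidates[-1][1] <= value:
            candidates.pop()
        candidates.append((i, value))
        # Drop the front candidate once it has slid out of the window.
        if candidates[0][0] <= i - window:
            candidates.popleft()
        yield candidates[0][1]


print(list(sliding_max([3, 1, 4, 1, 5, 9, 2, 6], window=3)))
```

Each item is appended and removed at most once, so the amortized cost per arrival is constant. The deque holds only the current candidates for the maximum, although in the worst case (a strictly decreasing stream) that is still as many items as the window holds.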
Funding: Supported by the National Natural Science Foundation of China (60473073)
Abstract: Processing a join over unbounded input streams requires unbounded memory, since every tuple in one infinite stream must be compared with every tuple in the other. In practice, most join queries over unbounded input streams are restricted to finite memory by sliding window constraints. Non-indexed and indexed stream equijoin algorithms based on sliding windows have been proposed in the literature, but none of them takes non-equijoins into consideration. In many cases, non-equijoin queries occur frequently, so it is worth discussing how to process them effectively and efficiently. In this paper, we propose an indexed join algorithm that supports non-equijoin queries. Experimental results show that our indexed non-equijoin techniques are more efficient than those without an index.
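The abstract does not say which index the authors use. The following Python sketch shows one possible way an index can help an inequality join over sliding windows, under several assumptions: count-based windows, numeric keys, the predicate r < s, and a sorted list probed with the standard `bisect` module. The names `IndexedWindow` and `join_less_than` are hypothetical.

```python
import bisect
from collections import deque


class IndexedWindow:
    """Count-based sliding window with a sorted list as its index."""

    def __init__(self, size):
        self.size = size
        self.fifo = deque()  # arrival order, used for expiry
        self.index = []      # sorted (key, seq) pairs, used for probing
        self.seq = 0         # sequence number makes every entry unique

    def insert(self, key):
        entry = (key, self.seq)
        self.seq += 1
        self.fifo.append(entry)
        bisect.insort(self.index, entry)
        if len(self.fifo) > self.size:
            old = self.fifo.popleft()  # expire the oldest tuple
            del self.index[bisect.bisect_left(self.index, old)]

    def keys_greater_than(self, key):
        # Position just past every entry whose key equals `key`.
        start = bisect.bisect_right(self.index, (key, float("inf")))
        return [k for k, _ in self.index[start:]]

    def keys_less_than(self, key):
        # Position just before every entry whose key equals `key`.
        end = bisect.bisect_left(self.index, (key, -1))
        return [k for k, _ in self.index[:end]]


def join_less_than(events, size):
    """Join streams R and S on r < s; `events` is a list of (stream, key)."""
    r_win, s_win = IndexedWindow(size), IndexedWindow(size)
    out = []
    for stream, key in events:
        if stream == "R":
            r_win.insert(key)
            out.extend((key, s) for s in s_win.keys_greater_than(key))
        else:
            s_win.insert(key)
            out.extend((r, key) for r in r_win.keys_less_than(key))
    return out


events = [("R", 5), ("S", 7), ("S", 3), ("R", 1), ("R", 6)]
print(join_less_than(events, size=2))
```

With the sorted index, a probe locates the boundary of the matching range by binary search and then reads only the matching tuples, whereas a window without an index would have to compare the new tuple against every tuple in the opposite window.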
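Abstract: To develop simple, easy-to-use data acquisition and processing software for mobile surveying devices, the strengths and weaknesses of commonly used data acquisition and processing software are analyzed, and the GeoSolution software is proposed as a solution to the weaknesses identified. Using the application development environment of the Windows Mobile 6.0 operating system as the platform, the development architecture of GeoSolution is built, its development process is described, and solutions are given for the technical difficulties encountered during development.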