Funding: The High Technology Research Plan of Jiangsu Province (No. BG2004034); the Foundation of Graduate Creative Program of Jiangsu Province (No. xm04-36).
Abstract: A novel data stream partitioning method is proposed for range-aggregation continuous queries over parallel streams in the power industry. The first step samples the data in parallel, implemented as an extended reservoir-sampling algorithm. A skip factor based on the change ratio of data values is introduced to describe the distribution characteristics of the data values adaptively. The second step partitions the flux of the data streams evenly, implemented with two alternative equal-depth histogram generating algorithms that fit different cases: one for incremental maintenance based on heuristics, and the other for periodic updates that generate an approximate partition vector. Experimental results on real data show that the method is efficient, practical, and suitable for processing time-varying data streams.
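The two steps named in the abstract can be illustrated with a minimal sketch. It uses the classic reservoir-sampling algorithm (Algorithm R), not the paper's extended variant, because the abstract does not define the skip factor. The partition vector is built once from a sorted sample, which roughly corresponds to the periodic-update case only. The function names `reservoir_sample`, `equal_depth_boundaries`, and `assign_partition` are hypothetical and chosen for illustration.

```python
import bisect
import random


def reservoir_sample(stream, k, rng=None):
    """Classic Algorithm R: uniform sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, value in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(value)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = value
    return reservoir


def equal_depth_boundaries(sample, parts):
    """Return parts - 1 boundaries so each bucket holds about the same number of sample values."""
    if not sample:
        raise ValueError("sample must not be empty")
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // parts] for i in range(1, parts)]


def assign_partition(value, boundaries):
    """Map a data value to a partition index in 0 .. len(boundaries)."""
    return bisect.bisect_right(boundaries, value)
```

With boundaries computed from a sample, each incoming tuple can be routed to a partition by `assign_partition`, so that partitions receive roughly equal shares of the stream as long as the sample still reflects the current value distribution.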
Funding: The National Basic Research Program of China (973 Program) (No. 2003CB314803).
Abstract: To improve the performance of bulk data transfer over ultra-long TCP (transmission control protocol) connections in high-energy physics experiments, a series of experiments was conducted to explore ways to enhance transmission efficiency. This paper introduces the overall structure of RC@SEU (regional center @ Southeast University) in the AMS (alpha magnetic spectrometer)-02 ground data transfer system, as well as the experiments conducted over CERNET (China Education and Research Network)/CERNET2 and the global academic Internet. The effects of the number of parallel streams and of the TCP buffer size are tested. The tests confirm that, under current CERNET conditions, finding the right number of parallel TCP connections is the main way to improve throughput. TCP buffer size tuning has little effect at present, but may have a good effect when the available bandwidth becomes higher.
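As background for the two parameters tested, the textbook relation between TCP buffer size, round-trip time, and throughput can be sketched as follows. This is a simplified window-limit model, not the paper's measurement method: it ignores packet loss and congestion, and all function names and numbers are hypothetical.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the path full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_throughput_bps(buffer_bytes, rtt_s, streams=1):
    """Upper bound when each stream sends at most one buffer per round trip."""
    return streams * buffer_bytes * 8 / rtt_s


def estimated_throughput_bps(bandwidth_bps, rtt_s, buffer_bytes, streams):
    """Lesser of the available bandwidth and the aggregate window limit."""
    return min(bandwidth_bps,
               window_limited_throughput_bps(buffer_bytes, rtt_s, streams))


if __name__ == "__main__":
    # Hypothetical path: 100 Mbit/s available, 200 ms round-trip time, 64 KiB buffers.
    for n in (1, 8, 64):
        rate = estimated_throughput_bps(100e6, 0.2, 64 * 1024, n)
        print(f"{n} stream(s): {rate / 1e6:.1f} Mbit/s")
```

In this model, adding parallel streams and enlarging the buffer both raise the aggregate window, up to the available bandwidth. Real paths with loss behave differently, which is why measurements such as those described in the abstract are needed.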