Funding: The High Technology Research Plan of Jiangsu Province (No. BG2004034); the Foundation of Graduate Creative Program of Jiangsu Province (No. xm04-36).
Abstract: A novel data stream partitioning method is proposed to address range-aggregation continuous queries over parallel streams in the power industry. The first step samples the data in parallel, using an extended reservoir-sampling algorithm. A skip factor based on the change ratio of data values is introduced to describe the distribution characteristics of the data values adaptively. The second step partitions the flux of the data streams evenly, using two alternative equal-depth histogram generation algorithms that suit different cases: one performs incremental maintenance based on heuristics, and the other performs periodic updates to generate an approximate partition vector. Experimental results on real data show that the method is efficient, practical, and suitable for processing time-varying data streams.
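The two steps named in this abstract can be illustrated with a minimal sketch. It assumes the classic reservoir-sampling algorithm (Algorithm R) and a simple sort-based equal-depth split; the paper's extended sampler, its skip factor, and its incremental and periodic histogram algorithms are not reproduced here. The function names `reservoir_sample` and `equal_depth_boundaries` are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Classic reservoir sampling (Algorithm R): keep k items chosen
    uniformly at random from a stream of unknown length.
    This is the textbook version, not the paper's extended variant."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def equal_depth_boundaries(sample, parts):
    """Return parts - 1 split points so that each bucket holds roughly
    the same number of sampled values (an approximate partition vector)."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        return []
    return [ordered[(i * n) // parts] for i in range(1, parts)]
```

A partition vector computed from the sample can then route incoming values to parallel workers so that each worker receives a similar share of the stream.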
Abstract: The smart grid has attracted great attention in recent years. It is poised to transform a centralized, producer-controlled network into a decentralized, consumer-interactive network supported by fine-grained monitoring. Large-scale wireless sensor networks (WSNs) are considered one of the most promising technologies for implementing the smart grid. WSNs are applied in almost every aspect of the smart grid, including power generation, transmission, distribution, utilization, and dispatch. Data query processing for WSNs in the power grid has become a hot topic because the volume of power grid data is very large and the response-time requirements are strict. To meet these demands, top-k query processing is a good choice: it performs a cooperative query by aggregating each database object's degree of match for each query predicate and returning the k best-matching objects. This paper proposes a framework, based on a cluster-topology sensor network, that effectively applies top-k queries to wireless sensor networks in the smart grid. In the new method, local indices are used to optimize the necessary query routing and to process intermediate results inside the cluster, which cuts down data traffic, and the hierarchical join query is executed on the local results. In addition, top-k query results are verified by a clean-up process, and two schemes are adopted to handle node dynamicity, which further reduces communication cost. Case studies and experimental results show that the algorithm outperforms the existing one, with higher-quality results and better efficiency.
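The core top-k operation described in this abstract (aggregate each object's per-predicate match degrees, return the best k) can be sketched as follows. This is a centralized illustration only: the sum aggregation, the function name `top_k`, and the sample object ids are assumptions, and the paper's cluster-based routing, local indices, and clean-up process are not modeled.

```python
import heapq


def top_k(match_degrees, k):
    """Return the ids of the k objects with the highest aggregate score.

    match_degrees maps an object id to a list of per-predicate match
    degrees. Summation is an assumed aggregation function; any
    monotone aggregate could be substituted."""
    return heapq.nlargest(k, match_degrees, key=lambda oid: sum(match_degrees[oid]))


# Hypothetical readings: object id -> [degree for predicate 1, predicate 2]
readings = {"a": [0.9, 0.1], "b": [0.8, 0.7], "c": [0.2, 0.2]}
best_two = top_k(readings, 2)
```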
Abstract: The idea of a positional inverted index is exploited for indexing graph databases. The main idea is to use hash tables to prune a considerable portion of the graph database that cannot contain the answer set. These tables are implemented using column-based techniques and store the graphs of the database, frequent subgraphs, and the neighborhoods of nodes. For exact checking of the remaining graphs, vertex invariants are used for the isomorphism test, which can be implemented in parallel. The evaluation results indicate that the proposed method outperforms existing methods.
Funding: the National Basic Research Program of China under Grant 2013CB338004; the Doctoral Program of Higher Education of China under Grant No. 20120073120034; the National Natural Science Foundation of China under Grants No. 61070204 and 61101108; and the National S&T Major Program under Grant No. 2011ZX03002-005-01.
Abstract: Multidimensional data provides enormous opportunities in a variety of applications. Recent research has shown that existing sanitization techniques (e.g., k-anonymity) fail to provide rigorous privacy guarantees. Privacy-preserving multidimensional data publishing currently lacks a solid theoretical foundation, so new techniques with provable privacy guarantees are urgently needed. ε-Differential privacy is the only method that can provide such guarantees. In this paper, we propose a multidimensional data publishing scheme that ensures ε-differential privacy while providing accurate results for query processing. The proposed solution applies nonstandard wavelet transforms to the raw multidimensional data and adds noise to guarantee ε-differential privacy. The scheme then processes arbitrary queries directly on the noisy wavelet-coefficient synopses of relational tables, expanding the noisy wavelet coefficients back into noisy relational tuples only for the final result of the query. Experimental results demonstrate the high accuracy and effectiveness of our approach.
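As background for the noise-addition step, here is a minimal sketch of the standard Laplace mechanism applied directly to a list of counts. It is not the paper's wavelet-domain scheme: the function names, the default sensitivity of 1 (one individual changes one count by at most 1), and the direct perturbation of counts are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale), using the fact that the
    difference of two i.i.d. exponential variables is Laplace distributed."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Perturb each count with Laplace noise of scale sensitivity / epsilon,
    the standard calibration for epsilon-differential privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy.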
Funding: supported by the National Science Foundation of China (Grant Nos. 61272067, 61370229); the National Key Technology R&D Program of China (Grant Nos. 2012BAH27F05, 2013BAH72B01); the National High Technology R&D Program of China (Grant No. 2013AA01A212); and the S&T Projects of Guangdong Province (Grant Nos. 2016B010109008, 2014B010117007, 2015A030401087, 2015B010109003, 2015B010110002).
Abstract: XML data can be represented as a tree or graph, and query processing for XML data requires structural information about the relationships among nodes. Designing an efficient labeling scheme for the nodes of order-sensitive XML trees is an important way to manage XML data well. Previous labeling schemes, such as region and prefix schemes, often sacrifice update performance and suffer from growing label space when new nodes are inserted. To overcome these limitations, this paper proposes a new labeling idea that separates structure from order. Based on this idea, a novel Prime-based Middle Fraction Labeling Scheme (PMFLS) is designed, together with a series of algorithms that obtain the structural relationships among nodes and support updates. PMFLS combines the advantages of prefix and region schemes, expressing structural information and sequential information separately. PMFLS also supports order-sensitive updates without relabeling or recalculation, and its label space is stable. Experiments and analysis on several benchmarks show that PMFLS handles updates efficiently and significantly improves query processing performance with good scalability.
Abstract: Image registration is the overlaying of two images of the same scene taken at different times or by different sensors. It is one of the essential steps in information processing in remote sensing. Achieving highly accurate, reliable, and computationally cheap image registration requires a suitable similarity metric and a reduction in both search data and search space. In this paper, the author shows that if the right bin size is chosen, mutual information can be more robust than correlation in the registration of multi-temporal images. The author also compares the sensitivity of mutual information and correlation to Gaussian noise and multiplicative speckle noise. Automatic subimage selection is investigated as a strategy for reducing search data. The author proposes a measure, called alignability, which indicates the ability of a subimage to provide reliable registration. Alternative subimage selection methods based on gradient, entropy, and variance are also investigated. Finally, the author examines a search-space strategy that uses a gradient approach to maximize mutual information and presents first results.
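To make the role of bin size concrete, the sketch below computes mutual information between two equal-length intensity sequences from their joint histogram. It is a generic textbook estimate, not the author's implementation; the function name `mutual_information`, the fixed-width binning, and the flattened-list input format are assumptions.

```python
import math
from collections import Counter


def mutual_information(a, b, bin_size):
    """Mutual information (in bits) between two equal-length sequences of
    pixel intensities, estimated from a joint histogram whose bins are
    bin_size intensity levels wide."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Quantize each intensity into its bin index.
    qa = [int(x // bin_size) for x in a]
    qb = [int(y // bin_size) for y in b]
    joint = Counter(zip(qa, qb))
    count_a = Counter(qa)
    count_b = Counter(qb)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_a[i] * count_b[j]))
    return mi
```

With very wide bins every pixel falls into the same bin and the estimate collapses to zero, while very narrow bins leave the joint histogram sparse, which is one way to see why the choice of bin size matters.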
Abstract: The significance of detecting urban active faults and the general state of such detection worldwide are briefly introduced. Following a brief description of the basic principles of anti-disturbance, high-resolution shallow seismic exploration, the paper focuses on seismic source excitation, the performance of digital seismographs, receiving mode and conditions, geometry, and data acquisition, processing, and interpretation in the anti-disturbance, high-resolution shallow seismic exploration of urban active faults. The study indicates that data acquisition must use a controlled seismic source with a linear or nonlinear frequency-conversion scanning function together with the relevant seismographs, and should use working methods with small group intervals, small offsets, multi-channel receiving, short arrays, and high-frequency detectors. In data processing and interpretation, attention should be paid to refraction static correction, noise suppression, high-precision velocity analysis, wavelet compression, wavelet zero-phasing, and pre-stack migration. Finally, several cases of anti-disturbance, high-resolution shallow seismic exploration of urban active faults are presented.
Funding: Project (50275150) supported by the National Natural Science Foundation of China; Project (20040533035) supported by the National Research Foundation for the Doctoral Program of Higher Education of China.
Abstract: A data stream management system (DSMS) provides convenient solutions to the problem of processing continuous queries on data streams. Previous approaches to scheduling these queries and their operators assume either that each operator runs in a separate thread or that all operators are combined into one query plan and run in a single thread. Both approaches suffer from severe drawbacks: thread overhead in the first case and stalls caused by expensive operators in the second. To overcome these drawbacks, a novel approach called clustered operators scheduling (COS) is proposed. COS adaptively clusters the operators of the query plan into a number of groups based on their selectivity and computing cost, using S-mean clustering. An experimental evaluation demonstrates the potential benefits of COS over other scheduling strategies. COS can provide an adaptive, flexible, reliable, scalable, and robust design for a continuous query processor.
Funding: supported by the National Natural Science Foundation of China under Grants No. 61172049 and No. 61003251; the National High Technology Research and Development Program of China (863 Program) under Grant No. 2011AA040101; and the Doctoral Fund of the Ministry of Education of China under Grant No. 20100006110015.
Abstract: The clustering of trajectories over huge volumes of streaming data has been recognized as critical for many modern applications. In this work, we propose a method for continuous clustering of the trajectories of moving objects over high-speed data streams, which updates online trajectory clusters on the basis of incremental line-segment clustering. The proposed clustering algorithm obtains trajectory clusters efficiently and stores all closed trajectory clusters in a bi-tree index with efficient search capability. Next, we present two query processing methods that use three proposed pruning strategies to quickly handle two continuous spatio-temporal queries: threshold-based trajectory clustering queries and threshold-based trajectory outlier detection. Finally, comprehensive experimental studies demonstrate that our algorithm achieves excellent effectiveness and high efficiency for continuous clustering on both synthetic and real streaming data, and that the proposed query processing methods take on average 90% less time than the naive query methods.
Abstract: Electromagnetic self-induction theory and computer techniques are used to study an online monitoring technique for wire-core belts. The study shows that the output voltage V is directly proportional to the distance Ⅰ between broken ends, and that V remains constant when Ⅰ ≥ 60 mm. The running speed v of the wire-core belt has little effect on the output voltage V. The output voltage V is inversely proportional to the height h from the probe to the surface of the belt, and V tends to zero when h ≥ 30 mm. Based on the test results, an online monitoring installation was developed. Practice shows that the accuracy of broken-wire monitoring can exceed 95%, and the monitoring accuracy for joint twitch can reach 0.04 V/mm.
Abstract: Town tourism is booming in China, and the influx of tourists has had a great impact on towns' physical and human environments. Taking Guandu town in Kunming as an example, this paper studies the two communities on both sides of the core area through questionnaire surveys and on-site interviews, and uses a Likert scale for data processing and analysis. It analyzes residents' participation in tourism in Guandu town, seeks the key factors affecting the harmonious development of tourism there, identifies the negative factors, and proposes feasible measures to improve community involvement in Guandu.
Abstract: Manufacturers of chemicals are responsible for providing a set of tools, including labels and safety data sheets, that give adequate information about dangerous properties. Labels and safety data sheets are the main instruments for giving the general public and workers immediate advice about dangerous substances and preparations. Correct labelling enables the general public to recognise the risks arising from the handling and use of dangerous chemicals, whereas safety data sheets are provided for professionals to allow safe handling and storage of dangerous chemicals in workplaces. The information in safety data sheets is also intended to suggest safety measures for the protection of workers, as well as precautionary measures and appropriate actions to be taken in case of an accident. This project critically reviewed the information contained in a set of safety data sheets for active ingredients of plant protection products in order to assess the quality and consistency of the data. The reported data were then compared with published data. Considerable deficiencies, mistakes, and inconsistencies were found in the data reported in the safety data sheets of the examined substances. This shows an urgent need to improve enforcement through systematic checks in this field, and to train the people who compile safety data sheets on the producer side.