Grid computing is the combination of computer resources in a loosely coupled, heterogeneous, and geographically dispersed environment. Grid data are the data used in grid computing, which consists of large-scale data-intensive applications that produce and consume huge amounts of data distributed across a large number of machines. A data grid application is composed of sets of independent tasks, each of which requires massive distributed data sets that may each be replicated on different resources. To reduce the completion time of the application and improve the performance of the grid, appropriate computing resources should be selected to execute the tasks and appropriate storage resources selected to serve the files required by the tasks. The problem can therefore be broken into two sub-problems: selection of storage resources and assignment of tasks to computing resources. This paper proposes a scheduler that is broken into three parts that can run in parallel and that uses both parallel tabu search and a parallel genetic algorithm. Finally, the proposed algorithm is evaluated by comparing it with other related algorithms that target minimizing makespan. Simulation results show that the proposed approach can be a good choice for scheduling large data grid applications.
In order to reduce makespan and storage consumption in data grids, a node selection model for replica creation is proposed. The model is based on the degree distribution of complex networks. We define two pools of candidate replica nodes: a degree-based candidate pool, which is defined in consideration of the access frequency, and a frequency-based candidate pool. The data replica is copied to the node with the minimum local cost in the two pools. Further, this paper presents and proves a replica creation theorem. A dynamic multi-replicas creation algorithm (DMRC) is also provided. Simulation results show that the proposed method may simultaneously reduce makespan and storage space consumption.
Ocean temperature is an important physical variable in marine ecosystems, and ocean temperature prediction is an important research objective in ocean-related fields. Currently, one of the commonly used approaches to ocean temperature prediction is data-driven, but research on this approach is mostly limited to the sea surface, with few studies on the prediction of internal ocean temperature. Existing graph neural network-based methods usually use predefined graphs or learned static graphs, which cannot capture the dynamic associations among data. In this study, we propose a novel dynamic spatiotemporal graph neural network (DSTGN) to predict three-dimensional ocean temperature (3D-OT), which combines static graph learning and dynamic graph learning to automatically mine two unknown dependencies between sequences based on the original 3D-OT data without prior knowledge. Temporal and spatial dependencies in the time series were then captured using temporal and graph convolutions. We also integrated dynamic graph learning, static graph learning, graph convolution, and temporal convolution into an end-to-end framework for 3D-OT prediction using time-series grid data. We conducted prediction experiments using high-resolution 3D-OT from the Copernicus global ocean physical reanalysis, with data covering the vertical variation of temperature from the sea surface to 1000 m below the sea surface. We compared the proposed method with five mainstream models that are commonly used for ocean temperature prediction, and the results showed that the proposed method achieved the best prediction results at all prediction scales.
Recent studies have demonstrated the importance of LUCC change in climate and ecosystem simulation, but the results can only be determined precisely if a high-resolution underlying land cover map is used. While satellite-based efforts have provided a good baseline for present land cover, the next advancement in LUCC research requires the reconstruction of historical LUCC change, especially spatially explicit historical datasets. Unlike other similar studies, this study is based on the analysis of historical land use patterns in the traditional cultivated region of China. Leaving aside the less important factors, altitude, slope, and population patterns are selected as the major drivers of reclamation in ancient China and used to design the HCGM (Historical Cropland Gridding Model, at a 60 km×60 km resolution), an empirical model for allocating historical cropland inventory data spatially to grid cells in each political unit. We then use this model to reconstruct the cropland distribution of the study area in 1820 and verify the result against prefectural cropland data for 1820 taken from historical documents. The statistical analysis shows that the model can efficiently simulate the patterns of cropland distribution in the historical period in the traditional cultivated region.
This paper describes the architecture of a global distributed storage system for data grids. It focuses on management and on the capability to support the maximum number of users and resources on the Internet, as well as on performance and other issues.
This paper introduces a novel architecture for a metadata management system based on an intelligent cache, called the Metadata Intelligent Cache Controller (MICC). By using an intelligent cache to control the metadata system, MICC can deal with different scenarios, such as splitting and merging queries into sub-queries for metadata sets that are available locally, in order to reduce the access time of remote queries. An application can obtain part of its results from the local cache, and the remaining portion of the metadata can be fetched from remote locations. By using the existing metadata, MICC can not only enhance the fault tolerance and load balancing of the system effectively, but also improve the efficiency of access while ensuring the access quality.
This paper proposes a novel multilevel data cache model by Web cache (MDWC) based on network cost in a data grid. By constructing a communicating tree of grid sites based on network cost and using a single leader for each data segment within each region, the MDWC makes the most use of the Web caches of other sites whose bandwidth is broad enough to cover the job-executing site. The experimental results indicate that the MDWC reduces data response time and data update cost by avoiding network congestion when its parameters are set according to the application environment.
Dynamic data replication is a technique used in data grid environments that helps to reduce access latency and network bandwidth utilization. Replication also increases data availability, thereby enhancing system reliability. In this paper we discuss the issues with single-location strategies in large-scale data integration applications and examine potential multiple-location schemes. Dynamic multiple-location replication is NP-complete in nature. We therefore transform the multiple-location problem into several classical mathematical problems with different parameter settings, to which efficient approximation algorithms apply. Experimental results indicate that, unlike single-location strategies, our multiple-location schemes are efficient with respect to access latency and bandwidth consumption, especially when the requesters of a data set are distributed over a large number of locations.
A smart grid is the evolved form of the power grid with the integration of sensing, communication, computing, monitoring, and control technologies. These technologies make the power grid reliable, efficient, and economical. However, this smartness boosts the volume of data in the smart grid. To obtain the full benefits, big data offers attractive techniques for processing and analyzing smart grid data. This paper presents and simulates a framework to ensure the use of big data computing techniques in the smart grid. The proposed framework comprises the following four layers: (i) data source layer, (ii) data transmission layer, (iii) data storage and computing layer, and (iv) data analysis layer. As a proof of concept, the framework is simulated using a dataset from three cities in the Pakistan region and two cloud-based data centers. The results are analyzed by taking into account the following parameters: (i) heavy load on a data center, (ii) the impact of peak hours, (iii) high network delay, and (iv) low network delay. The presented framework may help the power grid to achieve reliability, sustainability, and cost-efficiency for both users and service providers.
Neutral beam injection is one of the effective auxiliary heating methods in magnetic-confinement fusion experiments. In order to acquire the suppressor-grid current signal and avoid the grid being damaged by overheating, a data acquisition and over-current protection system based on the PXI (PCI eXtensions for Instrumentation) platform has been developed. The system consists of a current sensor, a data acquisition module, and an over-current protection module. In the data acquisition module, the acquired data of one shot are transferred in isolation and saved on a data-storage server in a txt file. The data can also be recalled using NBWave for future analysis. The over-current protection module has two modes: remote and local. This allows the threshold voltage to be set either remotely or locally, and the forbidden time of over-current protection can also be set by a host PC in remote mode. Experimental results demonstrate that the data acquisition and over-current protection system has the advantages of a settable forbidden time and isolated transmission.
Climate research relies heavily on good-quality instrumental data; for modeling efforts, gridded data are needed. So far, relatively little effort has been made to create gridded climate data for China. This is especially true for high-resolution daily data. This work focuses on identifying an accurate method to produce gridded daily precipitation in China based on the observed data at 753 stations for the period 1951-2005. Five interpolation methods, namely ordinary nearest neighbor, local polynomial, radial basis function, inverse distance weighting, and ordinary kriging, have been used and compared. Cross-validation shows that ordinary kriging based on seasonal semi-variograms gives the best performance, closely followed by inverse distance weighting with a power of 2. Finally, ordinary kriging is chosen to interpolate the station data to an 18 km×18 km grid system covering the whole country. Precipitation for each 0.5°×0.5° latitude-longitude block is then obtained by averaging the values at the grid nodes within the block. Owing to the higher station density in the eastern part of the country, the interpolation errors there are much smaller than those in the west (west of 100°E). Excluding 145 stations in the western region, the daily, monthly, and annual relative mean absolute errors of the interpolation for the remaining 608 stations are 74%, 29%, and 16%, respectively. The interpolated daily precipitation has been made available on the internet for the scientific community.
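The inverse distance weighting with a power of 2 named in the abstract above, together with the block-averaging step, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the planar (x, y) coordinates, and the toy station values are assumptions made only for illustration.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Estimate a value at `target` (x, y) from (x, y, value) station tuples.

    Hypothetical helper: distances are plain Euclidean distances in whatever
    planar coordinates the caller supplies.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # the target coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def block_mean(node_values):
    """Average the interpolated values at the grid nodes inside one block."""
    return sum(node_values) / len(node_values)


# Toy data: two stations, one grid node halfway between them.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_value = idw_interpolate(stations, (1.0, 0.0))
```

With equal distances to both stations the weights are equal, so the midpoint estimate is the plain mean of the two station values.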
The annual Rossby wave is a key component of the ENSO phenomenon in the equatorial Pacific Ocean. Due to the paucity of and seasonal bias in historical hydrographic data, previous studies on equatorial Rossby waves gave only qualitative descriptions. The accumulation of Argo measurements in recent years has greatly alleviated the data problem. In this study, the seasonal variation of the equatorial Pacific Ocean is examined with annual harmonic analysis of Argo gridded data. Results show that a strong seasonal signal is present in the western equatorial Pacific and explains more than 50% of the thermal variance below 500 m. Lag-correlation tracing further shows that this sub-thermocline seasonal signal originates from the eastern equatorial Pacific via downward and southwestward propagation of annual Rossby waves. Possible mechanisms for the equatorward shift of the Rossby wave path are also discussed.
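As a rough illustration of the annual harmonic analysis mentioned in the abstract above, the sketch below fits a single annual sinusoid to an evenly sampled series and reports the fraction of variance it explains. It is a generic textbook formulation, not the authors' code; the function name and the assumption that the series covers a whole number of years without gaps are illustrative.

```python
import math


def annual_harmonic(series, samples_per_year=12):
    """Return (mean, amplitude, explained_variance_fraction) of the annual cycle.

    Assumes evenly spaced samples covering a whole number of years, so the
    least-squares fit reduces to projecting onto cos and sin.
    """
    n = len(series)
    if samples_per_year < 3 or n == 0 or n % samples_per_year != 0:
        raise ValueError("need whole years of evenly spaced samples")
    mean = sum(series) / n
    omega = 2.0 * math.pi / samples_per_year
    a = 2.0 / n * sum((v - mean) * math.cos(omega * t) for t, v in enumerate(series))
    b = 2.0 / n * sum((v - mean) * math.sin(omega * t) for t, v in enumerate(series))
    amplitude = math.hypot(a, b)
    total_var = sum((v - mean) ** 2 for v in series) / n
    # A sinusoid of amplitude A has variance A**2 / 2.
    explained = (amplitude ** 2 / 2.0) / total_var if total_var > 0 else 0.0
    return mean, amplitude, explained


# Synthetic two-year monthly series consisting of a pure annual cycle.
series = [5.0 + 2.0 * math.cos(2.0 * math.pi * t / 12) for t in range(24)]
mean, amplitude, explained = annual_harmonic(series)
```

Applied per grid point and depth level, the explained-variance fraction is the kind of quantity the "more than 50% of the thermal variance" statement refers to.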
The PACS concept was introduced in 1982 [1]; after more than twenty years of technical advancements, it has become an integrated component of today's healthcare delivery system. PACS is now beginning to be used as a clinical research tool [2]. Among others, this paper describes four PACS-based research activities: medical imaging informatics, the medical imaging Data Grid, combining PACS and teleradiology operations, and computer-assisted detection and diagnosis (CAD). In medical imaging informatics (MII), we first introduce its infrastructure and the five layers of its software architecture. A description follows of a new MII training program supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB), National Institutes of Health (NIH), USA. The training program accepts candidates with a medical and/or biomedical engineering background. The goal is to cross-train multi-disciplinary individuals to be future leaders in the field of medical imaging informatics. Grid computing is a new paradigm combining computing, networking, information, and storage technologies to advance conventional distributed computing to the next level. One resource in Grid computing is the Data Grid. We describe the use of the Data Grid concept in medical imaging applications based on the five layers of the open-source Globus Toolkit 4 (GT4). Three examples are given. First, a Data Grid specifically designed for PACS image backup and disaster recovery, developed at the Imaging and Informatics Laboratory (IPI), USC, is illustrated. The second application is for image-based clinical trials using three international sites: IPI, USC, USA; the PACS Lab, Hong Kong Polytechnic University, Hong Kong; and the Heart Institute, Sao Paulo, Brazil. In combining PACS and teleradiology operations, a Data Grid model is proposed to combine two disjoint, and yet daily used, PACS and teleradiology operations as one integrated system at a large-scale enterprise level. Methods of combining the workflows, storage, and reading of PACS and teleradiology images are detailed. The last, work-in-progress research is the integration of CAD results with the daily PACS workflow. The integration methods are based on the DICOM Screen Captured and Structured Report Standards and several IHE (Integrating the Healthcare Enterprise) Workflow Profiles.
In recent years, with the rapid development of data-intensive applications, data replication has become an enabling technology for the data grid to improve data availability and reduce file transfer time and bandwidth consumption. The placement of replicas has been shown to be the most difficult problem that must be solved to realize the process of data replication. This paper addresses the quality of service (QoS) aware replica placement problem in the data grid and proposes a dynamic programming based replica placement algorithm that not only guarantees the QoS requirement, but also minimizes the overall replication cost, including storage cost and communication cost. Simulation experiments show that the replica placement algorithm outperforms an existing popular replica placement technique in the data grid.
A Data Grid integrates geographically distributed resources for solving data-intensive scientific applications. Effective scheduling in a Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Scheduling is a traditional problem in parallel and distributed systems. However, due to the special issues and goals of the Grid, traditional approaches are no longer effective in this environment. Therefore, it is necessary to propose methods specialized for this kind of parallel and distributed system. Another solution is to use a data replication strategy to create multiple copies of files and store them in convenient locations to shorten file access times. To utilize these two concepts, in this paper we develop a job scheduling policy, called the hierarchical job scheduling strategy (HJSS), and a dynamic data replication strategy, called the advanced dynamic hierarchical replication strategy (ADHRS), to improve data access efficiency in a hierarchical Data Grid. HJSS uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers network characteristics, the number of jobs waiting in the queue, file locations, and the disk read speed of the storage drive at the data sources. Moreover, due to limited storage capacity, a good replica replacement algorithm is needed. We present a novel replacement strategy that deletes files in two steps when there is not enough free space for the new replica. First, it deletes the files with the minimum transfer time. Second, if space is still insufficient, it considers the last time the replica was requested, the number of accesses, the size of the replica, and the file transfer time. The simulation results show that our proposed algorithm performs better than other algorithms in terms of job execution time, number of intercommunications, number of replications, hit ratio, computing resource usage, and storage usage.
The present work investigates the possible impact of non-uniformity in observed land surface temperature on trend estimation, based on the Climatic Research Unit (CRU) Temperature Version 4 (CRUTEM4) monthly temperature datasets from 1900 to 2012. The CRU land temperature data exhibit remarkable non-uniformity in their spatial and temporal features. The data are characterized by an uneven spatial distribution of missing records and station density, and display a significant increase in available sites around 1950. Considering the impact of missing data, the trends seem to be more stable and reliable when estimated from data with < 40% missing records than from data with more than 40% missing records. The mean absolute error (MAE) between the data with < 40% missing records and the global data is only 0.011°C (0.014°C) for 1900-50 (1951-2012). The associated trend estimated from the reliable data is 0.087°C decade^-1 (0.186°C decade^-1) for 1900-50 (1951-2012), almost the same as the trend of the global data. However, due to the non-uniform spatial distribution of missing data, the global signal seems to come mainly from the regions with good data coverage, especially for the period 1900-50. This is also confirmed by an extreme test conducted with the records in the United States and Africa. In addition, the influences of spatially and temporally non-uniform features in the observation data on trend estimation are significant for areas with poor data coverage, such as Africa, but insignificant for countries with good data coverage, such as the United States.
To protect the privacy of power data, we usually encrypt data before outsourcing it to cloud servers. However, it is challenging to search over the encrypted data. In addition, we need to ensure that only authorized users can retrieve the power data. Attribute-based searchable encryption is an advanced technology for solving these problems. However, many existing schemes do not support a large universe, expressive access policies, or hidden access policies. In this paper, we propose an attribute-based keyword search encryption scheme for power data protection. Firstly, our proposed scheme supports encrypted data retrieval and achieves fine-grained access control. Only authorized users whose attributes satisfy the access policies can search and decrypt the encrypted data. Secondly, to satisfy the requirements of the power grid environment, the proposed scheme supports a large attribute universe and hidden access policies. The access policy in this scheme does not leak private information about users. Thirdly, the security analysis and performance analysis indicate that our scheme is efficient and practical. Furthermore, comparisons with other schemes demonstrate the advantages of our proposed scheme.
Element-partition-based methods for the visualization of 3D unstructured grid data are presented. First, partition schemes for common elements, including curvilinear tetrahedra, pentahedra, hexahedra, etc., are given, so that complex elements can be divided into several rectilinear tetrahedra and the visualization processes can be simplified. Then, a slice method for cloud maps and an iso-surface method based on the partition schemes are described.
Gridded model assessments require at least one climatic and one soil database for carrying out the simulations. There are several parallel soil and climate database development projects that provide sufficient, albeit considerably different, observation-based input data for crop-model-based impact studies. The input-database-related uncertainty of the Biome-BGCMuSo agro-environmental model outputs was investigated using three different gridded climatic databases and four different gridded soil databases, covering an area of nearly 100,000 km² with 1104 grid cells. Spatial, temporal, climate-database-selection and soil-database-selection related variances were calculated and compared for four model outputs obtained from 30-year-long simulations. The choice of the input database introduced model output variability that was comparable to the variability caused by the year-to-year change of the weather or by the spatial heterogeneity of the soil. Input database selection could be a decisive factor in carbon sequestration related studies, as the soil carbon stock change estimates may suggest either that the simulated ecosystem is a carbon sink or, to the contrary, that it is a carbon source in the long run. Careful evaluation of the input database quality seems to be an inevitable and highly relevant step towards more realistic plant production and carbon balance simulations.
Based on daily precipitation from a 0.5°×0.5° gridded dataset and from meteorological stations during 1961-2011, released by the National Meteorological Information Center, the reliability of this gridded precipitation dataset in South China was evaluated. Five precipitation indices recommended by the World Meteorological Organization (WMO) were selected to investigate the changes in precipitation extremes in South China. The results indicated that the bias between the gridded data interpolated to given stations and the corresponding observed data is limited, and the proportion of stations with a bias between -10% and 0 is 50.64%. The correlation coefficients between gridded data and observed data are generally above 0.80 in most parts. The averages of the precipitation indices show a significant spatial difference, with a drier northwest section and a wetter southeast section. The trend magnitudes of the maximum 5-day precipitation (RX5day), very wet day precipitation (R95), very heavy precipitation days (R20mm), and simple daily intensity index (SDII) are 0.17 mm·a^-1, 1.14 mm·a^-1, 0.02 d·a^-1, and 0.01 mm·d^-1·a^-1, respectively, while consecutive wet days (CWD) decrease by 0.05 d·a^-1 during 1961-2011. There is spatial disparity in the trend magnitudes of the precipitation indices, and approximately 60.85%, 75.32%, and 75.74% of the grid boxes show increasing trends for RX5day, SDII, and R95, respectively. There are high correlations between the precipitation indices and total precipitation, which are statistically significant at the 0.01 level.
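Four of the indices named in the abstract above can be computed from one grid box's daily series with the short sketch below. It follows the commonly used ETCCDI-style definitions, in which a wet day has at least 1 mm of precipitation; that threshold, the function names, and the toy data are assumptions rather than details taken from the paper. R95 is omitted because it needs a base-period percentile that the abstract does not specify.

```python
def rx5day(daily):
    """Maximum precipitation total over any 5 consecutive days (mm)."""
    if len(daily) < 5:
        raise ValueError("need at least 5 days")
    return max(sum(daily[i:i + 5]) for i in range(len(daily) - 4))


def sdii(daily, wet_threshold=1.0):
    """Simple daily intensity index: mean precipitation on wet days (mm/day)."""
    wet = [p for p in daily if p >= wet_threshold]
    return sum(wet) / len(wet) if wet else 0.0


def r20mm(daily):
    """Number of very heavy precipitation days (>= 20 mm)."""
    return sum(1 for p in daily if p >= 20.0)


def cwd(daily, wet_threshold=1.0):
    """Longest run of consecutive wet days."""
    longest = run = 0
    for p in daily:
        run = run + 1 if p >= wet_threshold else 0
        longest = max(longest, run)
    return longest


# Toy 10-day series in mm; a real analysis would use one year per grid box.
daily = [0.0, 2.0, 25.0, 30.0, 0.5, 0.0, 10.0, 1.0, 1.0, 0.0]
```

Computing each index year by year and then fitting a linear trend to the annual values gives trend magnitudes in the units quoted in the abstract (for example mm·a^-1 for RX5day).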
Funding: Supported by the National Natural Science Foundation of China (60973139, 60773041), the Key Technologies R&D Program of China (2007BAK34B06), and the Talent Foundation of Nanjing University of Posts and Telecommunications (NY208006).
Funding: The National Key R&D Program of China under contract No. 2021YFC3101603.
Funding: National Natural Science Foundation of China, No. 40471007; Innovation Knowledge Project of CAS, No. KZCX2-YW-315.
文摘Recent studies have demonstrated the importance of LUCC change with climate and ecosystem simulation, but the result could only be determined precisely if a high-resolution underlying land cover map is used. While the efforts based satellites have provided a good baseline for present land cover, what the next advancement in the research about LUCC change required is the development of reconstruction of historical LUCC change especially spatially-explicit historical dataset. Being different from other similar studies, this study is based on the analysis of historical land use patterns in the traditional cultivated region of China. Taking no account of the less important factors, altitude, slope and population patterns are selected as the major drivers of reclamation in ancient China, and used to design the HCGM (Historical Cropland Gridding Model, at a 60 km×60 km resolution), which is an empirical model for allocating the historical cropland inventory data spatially to grid cells in each political unit. Then we use this model to reconstruct cropland distribution of the study area in 1820, and verify the result by prefectural cropland data of 1820, which is from the historical documents. The statistical analyzing result shows that the model can simulate the patterns of the cropland distribution in the historical period in the traditional cultivated region efficiently.
Abstract: This paper describes the architecture of a global distributed storage system for data grids. It focuses on management and on the capability to support the maximum number of users and resources on the Internet, as well as on performance and other issues.
Funding: Supported by the National High-Technology Research and Development Program of China (2002AA1Z2308, 2002AA118030) and the Natural Science Foundation of Liaoning Province (20022027)
Abstract: This paper introduces a novel architecture for a metadata management system based on an intelligent cache, called the Metadata Intelligent Cache Controller (MICC). By using an intelligent cache to control the metadata system, MICC can handle scenarios such as splitting and merging queries into sub-queries over the metadata sets available locally, in order to reduce the access time of remote queries. An application can obtain partial results from the local cache and fetch the remaining portion of the metadata from remote locations. By using the existing metadata, MICC not only effectively enhances the fault tolerance and load balancing of the system, but also improves access efficiency while ensuring access quality.
Funding: Supported by SEC E-Institute: Shanghai High Institutions Grid Project
Abstract: This paper proposes a novel multilevel data cache model by Web cache (MDWC) based on network cost in data grids. By constructing a communication tree of grid sites based on network cost and using a single leader for each data segment within each region, the MDWC makes the most of the Web caches of other sites whose bandwidth is broad enough to cover the job-executing site. The experimental results indicate that the MDWC reduces data response time and data update cost by avoiding network congestion, provided that its parameters are set according to the application environment.
Funding: National Natural Science Foundation of China (70671011); National High-Technology Research and Development Program of China (863 Program) (2007AA04Z1B1); Social Science Youth Foundation of Chongqing University (CDSK2007-37)
Abstract: Dynamic data replication is a technique used in data grid environments that helps to reduce access latency and network bandwidth utilization. Replication also increases data availability, thereby enhancing system reliability. In this paper we discuss the issues with single-location strategies in large-scale data integration applications, and examine potential multiple-location schemes. Dynamic multiple-location replication is NP-complete in nature. We therefore transform the multiple-location problem into several classical mathematical problems with different parameter settings, to which efficient approximation algorithms apply. Experimental results indicate that, unlike single-location strategies, our multiple-location schemes are efficient with respect to access latency and bandwidth consumption, especially when the requesters of a data set are distributed over a large number of locations.
Funding: This work was supported by the National Natural Science Foundation of China (61871058).
Abstract: A smart grid is the evolved form of the power grid, integrating sensing, communication, computing, monitoring, and control technologies. These technologies make the power grid reliable, efficient, and economical. However, this smartness increases the volume of data in the smart grid. To obtain the full benefits, big data offers attractive techniques for processing and analyzing smart grid data. This paper presents and simulates a framework to ensure the use of big data computing techniques in the smart grid. The proposed framework comprises the following four layers: (i) data source layer, (ii) data transmission layer, (iii) data storage and computing layer, and (iv) data analysis layer. As a proof of concept, the framework is simulated using a dataset of three cities in the Pakistan region and considering two cloud-based data centers. The results are analyzed with respect to the following parameters: (i) heavily loaded data center, (ii) the impact of peak hours, (iii) high network delay, and (iv) low network delay. The presented framework may help the power grid to achieve reliability, sustainability, and cost-efficiency for both users and service providers.
Funding: Supported by the National Natural Science Foundation of China (No. 11575240) and the Key Program of Research and Development of Hefei Science Center, CAS (grant 2016HSC-KPRD002)
Abstract: Neutral beam injection is one of the effective auxiliary heating methods in magnetic-confinement fusion experiments. In order to acquire the suppressor-grid current signal and prevent the grid from being damaged by overheating, a data acquisition and over-current protection system based on the PXI (PCI eXtensions for Instrumentation) platform has been developed. The system consists of a current sensor, a data acquisition module and an over-current protection module. In the data acquisition module, the data acquired in one shot are transferred in isolation and saved on a data-storage server as a txt file. They can also be recalled using NBWave for future analysis. The over-current protection module has two modes, remote and local, so the threshold voltage can be set either remotely or locally; the forbidden time of over-current protection can also be set by a host PC in remote mode. Experimental results demonstrate that the data acquisition and over-current protection system has the advantages of a settable forbidden time and isolated transmission.
Funding: Supported by the Swedish Foundation for International Cooperation in Research and Higher Education through a grant to D. L. Chen. C.-H. Ho is supported by CATER 2006-4204
Abstract: Climate research relies heavily on good-quality instrumental data; for modeling efforts, gridded data are needed. So far, relatively little effort has been made to create gridded climate data for China. This is especially true for high-resolution daily data. This work focuses on identifying an accurate method to produce gridded daily precipitation in China based on the observed data at 753 stations for the period 1951-2005. Five interpolation methods, namely ordinary nearest neighbor, local polynomial, radial basis function, inverse distance weighting, and ordinary kriging, have been used and compared. Cross-validation shows that ordinary kriging based on seasonal semi-variograms gives the best performance, closely followed by inverse distance weighting with a power of 2. Finally, ordinary kriging is chosen to interpolate the station data to an 18 km×18 km grid system covering the whole country. Precipitation for each 0.5°×0.5° latitude-longitude block is then obtained by averaging the values at the grid nodes within the block. Owing to the higher station density in the eastern part of the country, the interpolation errors there are much smaller than those in the west (west of 100°E). Excluding 145 stations in the western region, the daily, monthly, and annual relative mean absolute errors of the interpolation for the remaining 608 stations are 74%, 29%, and 16%, respectively. The interpolated daily precipitation has been made available on the internet for the scientific community.
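The abstract above names inverse distance weighting with a power of 2 as the runner-up method and describes averaging grid-node values into 0.5° blocks. A minimal sketch of both steps follows. It is not the authors' implementation: the function names, the planar Euclidean distance, and the toy station values are assumptions made for illustration, and the ordinary kriging that the study finally adopted is not shown.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Inverse distance weighting estimate at `target`.

    `stations` is a list of (x, y, value) tuples and `target` is (x, y).
    Distances are planar Euclidean distances (an assumption; real station
    data would need geographic distances).
    """
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - tx, y - ty)
        if dist == 0.0:
            # The target coincides with a station: return its value directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def block_mean(node_values):
    """Average the interpolated values of the grid nodes inside one block."""
    return sum(node_values) / len(node_values)


# Hypothetical daily precipitation (mm) at two stations.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
nodes = [idw_interpolate(stations, (x, 0.0)) for x in (0.5, 1.0, 1.5)]
print(block_mean(nodes))
```

With a power of 2, a station twice as far away receives a quarter of the weight, which is why nearby stations dominate the estimate.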
Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2012CB417400) and the National Natural Science Foundation of China (Nos. 41421005, U1406401)
Abstract: The annual Rossby wave is a key component of the ENSO phenomenon in the equatorial Pacific Ocean. Due to the paucity of and seasonal bias in historical hydrographic data, previous studies on equatorial Rossby waves gave only qualitative descriptions. The accumulation of Argo measurements in recent years has greatly alleviated the data problem. In this study, the seasonal variation of the equatorial Pacific Ocean is examined with annual harmonic analysis of Argo gridded data. Results show that a strong seasonal signal is present in the western equatorial Pacific and explains more than 50% of the thermal variance below 500 m. Lag-correlation tracing further shows that this sub-thermocline seasonal signal originates from the eastern equatorial Pacific via downward and southwestward propagation of annual Rossby waves. Possible mechanisms for the equatorward shift of the Rossby wave path are also discussed.
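Annual harmonic analysis, as mentioned in the abstract above, fits a single one-cycle-per-year sinusoid to a time series and reports how much variance it explains. The sketch below is one standard way to do that for evenly sampled data spanning whole years. The function name, the monthly sampling, and the synthetic series are assumptions for illustration and are not taken from the study.

```python
import math


def annual_harmonic(series, samples_per_year=12):
    """Fit y(t) ~ mean + A*cos(omega*t - phase) with a one-year period.

    Assumes evenly spaced samples covering a whole number of years, so the
    cosine and sine basis functions are orthogonal and the least-squares
    coefficients reduce to simple projections.
    Returns (amplitude, phase in radians, explained variance fraction).
    """
    n = len(series)
    if n == 0 or n % samples_per_year != 0:
        raise ValueError("series must cover a whole number of years")
    mean = sum(series) / n
    omega = 2.0 * math.pi / samples_per_year
    a = 2.0 / n * sum((y - mean) * math.cos(omega * t) for t, y in enumerate(series))
    b = 2.0 / n * sum((y - mean) * math.sin(omega * t) for t, y in enumerate(series))
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a)
    total_var = sum((y - mean) ** 2 for y in series) / n
    # A sinusoid of amplitude A has variance A**2 / 2.
    explained = (amplitude ** 2 / 2.0) / total_var if total_var > 0 else 0.0
    return amplitude, phase, explained


# Synthetic two-year monthly temperature anomaly (hypothetical values).
omega = 2.0 * math.pi / 12
series = [5.0 + 2.0 * math.cos(omega * t - 0.5) for t in range(24)]
print(annual_harmonic(series))
```

Applied to every grid point and depth, the explained variance fraction is the kind of quantity behind a statement such as "explains more than 50% of the thermal variance".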
Abstract: The PACS concept was introduced in 1982. After more than twenty years of technical advancements, it has become an integrated component of today's healthcare delivery system. PACS is now beginning to be used as a clinical research tool. Among others, this paper describes four PACS-based research activities: medical imaging informatics, medical imaging Data Grid, combining PACS and teleradiology operations, and computer-assisted detection and diagnosis (CAD). In medical imaging informatics (MII), we first introduce its infrastructure and the five layers of its software architecture. A description follows of a new MII training program supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB), National Institutes of Health (NIH), USA. The training program accepts candidates with medical and/or biomedical engineering backgrounds. The goal is to cross-train multi-disciplinary individuals to be future leaders in the field of medical imaging informatics. Grid computing is a new paradigm combining computing, networking, information and storage technologies to advance conventional distributed computing to the next level. One resource in Grid Computing is the Data Grid. We describe the use of the Data Grid concept in medical imaging applications based on the five layers of the open source Globus Toolkit 4 (GT4). Three examples are given. First, a Data Grid specifically designed for PACS image backup and disaster recovery, developed at the Imaging and Informatics Laboratory (IPI), USC, is illustrated. The second application is for image-based clinical trials using three international sites: IPI, USC, USA; the PACS Lab, Hong Kong Polytechnic University, Hong Kong; and the Heart Institute, Sao Paulo, Brazil. In combining PACS and teleradiology operations, a Data Grid model is proposed to combine two disjoint, yet daily used, PACS and teleradiology operations into one integrated system at a large-scale enterprise level. Methods of combining workflows, storage, and reading of PACS and teleradiology images are detailed. The last work-in-progress research is the integration of CAD results with the daily PACS workflow. The integration methods are based on the DICOM Screen Captured and Structured Report Standards, and several IHE (Integrating the Healthcare Enterprise) Workflow Profiles.
Funding: Sponsored by the National Natural Science Foundation of China (61202354), the Hi-Tech Research and Development Program of China (2007AA01Z404), and the Scientific & Technological Support Project (Industry) of Jiangsu Province (BE2011189)
Abstract: In recent years, with the rapid development of data-intensive applications, data replication has become an enabling technology for the data grid to improve data availability and to reduce file transfer time and bandwidth consumption. The placement of replicas has been proven to be the most difficult problem that must be solved to realize data replication. This paper addresses the quality of service (QoS) aware replica placement problem in data grids, and proposes a dynamic programming based replica placement algorithm that not only guarantees the QoS requirement, but also minimizes the overall replication cost, including storage cost and communication cost. Simulation experiments show that the replica placement algorithm outperforms an existing popular replica placement technique in data grids.
Abstract: A Data Grid integrates geographically distributed resources for solving data-intensive scientific applications. Effective scheduling in a Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Scheduling is a traditional problem in parallel and distributed systems. However, due to the special issues and goals of the Grid, traditional approaches are no longer effective in this environment. Therefore, it is necessary to propose methods specialized for this kind of parallel and distributed system. Another solution is to use a data replication strategy to create multiple copies of files and store them in convenient locations to shorten file access times. To utilize these two concepts, in this paper we develop a job scheduling policy, called hierarchical job scheduling strategy (HJSS), and a dynamic data replication strategy, called advanced dynamic hierarchical replication strategy (ADHRS), to improve data access efficiency in a hierarchical Data Grid. HJSS uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers network characteristics, the number of jobs waiting in the queue, file locations, and the disk read speed of the storage drive at the data sources. Moreover, due to the limited storage capacity, a good replica replacement algorithm is needed. We present a novel replacement strategy which deletes files in two steps when free space is not enough for the new replica: first, it deletes those files with the minimum transfer time. Second, if space is still insufficient, it considers the last time the replica was requested, the number of accesses, the size of the replica and the file transfer time. The simulation results show that our proposed algorithm performs better than other algorithms in terms of job execution time, number of intercommunications, number of replications, hit ratio, computing resource usage and storage usage.
Funding: Supported by the National Natural Science Foundation of China (41490643 and 41675073); Jiangsu Provincial "333 Talents" Project; "Six Talents Highlands" Project; Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); Innovation Project of Jiangsu Province (KYLX16_0927)
Abstract: The present work investigates the possible impact of non-uniformity in observed land surface temperature on trend estimation, based on Climatic Research Unit (CRU) Temperature Version 4 (CRUTEM4) monthly temperature datasets from 1900 to 2012. The CRU land temperature data exhibit remarkable non-uniformity in their spatial and temporal features. The data are characterized by an uneven spatial distribution of missing records and station density, and display a significant increase in available sites around 1950. Considering the impact of missing data, the trends seem to be more stable and reliable when estimated from data with < 40% missing records than from data with more than 40% missing records. The mean absolute error (MAE) between data with < 40% missing records and the global data is only 0.011℃ (0.014℃) for 1900-50 (1951-2012). The associated trend estimated from the reliable data is 0.087℃ decade^-1 (0.186℃ decade^-1) for 1900-50 (1951-2012), almost the same as the trend of the global data. However, due to the non-uniform spatial distribution of missing data, the global signal seems to come mainly from the regions with good data coverage, especially for the period 1900-50. This is also confirmed by an extreme test conducted with the records in the United States and Africa. In addition, the influence of spatially and temporally non-uniform features in the observation data on trend estimation is significant for areas with poor data coverage, such as Africa, but insignificant for countries with good data coverage, such as the United States.
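The two summary statistics quoted in the abstract above, a trend in ℃ per decade and a mean absolute error between two series, can be sketched as follows. The abstract does not state which trend estimator was used, so the ordinary least-squares slope here is an assumption, as are the function names and the toy numbers.

```python
def linear_trend_per_decade(years, values):
    """Ordinary least-squares slope of `values` against `years`, per decade.

    Assumption: OLS is used only as a common default; the study's actual
    estimator is not given in the abstract.
    """
    n = len(years)
    year_mean = sum(years) / n
    value_mean = sum(values) / n
    sxx = sum((y - year_mean) ** 2 for y in years)
    sxy = sum((y - year_mean) * (v - value_mean) for y, v in zip(years, values))
    return 10.0 * sxy / sxx


def mean_absolute_error(series_a, series_b):
    """Mean absolute difference between two equally long series."""
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")
    return sum(abs(a - b) for a, b in zip(series_a, series_b)) / len(series_a)


# Hypothetical anomaly series warming by 0.02 degrees C per year.
years = list(range(1951, 1961))
anomalies = [0.02 * (y - 1951) for y in years]
print(linear_trend_per_decade(years, anomalies))
```

Comparing the trend of a subset of grid boxes (for example, those with < 40% missing records) with the trend of the full data set is then a matter of calling the same function on both series.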
Funding: Supported in part by the National Science Foundation of China (62272389); the Shenzhen Fundamental Research Program (20210317191843003); the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University (CX2022065); and the Gansu Science and Technology Association Young Science and Technology Talents Lifting Project (GXH20220530-10).
Abstract: To protect the privacy of power data, we usually encrypt data before outsourcing it to cloud servers. However, it is challenging to search over the encrypted data. In addition, we need to ensure that only authorized users can retrieve the power data. Attribute-based searchable encryption is an advanced technology for solving these problems. However, many existing schemes do not support a large universe, expressive access policies, or hidden access policies. In this paper, we propose an attribute-based keyword search encryption scheme for power data protection. Firstly, our proposed scheme supports encrypted data retrieval and achieves fine-grained access control. Only authorized users whose attributes satisfy the access policies can search and decrypt the encrypted data. Secondly, to satisfy the requirements of the power grid environment, the proposed scheme supports a large attribute universe and hidden access policies. The access policy in this scheme does not leak private information about users. Thirdly, the security analysis and performance analysis indicate that our scheme is efficient and practical. Furthermore, comparisons with other schemes demonstrate the advantages of our proposed scheme.
Abstract: Element-partition-based methods for the visualization of 3D unstructured grid data are presented. First, partition schemes for common elements, including curvilinear tetrahedra, pentahedra, hexahedra, etc., are given, so that complex elements can be divided into several rectilinear tetrahedra and the visualization processes can be simplified. Then, a slice method for cloud maps and an iso-surface method based on the partition schemes are described.
Funding: Supported by the Széchenyi 2020 programme, the European Regional Development Fund 'Investing in your future', and the Hungarian Government [grant number GINOP-2.3.2-15-2016-00028]; the Hungarian Scientific Research Fund [grant numbers FK-128709, K-129118]; Advanced research supporting the forestry and wood-processing sector's adaptation to global change and the 4th industrial revolution [grant number CZ.02.1.01/0.0/0.0/16_019/0000803], financed by Operational Programme Research, Development and Education; and the János Bolyai Research Scholarship of the Hungarian Academy of Sciences [grant numbers BO/00088/18/4 and BO/00254/20/10].
Abstract: Gridded model assessments require at least one climatic and one soil database for carrying out the simulations. There are several parallel soil and climate database development projects that provide sufficient, albeit considerably different, observation-based input data for crop-model-based impact studies. The input-database-related uncertainty of the Biome-BGCMuSo agro-environmental model outputs was investigated using three different gridded climatic databases and four different gridded soil databases, covering an area of nearly 100,000 km² with 1104 grid cells. Spatial, temporal, climate-database-selection and soil-database-selection related variances were calculated and compared for four model outputs obtained from 30-year-long simulations. The choice of the input database introduced model output variability comparable to the variability caused by the year-to-year change of the weather or by the spatial heterogeneity of the soil. Input database selection could be a decisive factor in carbon sequestration studies, as the soil carbon stock change estimates may suggest either that the simulated ecosystem is a carbon sink or, on the contrary, that it is a carbon source in the long run. Careful evaluation of input database quality seems to be an inevitable and highly relevant step towards more realistic plant production and carbon balance simulations.
Funding: National Basic Research Program of China (973 Program), No. 2013CBA01801; National Natural Science Foundation of China, No. 41161012
Abstract: Based on daily precipitation from a 0.5°×0.5° gridded dataset and from meteorological stations during 1961-2011, released by the National Meteorological Information Center, the reliability of this gridded precipitation dataset in South China was evaluated. Five precipitation indices recommended by the World Meteorological Organization (WMO) were selected to investigate the changes in precipitation extremes in South China. The results indicate that the bias between the gridded data interpolated to given stations and the corresponding observed data is limited: 50.64% of the stations have a bias between -10% and 0. The correlation coefficients between gridded data and observed data are generally above 0.80 in most parts. The averages of the precipitation indices show a significant spatial difference, with a drier northwest section and a wetter southeast section. The trend magnitudes of the maximum 5-day precipitation (RX5day), very wet day precipitation (R95), very heavy precipitation days (R20mm) and simple daily intensity index (SDII) are 0.17 mm·a^-1, 1.14 mm·a^-1, 0.02 d·a^-1 and 0.01 mm·d^-1·a^-1, respectively, while consecutive wet days (CWD) show a trend of -0.05 d·a^-1 during 1961-2011. There is spatial disparity in the trend magnitudes of the precipitation indices, and approximately 60.85%, 75.32% and 75.74% of the grid boxes show increasing trends for RX5day, SDII and R95, respectively. There are high correlations between the precipitation indices and total precipitation, which are statistically significant at the 0.01 level.
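Four of the indices named in the abstract above can be computed directly from a daily precipitation series. The sketch below follows the commonly used definitions, with a wet day taken as a day with at least 1 mm of precipitation. The function names and the sample series are assumptions for illustration. R95 is omitted because it needs a percentile threshold from a base period that the abstract does not specify.

```python
WET_DAY_MM = 1.0  # a wet day has at least 1 mm of precipitation


def rx5day(daily):
    """Maximum precipitation total over any 5 consecutive days (mm)."""
    if len(daily) < 5:
        raise ValueError("need at least 5 days")
    return max(sum(daily[i:i + 5]) for i in range(len(daily) - 4))


def r20mm(daily):
    """Number of days with precipitation of at least 20 mm."""
    return sum(1 for p in daily if p >= 20.0)


def sdii(daily):
    """Simple daily intensity index: wet-day total / wet-day count (mm/day)."""
    wet = [p for p in daily if p >= WET_DAY_MM]
    return sum(wet) / len(wet) if wet else 0.0


def cwd(daily):
    """Longest run of consecutive wet days."""
    longest = current = 0
    for p in daily:
        current = current + 1 if p >= WET_DAY_MM else 0
        longest = max(longest, current)
    return longest


# Hypothetical ten-day precipitation record (mm).
daily = [0.0, 5.0, 25.0, 30.0, 0.5, 2.0, 0.0, 1.0, 1.0, 1.0]
print(rx5day(daily), r20mm(daily), sdii(daily), cwd(daily))
```

In practice each index is computed per year and per grid box, and the trend magnitudes reported above are slopes of those annual values over 1961-2011.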