ETL (Extract-Transform-Load) usually includes three phases: extraction, transformation, and loading. In building a data warehouse, it plays the role of data injection and is the most time-consuming activity; it is therefore necessary to improve the performance of ETL. In this paper, a new ETL approach, TEL (Transform-Extract-Load), is proposed. The TEL approach applies virtual tables to realize the transformation stage before the extraction and loading stages, without a data staging area or staging database to store raw data extracted from each of the disparate source systems. The TEL approach reduces the data transmission load and improves query performance at the access layers. Experimental results based on our proposed benchmarks show that the TEL approach is feasible and practical.
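The transform-before-extract idea can be illustrated with a SQL view acting as the virtual table: the transformation is defined over the source and applied at read time, so no staging copy of the raw rows is materialized. This is only a minimal sketch of the TEL idea, not the paper's implementation; the schema and values are invented for illustration.

```python
import sqlite3

# Hypothetical source table; in the TEL spirit, the transformation is a
# virtual table (here: a SQL view) over the source, so no staging copy
# of the raw rows is materialized before loading.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE sales (id INTEGER, amount_cents INTEGER, region TEXT)")
src.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                [(1, 1250, "north"), (2, 990, "south"), (3, 410, "north")])

# Transformation stage expressed as a virtual table, ahead of extraction:
# currency conversion and filtering happen at query time.
src.execute("""
    CREATE VIEW sales_transformed AS
    SELECT id, amount_cents / 100.0 AS amount_usd, UPPER(region) AS region
    FROM sales WHERE amount_cents > 500
""")

# Extract + load reads directly from the view: rows arrive already
# transformed, so only qualifying, converted rows are transmitted.
rows = src.execute("SELECT id, amount_usd, region FROM sales_transformed").fetchall()
print(rows)  # → [(1, 12.5, 'NORTH'), (2, 9.9, 'SOUTH')]
```

Because the view is virtual, the raw row that fails the filter is never copied anywhere, which is the transmission saving the abstract claims.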
Virtualization is a common technology for resource sharing in data centers. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as virtual data centers, VDCs) onto the physical data center effectively. In this paper, we focus on this problem. Distinct from previous works, our study of the VDC embedding problem assumes that switch resources are the bottleneck of data center networks (DCNs). To this end, we not only propose a relative cost to evaluate embedding strategies and decouple the embedding problem into VM placement with marginal resource assignment and virtual link mapping with decided source-destination pairs based on the properties of the fat-tree, but also design the traffic-aware embedding algorithm (TAE) and first-fit virtual link mapping (FFLM) to map virtual data center requests onto a physical data center. Simulation results show that TAE+FFLM can increase the acceptance rate and reduce network cost (by about 49% in our case) at the same time. The traffic-aware embedding algorithm reduces the load of core-link traffic and creates optimization opportunities for data center network energy conservation.
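First-fit link mapping in the FFLM spirit can be sketched as follows: each virtual link is placed on the first candidate physical path whose every edge still has enough residual bandwidth. The topology, edge names, and demands below are illustrative; the paper's FFLM works with decided source-destination pairs on a fat-tree rather than this generic path list.

```python
# First-fit mapping of virtual links onto physical paths (generic sketch,
# not the paper's exact FFLM): take the first feasible path, reserve
# bandwidth, and reject the whole request if any link cannot be placed.
def first_fit_link_mapping(vlinks, paths, capacity):
    residual = dict(capacity)                  # edge -> remaining bandwidth
    mapping = {}
    for name, demand in vlinks:
        for path in paths:
            if all(residual[e] >= demand for e in path):
                for e in path:
                    residual[e] -= demand      # reserve bandwidth on the path
                mapping[name] = path
                break
        else:
            return None                        # request rejected: no feasible path
    return mapping

capacity = {"a-core": 10, "b-core": 10, "core-c": 20}
paths = [("a-core", "core-c"), ("b-core", "core-c")]
mapping = first_fit_link_mapping([("v1", 6), ("v2", 6)], paths, capacity)
print(mapping)
```

Here "v2" cannot reuse the first path (only 4 units remain on `a-core`), so first-fit falls through to the second path instead of rejecting the request.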
Resource scheduling is crucial to data centers. However, most previous works focus only on one-dimensional resource models, ignoring the fact that multiple resources are utilized simultaneously, including CPU, memory, and network bandwidth. As cloud computing allows uncoordinated and heterogeneous users to share a data center, competition for multiple resources has become increasingly severe. Motivated by the differences in integrated utilization obtained from different packing schemes, in this paper we treat the scheduling problem as a multi-dimensional combinatorial optimization problem with constraint satisfaction. Since the problem is NP-hard, we present Multiple-attribute-decision-based Integrated Resource Scheduling (MIRS), a novel heuristic algorithm to obtain an approximately optimal solution. Simulation results show that, in the face of various workload sets, our algorithm has significant superiority in terms of efficiency and performance compared with previous methods.
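A common baseline for the multi-dimensional packing described here is first-fit decreasing over (CPU, memory, bandwidth) demand vectors. The sketch below is that baseline, not the MIRS heuristic itself; capacities and demands are illustrative.

```python
# First-fit-decreasing vector packing: sort demands by total normalized
# size, place each on the first server with room in every dimension,
# opening a new server only when none fits.
def pack(demands, cap):
    servers = []                                  # remaining capacity per server
    order = sorted(demands, reverse=True,
                   key=lambda d: sum(x / c for x, c in zip(d, cap)))
    assignment = []                               # (demand, server index)
    for d in order:
        for i, free in enumerate(servers):
            if all(f >= x for f, x in zip(free, d)):
                servers[i] = tuple(f - x for f, x in zip(free, d))
                assignment.append((d, i))
                break
        else:                                     # no server fits: open a new one
            servers.append(tuple(c - x for c, x in zip(cap, d)))
            assignment.append((d, len(servers) - 1))
    return len(servers), assignment

n, assignment = pack([(4, 8, 1), (2, 2, 1), (4, 6, 1), (2, 2, 1)], cap=(8, 16, 2))
print(n)    # → 2
```

The point of the multi-dimensional view is visible even in this toy run: a placement must satisfy all three dimensions at once, so a server with spare CPU but exhausted bandwidth cannot take another VM.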
Based on an analysis of data centre (DC) traffic patterns, we introduce a holistic software-defined optical DC solution. An Architecture-on-Demand based hybrid optical switched (OPS/OCS) data centre network (DCN) fabric is introduced, which is able to realise different inter- and intra-cluster configurations and dynamically support diverse traffic in the DC. The optical DCN is controlled and managed by a software-defined networking (SDN) enabled control plane to achieve high programmability. Moreover, virtual data centre (VDC) composition is developed as an application of such a software-defined optical DC to create VDC slices for different tenants.
In the Internet of Vehicles (IoV), the security-threat information of various traffic elements can be exploited by hackers to attack vehicles, resulting in accidents and privacy leakage. Consequently, it is necessary to establish security-threat assessment architectures to evaluate the risks of traffic elements by managing and sharing security-threat information. Unfortunately, most assessment architectures process data in a centralized manner, causing delays in query services. To address this issue, in this paper a Hierarchical Blockchain-enabled Security threat Assessment Architecture (HBSAA) is proposed, utilizing edge chains and global chains to share data. In addition, data virtualization technology is introduced to manage multi-source heterogeneous data, and a metadata association model based on attribute graphs is designed to deal with complex data relationships. In order to provide high-speed query service, ant colony optimization of key nodes is designed, and the HBSAA prototype is developed and its performance tested. Experimental results on large-scale vulnerability data gathered from the NVD demonstrate that HBSAA not only shields data heterogeneity but also reduces service response time.
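The attribute-graph metadata idea can be sketched as nodes carrying attribute dicts and typed edges associating records from different sources. This is a toy model of the general technique only, and every identifier below (node names, the `affects` relation) is hypothetical, not HBSAA's actual schema.

```python
# Minimal attribute graph: nodes hold attributes, edges are typed, and a
# query walks typed edges to associate metadata across sources.
class AttrGraph:
    def __init__(self):
        self.nodes, self.edges = {}, []

    def add_node(self, nid, **attrs):
        self.nodes[nid] = attrs

    def add_edge(self, src, dst, rel):
        self.edges.append((src, dst, rel))

    def related(self, nid, rel):
        # All nodes reachable from nid over edges of the given type.
        return [d for s, d, r in self.edges if s == nid and r == rel]

g = AttrGraph()
g.add_node("CVE-2021-0001", severity="high", source="NVD")   # hypothetical IDs
g.add_node("vehicle-ecu-42", kind="ECU")
g.add_edge("CVE-2021-0001", "vehicle-ecu-42", "affects")
print(g.related("CVE-2021-0001", "affects"))    # → ['vehicle-ecu-42']
```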
Cloud computing has gained significant recognition due to its ability to provide a broad range of online services and applications. Nevertheless, existing commercial cloud computing models concentrate computational assets, such as storage and server infrastructure, in a limited number of large-scale data facilities worldwide. Optimizing the deployment of virtual machines (VMs) is crucial in this scenario to ensure system dependability, performance, and minimal latency. A significant barrier in the present scenario is load distribution, particularly when striving for improved energy consumption in a hypothetical grid computing framework. This design employs load-balancing techniques to allocate different user workloads across several virtual machines. To address this challenge, we propose the twin-fold moth flame technique, which serves as a highly effective optimization technique. The twin-fold moth flame method is designed to consider various constraints, including energy efficiency, lifespan analysis, and resource expenditures, and provides a thorough approach to evaluating total costs in the cloud computing environment. When assessing the efficacy of our suggested strategy, the study analyzes significant metrics such as energy efficiency, lifespan analysis, and resource expenditures. This investigation aims to enhance cloud computing techniques by developing a new optimization algorithm that considers multiple factors for effective virtual machine placement and load balancing. The proposed work demonstrates notable improvements of 12.15%, 10.68%, 8.70%, 13.29%, 18.46%, and 33.39% for 40-node data sets over the artificial bee colony-bat algorithm, ant colony optimization, crow search algorithm, krill herd, whale optimization genetic algorithm, and improved Lévy-based whale optimization algorithm, respectively.
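For readers unfamiliar with the base metaheuristic, plain moth-flame optimization (MFO) moves each candidate ("moth") along a logarithmic spiral toward a remembered elite position ("flame"). The 1-D toy below sketches only that base algorithm under invented parameters; the "twin-fold" variant and its energy/lifespan/cost model are not reproduced.

```python
import math
import random

# Plain MFO on a toy objective: moths spiral toward flames via
# x' = d * e^(b*t) * cos(2*pi*t) + flame, with a shrinking flame count
# and an elite memory of the best n positions seen so far.
def mfo(f, lo, hi, n=20, iters=80, b=1.0, seed=0):
    rng = random.Random(seed)
    moths = [rng.uniform(lo, hi) for _ in range(n)]
    flames = sorted(moths, key=f)                     # elite positions so far
    for it in range(iters):
        k = max(1, round(n - it * (n - 1) / iters))   # flame count shrinks
        for i in range(n):
            fl = flames[min(i, k - 1)]
            t = rng.uniform(-1, 1)
            d = abs(fl - moths[i])                    # distance to flame
            x = d * math.exp(b * t) * math.cos(2 * math.pi * t) + fl
            moths[i] = min(hi, max(lo, x))            # clamp to the search box
        flames = sorted(flames + moths, key=f)[:n]    # keep the best n
    return flames[0]

best = mfo(lambda x: (x - 3) ** 2, -10, 10)
print(best)
```

The elite memory means the reported best never regresses, which is what makes such population metaheuristics usable for cost models mixing energy, lifespan, and resource terms.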
The growth of generated data in industry requires new, efficient big data integration approaches for uniform data access by end-users to perform better business operations. Data virtualization systems, including Ontology-Based Data Access (OBDA), query data on-the-fly against the original data sources without any prior data materialization. Existing approaches by design use a fixed model, e.g., TABULAR, as the only virtual data model: a uniform schema built on-the-fly to load, transform, and join relevant data. Other data models, such as GRAPH or DOCUMENT, are more flexible and can thus be more suitable for some common types of queries, such as join or nested queries. The optimal model is hard to predict because it depends on many criteria, such as query plan, data model, data size, and operations. To address the problem of selecting the optimal virtual data model for queries on large datasets, we present a new approach that (1) builds on the principle of OBDA to query and join large heterogeneous data in a distributed manner and (2) calls a deep learning method to predict the optimal virtual data model using features extracted from SPARQL queries. OPTIMA, the implementation of our approach, currently leverages state-of-the-art big data technologies, Apache Spark and GraphX, implements two virtual data models, GRAPH and TABULAR, and supports five data source models out of the box: property graph, document-based, wide-columnar, relational, and tabular, stored in Neo4j, MongoDB, Cassandra, MySQL, and CSV files, respectively. Extensive experiments show that our approach returns the optimal virtual model with an accuracy of 0.831, yielding a reduction in query execution time of over 40% for tabular model selection and over 30% for graph model selection.
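The selection step can be illustrated with a hand-written stand-in for the learned predictor: extract a few shape features from a SPARQL string and apply a fixed rule. The feature set and threshold below are invented for illustration; the real system extracts richer features and uses a trained deep-learning model.

```python
# Toy virtual-data-model selector (hypothetical rule, not OPTIMA's model):
# count rough query-shape features, then lean GRAPH for join-heavy or
# nested shapes and TABULAR for flat ones.
def features(q):
    up = q.upper()
    return {"patterns": up.count(" . ") + 1,   # crude triple-pattern count
            "optionals": up.count("OPTIONAL"),
            "filters": up.count("FILTER")}

def choose_model(q):
    f = features(q)
    # Threshold is illustrative only; a trained classifier replaces this rule.
    return "GRAPH" if f["patterns"] + 2 * f["optionals"] >= 4 else "TABULAR"

q = "SELECT ?n WHERE { ?p a :Person . ?p :name ?n . ?p :knows ?q . ?q :name ?m }"
print(choose_model(q))    # → GRAPH
```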
This paper describes the modeling and simulation of the protocol of the CCSDS Advanced Orbiting Systems (AOS). The network features modeled in the implementation of CCSDS AOS multiplex different kinds of sources into virtual channel data units (VCDUs) in the data processing module. The emphasis of this work is placed on the algorithm for commutating VCDUs into physical channels in the form of a continuous data stream. The objective of modeling the CCSDS AOS protocol is to analyze the performance of the protocol when it is used to process various data.
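The commutation step can be sketched as round-robin multiplexing: frames from several virtual channels fill successive slots of one continuous physical stream, with an idle frame emitted when the scheduled channel has nothing ready. Channel names and payloads are illustrative, not the CCSDS field layout.

```python
from collections import deque

# Round-robin commutation of virtual channels into one physical stream;
# the stream must stay continuous, so empty channels yield idle frames.
IDLE = ("VC-IDLE", None)

def commutate(channels, slots):
    queues = {vc: deque(frames) for vc, frames in channels.items()}
    order = list(queues)
    stream = []
    for i in range(slots):
        vc = order[i % len(order)]             # fixed cyclic schedule
        stream.append((vc, queues[vc].popleft()) if queues[vc] else IDLE)
    return stream

stream = commutate({"VC0": ["telemetry-1", "telemetry-2"], "VC1": ["image-1"]},
                   slots=4)
print(stream)
```

Note the fourth slot: VC1 has run dry, so an idle frame keeps the physical stream continuous rather than stalling it.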
Cloud computing is becoming a key factor in the market day by day. Therefore, many companies are investing, or going to invest, in this sector to develop large data centers. These data centers not only consume more energy but also produce greenhouse gases. Because of the large amount of power consumed, data center providers turn to different types of power generators to increase their profit margin, which indirectly affects the environment. Several studies have been carried out to reduce the power consumption of a data center, and virtualization is one technique for doing so. These studies indicate that hardware plays a very important role: as the load increases, the power consumption of the CPU also increases. Therefore, by extending the study of virtualization to reduce power consumption, a hardware-based algorithm for virtual machine provisioning in a private cloud can significantly improve performance by treating hardware as one of the important factors.
VOFilter is an XML-based filter developed by the Chinese Virtual Observatory project to transform tabular data files from VOTable format into OpenDocument format. VOTable is an XML format defined for the exchange of tabular data in the context of the Virtual Observatory (VO). It is the first Proposed Recommendation defined by the International Virtual Observatory Alliance, and has obtained wide support from both the VO community and many astronomy projects. OpenOffice.org is a mature, open-source office application suite with the advantage of native support for the industry-standard OpenDocument XML file format. Using VOFilter, VOTable files can be loaded in OpenOffice.org Calc, a spreadsheet application, and then displayed and analyzed like other spreadsheet files. Here, VOFilter acts as a connector, bridging the coming VO with current office applications. We introduce the Virtual Observatory and the technical background of VOFilter. Its workflow, installation, and usage are presented. Existing problems and limitations are also discussed, together with future development plans.
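The tabular structure such a filter consumes can be sketched by reading a VOTable-style snippet into field names and rows. This is a simplified, namespace-free subset for illustration, not the full VOTable schema or VOFilter's XSLT-style pipeline.

```python
import xml.etree.ElementTree as ET

# Minimal VOTable-like document: FIELD elements name the columns,
# TR/TD elements carry the data rows.
doc = """<VOTABLE><RESOURCE><TABLE>
  <FIELD name="ra"/><FIELD name="dec"/>
  <DATA><TABLEDATA>
    <TR><TD>10.68</TD><TD>41.27</TD></TR>
    <TR><TD>83.82</TD><TD>-5.39</TD></TR>
  </TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""

root = ET.fromstring(doc)
fields = [f.get("name") for f in root.iter("FIELD")]
rows = [[td.text for td in tr.iter("TD")] for tr in root.iter("TR")]
print(fields, rows)
```

From this intermediate form, a converter can emit any spreadsheet representation, which is essentially the bridge role the abstract describes.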
HL-2A DAPS is a large-scale dedicated data system for the HL-2A tokamak. Technologies of networking, communication, data acquisition, data processing, real-time display, data management, and systems management have been used in this system. With higher-quality products and lower design costs, virtual instrumentation has been widely used in HL-2A DAPS. Virtual instrumentation combines mainstream commercial technologies with flexible software and a wide variety of measurement and control hardware, making it easy to create user-defined systems that meet the exact application needs of the experiment.
The authors of this paper have previously proposed the global virtual data space system (GVDS) to aggregate the scattered and autonomous storage resources in China's national supercomputer grid (the National Supercomputing Centers in Guangzhou, Jinan, and Changsha, the Shanghai Supercomputing Center, and the Computer Network Information Center of the Chinese Academy of Sciences) into a storage system that spans the wide area network (WAN), realizing unified management of global storage resources in China. At present, the GVDS has been successfully deployed in the China National Grid environment. However, when accessing and sharing remote data over the WAN, the GVDS causes redundant transmission of data and wastes a great deal of network bandwidth. In this paper, we propose an edge cache system as a supplementary system of the GVDS to improve the performance of upper-level applications accessing and sharing remote data. Specifically, we first design the architecture of the edge cache system, and then study the key technologies of this architecture: an edge cache index mechanism based on double-layer hashing, an edge cache replacement strategy based on the GDSF algorithm, request routing based on consistent hashing, and a cluster-membership maintenance method based on the SWIM protocol. The experimental results show that the edge cache system successfully implements the relevant operations (read, write, deletion, modification, etc.) and is compatible with the POSIX interface. Further, it can greatly reduce the amount of data transmission and increase the data access bandwidth when the accessed file is located in the edge cache, i.e., its performance is close to that of a network file system in the local area network (LAN).
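The GDSF replacement policy named above can be sketched in a few lines: each object's priority is clock + frequency * cost / size, the lowest-priority object is evicted, and the clock inflates to the evicted priority so long-resident objects age out. This is a generic textbook sketch with illustrative sizes, not the paper's implementation.

```python
# Greedy-Dual-Size-Frequency (GDSF) cache replacement sketch.
class GDSFCache:
    def __init__(self, capacity):
        self.capacity, self.used, self.clock = capacity, 0, 0.0
        self.items = {}                          # key -> (size, freq, priority)

    def access(self, key, size, cost=1.0):
        _, freq, _ = self.items.get(key, (size, 0, 0.0))
        if key not in self.items:                # make room for a new object
            while self.used + size > self.capacity and self.items:
                victim = min(self.items, key=lambda k: self.items[k][2])
                self.clock = self.items[victim][2]   # clock inflates on eviction
                self.used -= self.items.pop(victim)[0]
            self.used += size
        freq += 1
        self.items[key] = (size, freq, self.clock + freq * cost / size)

cache = GDSFCache(capacity=100)
cache.access("a", size=60)
cache.access("b", size=30)
cache.access("b", size=30)    # frequency boost protects the popular object
cache.access("c", size=50)    # forces eviction of the lowest-priority object
print(sorted(cache.items))    # → ['b', 'c']
```

The size term is what suits this policy to an edge cache: evicting one large, rarely-hit file frees room for several small, popular ones.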
Virtual machine (VM) allocation for multiple tenants is an important and challenging problem in providing efficient infrastructure services in cloud data centers. Tenants run applications on their allocated VMs, and the network distance between a tenant's VMs may considerably impact the tenant's Quality of Service (QoS). In this study, we define and formulate the multi-tenant VM allocation problem in cloud data centers, considering the VM requirements of different tenants and introducing the allocation goal of minimizing the sum of the VMs' network diameters over all tenants. Then, we propose a Layered Progressive resource allocation algorithm for multi-tenant cloud data centers based on the Multiple Knapsack Problem (LP-MKP). The LP-MKP algorithm uses a multi-stage layered progressive method for multi-tenant VM allocation and efficiently handles unprocessed tenants at each stage. This reduces resource fragmentation in cloud data centers, decreases the differences in QoS among tenants, and improves tenants' overall QoS in cloud data centers. We perform experiments to evaluate the LP-MKP algorithm and demonstrate that it can provide significant gains over other allocation algorithms.
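The layered-progressive idea can be sketched as a first stage that tries to place each tenant's whole VM group on a single server (network diameter zero, tightest fit first) and defers tenants that do not fit to a later stage. This is a sketch of the flavor only, under invented slot counts, not the paper's exact LP-MKP algorithm.

```python
# Stage-one placement sketch: whole tenant groups onto single servers,
# tightest feasible server first; unplaced tenants are deferred to the
# next stage (which in LP-MKP would relax the placement constraints).
def place_tenants(tenants, slots):
    free = list(slots)                 # free VM slots per server
    placed, deferred = {}, []
    for tenant, demand in sorted(tenants, key=lambda t: -t[1]):
        fits = [i for i in range(len(free)) if free[i] >= demand]
        best = min(fits, key=lambda i: free[i], default=None)   # tightest fit
        if best is None:
            deferred.append(tenant)
        else:
            free[best] -= demand
            placed[tenant] = best
    return placed, deferred

placed, deferred = place_tenants([("t1", 4), ("t2", 3), ("t3", 5)], slots=[6, 4])
print(placed, deferred)
```

Deferring "t2" instead of failing outright is the progressive part: each stage handles what the previous one could not, which is how the algorithm limits fragmentation.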
POTENTIAL is a virtual database machine based on general computing platforms, especially parallel computing platforms. It provides a complete solution for high-performance database systems through a 'virtual processor + virtual data bus + virtual memory' architecture. Virtual processors manage all CPU resources in the system, on which various operations run. The virtual data bus is responsible for managing data transmission between associated operations, forming the hinges of the entire system. Virtual memory provides efficient data storage and buffering mechanisms that conform to data reference behaviors in database systems. The architecture of POTENTIAL is very clear and has many good features, including high efficiency, scalability, extensibility, and portability.
We propose a novel technique to increase the confidentiality of an optical code division multiple access (OCDMA) system. A virtual user technique is analyzed and implemented to make an OCDMA system secure. Using this technique, an eavesdropper will never find an isolated authorized user's signal. When authorized users and virtual users transmit data synchronously and asynchronously, network security increases by 25% and 37.5%, respectively.
Cloud computing emerges as a new computing pattern that can provide elastic services for any user around the world. It offers good opportunities to solve large-scale scientific problems with less effort. Application deployment remains an important issue in clouds. Appropriate scheduling mechanisms can shorten the total completion time of an application and therefore improve the quality of service (QoS) for cloud users. Unlike current scheduling algorithms, which mostly focus on single-task allocation, we propose a deadline-based scheduling approach for data-intensive applications in clouds. It does not simply treat the total completion time of an application as the sum of all its subtasks' completion times. Not only is the computation capacity of each virtual machine (VM) considered, but the communication delay and data access latencies are also taken into account. Simulations show that our proposed approach has a decided advantage over two other algorithms.
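The key point, counting data movement alongside compute, can be sketched with a toy cost model: a subtask's finish time on a VM is compute time plus transfer time plus access latency, and the scheduler picks the VM minimizing that estimate subject to the deadline. All names, units, and numbers below are illustrative assumptions, not the paper's model.

```python
# Deadline-aware VM choice under a toy cost model: a "fast" VM can still
# lose to a "near" VM once data transfer and access latency are counted.
def finish_time(workload, data_mb, vm):
    compute = workload / vm["mips"]                    # compute time
    transfer = data_mb / vm["bw_mbps"] + vm["latency_s"]  # data movement
    return compute + transfer

def schedule(subtask, vms, deadline):
    best = min(vms, key=lambda vm: finish_time(*subtask, vm))
    t = finish_time(*subtask, best)
    return (best["name"], t) if t <= deadline else (None, t)

vms = [{"name": "vm-near", "mips": 500, "bw_mbps": 100, "latency_s": 0.01},
       {"name": "vm-fast", "mips": 2000, "bw_mbps": 10, "latency_s": 0.20}]
print(schedule((1000, 50), vms, deadline=3.0))
```

Here `vm-fast` computes four times faster but its slow data path makes its total estimate worse, so the data-aware scheduler correctly prefers `vm-near`.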
In this paper, we present the virtual knowledge graph (VKG) paradigm for data integration and access, also known in the literature as Ontology-Based Data Access. Instead of structuring the integration layer as a collection of relational tables, the VKG paradigm replaces the rigid structure of tables with the flexibility of graphs that are kept virtual and embed domain knowledge. We explain the main notions of this paradigm, its tooling ecosystem, and significant use cases in a wide range of applications. Finally, we discuss future research directions.
文摘ETL (Extract-Transform-Load) usually includes three phases: extraction, transformation, and loading. In building data warehouse, it plays the role of data injection and is the most time-consuming activity. Thus it is necessary to improve the performance of ETL. In this paper, a new ETL approach, TEL (Transform-Extract-Load) is proposed. The TEL approach applies virtual tables to realize the transformation stage before extraction stage and loading stage, without data staging area or staging database which stores raw data extracted from each of the disparate source data systems. The TEL approach reduces the data transmission load, and improves the performance of query from access layers. Experimental results based on our proposed benchmarks show that the TEL approach is feasible and practical.
基金This research was partially supported by the National Grand Fundamental Research 973 Program of China under Grant (No. 2013CB329103), Natural Science Foundation of China grant (No. 61271171), the Fundamental Research Funds for the Central Universities (ZYGX2013J002, ZYGX2012J004, ZYGX2010J002, ZYGX2010J009), Guangdong Science and Technology Project (2012B090500003, 2012B091000163, 2012556031).
文摘Virtualization is a common technology for resource sharing in data center. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as virtual data center, VDC) to the physical data center effectively. In this paper, we focus on this problem. Distinct with previous works, our study of VDC embedding problem is under the assumption that switch resource is the bottleneck of data center networks (DCNs). To this end, we not only propose relative cost to evaluate embedding strategy, decouple embedding problem into VM placement with marginal resource assignment and virtual link mapping with decided source-destination based on the property of fat-tree, but also design the traffic aware embedding algorithm (TAE) and first fit virtual link mapping (FFLM) to map virtual data center requests to a physical data center. Simulation results show that TAE+FFLM could increase acceptance rate and reduce network cost (about 49% in the case) at the same time. The traffie aware embedding algorithm reduces the load of core-link traffic and brings the optimization opportunity for data center network energy conservation.
基金supported in part by National Key Basic Research Program of China (973 program) under Grant No.2011CB302506Important National Science & Technology Specific Projects: Next-Generation Broadband Wireless Mobile Communications Network under Grant No.2011ZX03002-001-01Innovative Research Groups of the National Natural Science Foundation of China under Grant No.60821001
文摘Resource Scheduling is crucial to data centers. However, most previous works focus only on one-dimensional resource models which ignoring the fact that multiple resources simultaneously utilized, including CPU, memory and network bandwidth. As cloud computing allows uncoordinated and heterogeneous users to share a data center, competition for multiple resources has become increasingly severe. Motivated by the differences on integrated utilization obtained from different packing schemes, in this paper we take the scheduling problem as a multi-dimensional combinatorial optimization problem with constraint satisfaction. With NP hardness, we present Multiple attribute decision based Integrated Resource Scheduling (MIRS), and a novel heuristic algorithm to gain the approximate optimal solution. Refers to simulation results, in face of various workload sets, our algorithm has significant superiorities in terms of efficiency and performance compared with previous methods.
基金performed in the Projects " LIGHTNESS : Low latency and high throughput dynamic network infrastructures for high performance datacentre interconnects" (No. 318606) "COSIGN: Combining Optics and SDN In next Generation data centre Networks" (No. 619572) supported by European Commission FP7
文摘Based on the analysis of data centre(DC) traffic pattern, we introduced a holistic software-defined optical DC solution. Architecture-on-Demand based hybrid optical switched(OPS/OCS) data centre network(DCN) fabric is introduced, which is able to realise different inter-and intra-cluster configurations and dynamically support diverse traffic in the DC. The optical DCN is controlled and managed by a software-defined networking(SDN) enabled control plane to achieve high programmability. Moreover, virtual data centre(VDC) composition is developed as an application of such softwaredefined optical DC to create VDC slices for different tenants.
基金supported in part by the Science and Technology Project Program of Sichuan under Grant 2022YFG0022in part by the Science and Technology Research Program of Chongqing Municipal Education Commission under Grant KJZD-K202000602+1 种基金in part by the General Program of Natural Science Foundation of Chongqing under Grant cstc2020jcyj-msxmX1021in part by the Chongqing Natural Science Foundation of China under Grant cstc2020jcyj-msxmX0343.
文摘In Internet of Vehicles(IoV),the security-threat information of various traffic elements can be exploited by hackers to attack vehicles,resulting in accidents,privacy leakage.Consequently,it is necessary to establish security-threat assessment architectures to evaluate risks of traffic elements by managing and sharing securitythreat information.Unfortunately,most assessment architectures process data in a centralized manner,causing delays in query services.To address this issue,in this paper,a Hierarchical Blockchain-enabled Security threat Assessment Architecture(HBSAA)is proposed,utilizing edge chains and global chains to share data.In addition,data virtualization technology is introduced to manage multi-source heterogeneous data,and a metadata association model based on attribute graph is designed to deal with complex data relationships.In order to provide high-speed query service,the ant colony optimization of key nodes is designed,and the HBSAA prototype is also developed and the performance is tested.Experimental results on the large-scale vulnerabilities data gathered from NVD demonstrate that the HBSAA not only shields data heterogeneity,but also reduces service response time.
基金This work was supported in part by the Natural Science Foundation of the Education Department of Henan Province(Grant 22A520025)the National Natural Science Foundation of China(Grant 61975053)the National Key Research and Development of Quality Information Control Technology for Multi-Modal Grain Transportation Efficient Connection(2022YFD2100202).
文摘Cloud computing has gained significant recognition due to its ability to provide a broad range of online services and applications.Nevertheless,existing commercial cloud computing models demonstrate an appropriate design by concentrating computational assets,such as preservation and server infrastructure,in a limited number of large-scale worldwide data facilities.Optimizing the deployment of virtual machines(VMs)is crucial in this scenario to ensure system dependability,performance,and minimal latency.A significant barrier in the present scenario is the load distribution,particularly when striving for improved energy consumption in a hypothetical grid computing framework.This design employs load-balancing techniques to allocate different user workloads across several virtual machines.To address this challenge,we propose using the twin-fold moth flame technique,which serves as a very effective optimization technique.Developers intentionally designed the twin-fold moth flame method to consider various restrictions,including energy efficiency,lifespan analysis,and resource expenditures.It provides a thorough approach to evaluating total costs in the cloud computing environment.When assessing the efficacy of our suggested strategy,the study will analyze significant metrics such as energy efficiency,lifespan analysis,and resource expenditures.This investigation aims to enhance cloud computing techniques by developing a new optimization algorithm that considers multiple factors for effective virtual machine placement and load balancing.The proposed work demonstrates notable improvements of 12.15%,10.68%,8.70%,13.29%,18.46%,and 33.39%for 40 count data of nodes using the artificial bee colony-bat algorithm,ant colony optimization,crow search algorithm,krill herd,whale optimization genetic algorithm,and improved Lévy-based whale optimization algorithm,respectively.
基金the financial support of Fraunhofer Cluster of Excellence (CCIT)
文摘The growth of generated data in the industry requires new efficient big data integration approaches for uniform data access by end-users to perform better business operations.Data virtualization systems,including Ontology-Based Data Access(ODBA)query data on-the-fly against the original data sources without any prior data materialization.Existing approaches by design use a fixed model e.g.,TABULAR as the only Virtual Data Model-a uniform schema built on-the-fly to load,transform,and join relevant data.While other data models,such as GRAPH or DOCUMENT,are more flexible and,thus,can be more suitable for some common types of queries,such as join or nested queries.Those queries are hard to predict because they depend on many criteria,such as query plan,data model,data size,and operations.To address the problem of selecting the optimal virtual data model for queries on large datasets,we present a new approach that(1)builds on the principal of OBDA to query and join large heterogeneous data in a distributed manner and(2)calls a deep learning method to predict the optimal virtual data model using features extracted from SPARQL queries.OPTIMA-implementation of our approach currently leverages state-of-the-art Big Data technologies,Apache-Spark and Graphx,and implements two virtual data models,GRAPH and TABULAR,and supports out-of-the-box five data sources models:property graph,document-based,e.g.,wide-columnar,relational,and tabular,stored in Neo4j,MongoDB,Cassandra,MySQL,and CSV respectively.Extensive experiments show that our approach is returning the optimal virtual model with an accuracy of 0.831,thus,a reduction in query execution time of over 40%for the tabular model selection and over 30%for the graph model selection.
文摘This paper describes the modeling and simulation of the protocol of CCSDS advanced orbiting systems (AOS). The network features modeled in the implementation of CCSDS AOS are to multiplex different kinds of sources into virtual channel data units ( VCDUs) in the data processing module. The emphasis of this work is placed on the algorithm for com-mutating VCDUs into physical channels in the form of continuous data stream. The objectives of modeling CCSDS AOS protocol are to analyze the performance of this protocol when it is used to process various data.
Funding: Supported by the National Research Foundation (NRF) of Korea through contract N-14-NMIR06.
Abstract: Cloud computing is becoming a key market factor day by day, so many companies are investing, or planning to invest, in this sector to develop large data centers. These data centers not only consume large amounts of energy but also produce greenhouse gases. Because of this large power consumption, data center providers turn to different types of power generation to increase their profit margin, which indirectly harms the environment. Several studies have been carried out to reduce the power consumption of a data center; one such technique is virtualization. These studies indicate that hardware plays a very important role: as the load increases, the power consumption of the CPU also increases. Therefore, extending the study of virtualization to reduce power consumption, a hardware-based algorithm for virtual machine provisioning in a private cloud can significantly improve performance by treating hardware as one of the important factors.
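The idea of hardware-aware provisioning can be sketched as placing each VM on the host whose estimated power increase is smallest. The convex power curve and the numbers below are illustrative assumptions, not the paper's actual model or algorithm.

```python
def power_draw(util, p_idle=100.0, p_max=250.0):
    """Hypothetical CPU power model in watts: convex in utilization, so the
    marginal cost of adding load grows as a host fills up."""
    return p_idle + (p_max - p_idle) * util ** 2

def place_vm(hosts, vm_load):
    """Pick the host whose power increase from adding vm_load is smallest,
    skipping hosts that would exceed full CPU utilization.
    hosts: name -> current CPU utilization in [0, 1]."""
    best, best_delta = None, float("inf")
    for name, util in hosts.items():
        if util + vm_load > 1.0:
            continue  # host cannot absorb this VM
        delta = power_draw(util + vm_load) - power_draw(util)
        if delta < best_delta:
            best, best_delta = name, delta
    return best

hosts = {"h1": 0.30, "h2": 0.85, "h3": 0.60}
chosen = place_vm(hosts, 0.25)  # h2 is too full; h1's power delta is lowest
```

With a convex curve the lightly loaded host wins; with a linear model all feasible hosts would tie, which is why the shape of the power model matters for this kind of heuristic.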
Funding: Supported by the National Natural Science Foundation of China.
Abstract: VOFilter is an XML-based filter developed by the Chinese Virtual Observatory project to transform tabular data files from VOTable format into OpenDocument format. VOTable is an XML format defined for the exchange of tabular data in the context of the Virtual Observatory (VO). It is the first Proposed Recommendation defined by the International Virtual Observatory Alliance and has obtained wide support from both the VO community and many astronomy projects. OpenOffice.org is a mature, open-source office application suite with the advantage of native support for the industry-standard OpenDocument XML file format. Using VOFilter, VOTable files can be loaded in OpenOffice.org Calc, a spreadsheet application, and then displayed and analyzed like other spreadsheet files. Here, VOFilter acts as a connector, bridging the emerging VO with current office applications. We introduce the Virtual Observatory and the technical background of VOFilter, and present its workflow, installation, and usage. Existing problems and limitations are also discussed, together with future development plans.
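At its core, a filter like VOFilter reads the tabular payload of a VOTable document and rewrites it as spreadsheet cells. The sketch below parses the table content of a simplified, namespace-free VOTable snippet; real VOTable files carry the IVOA XML namespace and richer metadata, and VOFilter itself emits OpenDocument XML rather than Python lists.

```python
import xml.etree.ElementTree as ET

VOTABLE = """<VOTABLE>
 <RESOURCE><TABLE>
  <FIELD name="ra"/><FIELD name="dec"/>
  <DATA><TABLEDATA>
   <TR><TD>10.68</TD><TD>41.27</TD></TR>
   <TR><TD>83.82</TD><TD>-5.39</TD></TR>
  </TABLEDATA></DATA>
 </TABLE></RESOURCE>
</VOTABLE>"""

def votable_rows(xml_text):
    """Extract column names (FIELD) and data rows (TR/TD) from a simplified
    VOTable document: the same tabular content a filter would map to
    spreadsheet columns and cells."""
    root = ET.fromstring(xml_text)
    names = [f.get("name") for f in root.iter("FIELD")]
    rows = [[td.text for td in tr.iter("TD")] for tr in root.iter("TR")]
    return names, rows

names, rows = votable_rows(VOTABLE)
# names -> ['ra', 'dec']; rows -> [['10.68', '41.27'], ['83.82', '-5.39']]
```

The remaining work of such a filter is serialization: writing these names and rows into the OpenDocument spreadsheet XML that Calc opens natively.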
Abstract: HL-2A DAPS is a large-scale, special-purpose data system for the HL-2A tokamak. Technologies for networking, communication, data acquisition, data processing, real-time display, data management, and systems management have been used in this system. With higher-quality products and lower design costs, virtual instrumentation has been widely used in HL-2A DAPS. Virtual instrumentation combines mainstream commercial technologies with flexible software and a wide variety of measurement and control hardware, making it easy to create user-defined systems that meet the exact application needs of the experiment.
Funding: Supported by the National Key Research and Development Program of China (2018YFB0203901), the National Natural Science Foundation of China (Grant No. 61772053), the Hebei Youth Talents Support Project (BJ2019008), and the Natural Science Foundation of Hebei Province (F2020204003).
Abstract: The authors of this paper have previously proposed the Global Virtual Data Space System (GVDS) to aggregate the scattered and autonomous storage resources of China's national supercomputer grid (the National Supercomputing Centers in Guangzhou, Jinan, and Changsha, the Shanghai Supercomputing Center, and the Computer Network Information Center of the Chinese Academy of Sciences) into a storage system that spans the wide area network (WAN) and realizes unified management of global storage resources in China. At present, the GVDS has been successfully deployed in the China National Grid environment. However, when accessing and sharing remote data over the WAN, the GVDS causes redundant data transmission and wastes a large amount of network bandwidth. In this paper, we propose an edge cache system as a supplement to the GVDS to improve the performance of upper-level applications that access and share remote data. Specifically, we first design the architecture of the edge cache system and then study the key technologies of this architecture: an edge cache index mechanism based on double-layer hashing, an edge cache replacement strategy based on the GDSF algorithm, request routing based on consistent hashing, and cluster membership maintenance based on the SWIM protocol. The experimental results show that the edge cache system successfully implements the relevant operations (read, write, delete, modify, etc.) and is compatible with the POSIX interface. Furthermore, it greatly reduces the amount of data transmitted and increases the data access bandwidth when the accessed file resides in the edge cache; that is, its performance is close to that of a network file system in a local area network (LAN).
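Of the techniques listed above, the GDSF (Greedy-Dual-Size-Frequency) replacement strategy is concrete enough to sketch: each cached object gets a priority of clock + frequency x cost / size, the lowest-priority object is evicted, and the clock is advanced to the evicted priority so long-resident objects age out. This is a generic GDSF sketch under assumed unit costs, not the paper's implementation.

```python
class GDSFCache:
    """Greedy-Dual-Size-Frequency replacement sketch:
    priority = clock + freq * cost / size; evict the lowest priority and
    advance the clock to it, so stale entries gradually lose protection."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.clock = 0.0
        self.items = {}  # key -> [size, freq, priority]

    def access(self, key, size, cost=1.0):
        if key in self.items:
            entry = self.items[key]
            entry[1] += 1                                  # bump frequency
            entry[2] = self.clock + entry[1] * cost / entry[0]
            return True                                    # cache hit
        while self.used + size > self.capacity and self.items:
            victim = min(self.items, key=lambda k: self.items[k][2])
            self.clock = self.items[victim][2]             # age the cache
            self.used -= self.items[victim][0]
            del self.items[victim]
        self.items[key] = [size, 1, self.clock + cost / size]
        self.used += size
        return False                                       # cache miss

cache = GDSFCache(capacity=10)
cache.access("a", size=4)   # miss
cache.access("a", size=4)   # hit; "a" gains priority
cache.access("b", size=6)   # miss; cache now full
cache.access("c", size=4)   # miss; evicts "b" (lowest priority), keeps "a"
```

The size term makes GDSF favor evicting large, rarely used objects, which suits an edge cache holding files of very different sizes.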
Funding: Supported in part by the National Key Basic Research and Development (973) Program of China (No. 2011CB302600), the National Natural Science Foundation of China (No. 61222205), the Program for New Century Excellent Talents in University, and the Fok Ying-Tong Education Foundation (No. 141066).
Abstract: Virtual machine (VM) allocation for multiple tenants is an important and challenging problem in providing efficient infrastructure services in cloud data centers. Tenants run applications on their allocated VMs, and the network distance between a tenant's VMs may considerably impact the tenant's Quality of Service (QoS). In this study, we define and formulate the multi-tenant VM allocation problem in cloud data centers, considering the VM requirements of different tenants and introducing the allocation goal of minimizing the sum of the network diameters of all tenants' VMs. Then, we propose a Layered Progressive resource allocation algorithm for multi-tenant cloud data centers based on the Multiple Knapsack Problem (LP-MKP). The LP-MKP algorithm uses a multi-stage layered progressive method for multi-tenant VM allocation and efficiently handles unprocessed tenants at each stage. This reduces resource fragmentation in cloud data centers, decreases the differences in QoS among tenants, and improves tenants' overall QoS in cloud data centers. We perform experiments to evaluate the LP-MKP algorithm and demonstrate that it can provide significant gains over other allocation algorithms.
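The flavor of the problem can be shown with a greedy toy version: keep each tenant's VMs inside a single rack (minimal network diameter) when possible, and carry tenants that do not fit forward to a later stage, loosely mirroring LP-MKP's layered progression. This is a deliberately simplified sketch, not the LP-MKP algorithm itself.

```python
def place_tenants(racks, tenants):
    """Greedy multi-tenant placement sketch: largest tenants first, each
    placed entirely in one rack if any rack has room; tenants that fit
    nowhere are returned as unplaced for a later allocation stage.
    racks: name -> free VM slots; tenants: name -> VMs requested."""
    placement, unplaced = {}, []
    for tenant, demand in sorted(tenants.items(), key=lambda t: -t[1]):
        rack = next((r for r, free in racks.items() if free >= demand), None)
        if rack is None:
            unplaced.append(tenant)   # handled by a later, looser stage
            continue
        racks[rack] -= demand
        placement[tenant] = rack
    return placement, unplaced

racks = {"r1": 8, "r2": 4}
tenants = {"t1": 6, "t2": 4, "t3": 5}
placement, unplaced = place_tenants(racks, tenants)
# t1 fills r1, t2 fills r2; t3 fits in no single rack and is deferred
```

A later stage of a real algorithm would split deferred tenants like t3 across nearby racks, accepting a larger network diameter only when unavoidable.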
基金This work is supported by the National .'863' High-Tech Programme under grant! No.863-306-02-04-1the National Natural Scienc
Abstract: POTENTIAL is a virtual database machine based on general computing platforms, especially parallel computing platforms. It provides a complete solution for high-performance database systems through a 'virtual processor + virtual data bus + virtual memory' architecture. Virtual processors manage all CPU resources in the system, and various operations run on them. The virtual data bus is responsible for managing data transmission between associated operations and forms the hinge of the entire system. Virtual memory provides efficient data storage and buffering mechanisms that conform to data-reference behavior in database systems. The architecture of POTENTIAL is very clear and has many good properties, including high efficiency, scalability, extensibility, and portability.
Abstract: We propose a novel technique to increase the confidentiality of an optical code-division multiple-access (OCDMA) system. A virtual-user technique is analyzed and implemented to make an OCDMA system secure. Using this technique, an eavesdropper will never find an isolated authorized user's signal. When authorized users and virtual users transmit data synchronously and asynchronously, network security increases by 25% and 37.5%, respectively.
Funding: Supported by the National Natural Science Foundation of China (51507084), the NUPTSF (NY214203), and the Natural Science Foundation for Colleges and Universities in Jiangsu Province (14KJB120009).
Abstract: Cloud computing is emerging as a new computing paradigm that can provide elastic services to users around the world, offering good opportunities to solve large-scale scientific problems with less effort. Application deployment remains an important issue in clouds; appropriate scheduling mechanisms can shorten the total completion time of an application and therefore improve the quality of service (QoS) for cloud users. Unlike current scheduling algorithms, which mostly focus on single-task allocation, we propose a deadline-based scheduling approach for data-intensive applications in clouds. It does not simply treat the total completion time of an application as the sum of all its subtasks' completion times: not only is the computation capacity of each virtual machine (VM) considered, but the communication delay and data-access latencies are also taken into account. Simulations show that our proposed approach has a decided advantage over the two other algorithms evaluated.
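The core estimate described above, subtask completion time as data-transfer time plus compute time, can be sketched as picking the cheapest VM that still meets a subtask's deadline. The cost model, VM names, and numbers are hypothetical; the paper's scheduler additionally handles task ordering across the whole application.

```python
def pick_vm(vms, work, data_size, deadline):
    """Choose the cheapest VM that can finish a subtask before its deadline,
    where completion time = data-transfer time + compute time.
    vms: name -> (speed in ops/s, bandwidth in MB/s, cost per second)."""
    feasible = []
    for name, (speed, bandwidth, cost) in vms.items():
        t = data_size / bandwidth + work / speed  # transfer + compute
        if t <= deadline:
            feasible.append((cost * t, name, t))
    if not feasible:
        return None                               # deadline cannot be met
    _, name, t = min(feasible)                    # cheapest feasible VM
    return name, t

vms = {
    "small": (1e9, 50.0, 0.02),   # slow and cheap
    "large": (4e9, 200.0, 0.10),  # fast and pricey
}
choice = pick_vm(vms, work=2e9, data_size=500.0, deadline=8.0)
# "small" would need 12 s and misses the 8 s deadline; "large" finishes in 3 s
```

Ignoring the transfer term would make "small" look feasible on compute time alone (2 s), which is exactly the mistake that deadline-aware, data-intensive scheduling avoids.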
Abstract: In this paper, we present the virtual knowledge graph (VKG) paradigm for data integration and access, also known in the literature as Ontology-Based Data Access. Instead of structuring the integration layer as a collection of relational tables, the VKG paradigm replaces the rigid structure of tables with the flexibility of graphs that are kept virtual and embed domain knowledge. We explain the main notions of this paradigm, its tooling ecosystem, and significant use cases in a wide range of applications. Finally, we discuss future research directions.