Funding: This work was supported by the Serbian Ministry of Science and Education (project TR-32022) and by the companies Telekom Srbija and Informatika.
Abstract: Data center networks may comprise tens or hundreds of thousands of nodes and naturally suffer from frequent software and hardware failures as well as link congestion. Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays. In such dynamic networks, links frequently fail or become congested, making recalculation of the shortest paths a computationally intensive problem. Various routing protocols have been proposed to overcome this problem by focusing on network utilization rather than speed. Surprisingly, the design of fast shortest-path algorithms for data centers has been largely neglected, even though such algorithms are universal components of routing protocols. Moreover, parallelization techniques have mostly been deployed for random network topologies rather than for the regular topologies commonly found in data centers. The aim of this paper is to improve scalability and reduce the time required for shortest-path calculation in data center networks through parallelization on general-purpose hardware. We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
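As a rough illustration of the edge-relaxation idea, the sketch below runs Bellman-Ford-style rounds in which all edge relaxations of a round are computed as one vectorized (data-parallel) NumPy operation; the graph, weights, and round structure are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def parallel_bellman_ford(num_nodes, edges, weights, source):
    """Shortest paths via rounds of bulk edge relaxations.

    All relaxations within a round are computed as one vectorized
    operation; this per-round structure is what a multi-core or SIMD
    implementation would exploit.
    """
    u, v = edges[:, 0], edges[:, 1]          # edge endpoints
    dist = np.full(num_nodes, np.inf)
    dist[source] = 0.0
    for _ in range(num_nodes - 1):           # at most |V|-1 rounds
        candidate = dist[u] + weights        # relax every edge at once
        new_dist = dist.copy()
        np.minimum.at(new_dist, v, candidate)
        if np.array_equal(new_dist, dist):   # converged early
            break
        dist = new_dist
    return dist

# Tiny directed example: edges 0->1, 1->2, 0->2.
edges = np.array([[0, 1], [1, 2], [0, 2]])
weights = np.array([1.0, 1.0, 3.0])
print(parallel_bellman_ford(3, edges, weights, source=0))  # [0. 1. 2.]
```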
Abstract: A DC (data center) demands air-conditioning power amounting to 1/3-1/2 of its total electricity consumption. Energy saving in DC cooling therefore has a considerable effect from both economic and environmental viewpoints. PV (photovoltaic) generation and absorption refrigerators driven by CGS (cogeneration systems) or gas boilers are possible power-saving options. The waste warm air from a DC can also be utilized for greenhouse heating when the DC and the greenhouse are located close together in the suburbs. In this study, the authors develop an energy network model to assess the potential contribution of a DC, as a major consumer of electric power and chilled air and as a supplier of warm air within a district, to energy efficiency improvement. Utilization of the evaporation heat of LNG (liquefied natural gas) is also considered alongside PV and CGS. The model is applied to an urban area in Tokyo that includes an athletic center, shops, and a hospital, and to a suburban area including a greenhouse, and the two cases are compared.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52176180) and by the "open competition mechanism to select the best candidates" key technology project of Liaoning (Grant 2022JH1/10800008).
Abstract: Fault detection and diagnosis are essential for the air conditioning systems of data centers, improving reliability and reducing energy consumption. This study proposes a convolutional neural network (CNN) based data-driven fault detection and diagnosis model that considers temporal dependency for a composite air conditioning system capable of cooling the high heat flux in data centers. The input of the fault detection and diagnosis model is an unsteady dataset generated by an experimentally validated transient mathematical model. The dataset covers three typical faults: refrigerant leakage, evaporator fan breakdown, and condenser fouling. The CNN model is trained to construct a map between the input and the system operating conditions. The performance of the CNN model is then validated by comparison with a support vector machine and a neural network. Finally, the score-weighted class activation mapping method is used to interpret the model's diagnosis mechanism and to identify key input features in various operating modes. The results demonstrate that in the pump-driven heat pipe mode, the accuracy of the CNN model is 99.14%, around 8.5% higher than that of the other two methods. In the vapor compression mode, the accuracy of the CNN model reaches 99.9%, and the miss rate for refrigerant leakage is reduced by at least 61% in comparison. The score-weighted class activation mapping results indicate that the ambient temperature and actuator-related parameters, such as the compressor frequency in vapor compression mode and the condenser fan frequency in pump-driven heat pipe mode, are essential features for system fault detection and diagnosis.
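For readers unfamiliar with the approach, here is a minimal sketch of a 1-D CNN classifier over sliding windows of sensor readings, written in PyTorch; the sensor count, window length, layer sizes, and the four classes (normal plus the three faults) are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Minimal 1-D CNN over sliding windows of sensor readings.
# Sensor count, window length, and layer sizes are illustrative only.
class FddCnn(nn.Module):
    def __init__(self, num_sensors=16, num_classes=4):  # normal + 3 faults
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(num_sensors, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),       # collapse the time axis
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):                  # x: (batch, num_sensors, time)
        return self.classifier(self.features(x).squeeze(-1))

model = FddCnn()
window = torch.randn(8, 16, 60)            # 8 windows, 16 sensors, 60 time steps
logits = model(window)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 4, (8,)))
loss.backward()                            # one standard supervised training step
print(logits.shape)                        # torch.Size([8, 4])
```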
Abstract: With the promotion of the "dual carbon" strategy, data center (DC) access to high-penetration renewable energy sources (RESs) has become an industry trend. However, the uncertainty of RES poses challenges to the safe and stable operation of DCs and power grids. In this paper, a multi-timescale optimal scheduling model is established for interconnected data centers (IDCs) based on model predictive control (MPC), comprising day-ahead optimization, intraday rolling optimization, and intraday real-time correction. The day-ahead stage minimizes operating cost, the rolling optimization stage minimizes intraday economic cost, and the real-time correction stage minimizes power fluctuation, eliminating the impact of prediction errors through coordinated multi-timescale optimization. Simulation results show that the economic loss is reduced by 19.6% and the power fluctuation is decreased by 15.23%.
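A minimal sketch of the rolling-horizon (MPC-style) pattern described above, using cvxpy: at every step a short-horizon economic dispatch is re-solved with the latest renewable forecast and only the first decision is applied. The load, price, forecast, and battery model are toy assumptions, not the paper's IDC model.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
T, H = 24, 6                           # simulation steps, rolling horizon length
load = 50 + 10 * rng.random(T + H)     # MW, illustrative IT + cooling load
res_forecast = 30 * rng.random(T + H)  # MW, renewable forecast
price = 40 + 20 * rng.random(T + H)    # $/MWh

soc, soc_max, p_bat = 20.0, 40.0, 10.0  # simple battery: state, capacity, power limit
applied = []

for t in range(T):
    grid = cp.Variable(H, nonneg=True)          # purchased grid power
    ch = cp.Variable(H, nonneg=True)            # battery charge
    dis = cp.Variable(H, nonneg=True)           # battery discharge
    e = soc + cp.cumsum(ch - dis)               # stored energy trajectory
    cons = [grid + res_forecast[t:t+H] + dis - ch == load[t:t+H],
            ch <= p_bat, dis <= p_bat, e >= 0, e <= soc_max]
    cp.Problem(cp.Minimize(price[t:t+H] @ grid), cons).solve()
    applied.append(float(grid.value[0]))        # apply only the first decision
    soc += float(ch.value[0] - dis.value[0])    # roll the horizon forward

print(f"average purchased power: {np.mean(applied):.1f} MW")
```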
Funding: Supported by the Project of the Shanghai Municipal Science and Technology Commission (No. 22DZ2291100), the National Natural Science Foundation of China (No. 51976062), and the Opening Project of the Key Laboratory of Heat Transfer Enhancement and Energy Conservation of the Ministry of Education (South China University of Technology, No. 202000105).
Abstract: Data centers, as the infrastructure of all information services, consume a tremendous amount of energy. Reducing the hot spot temperature in the data center room helps prevent overheating of devices and increases cooling system efficiency. In this paper, we study the problem of optimally distributing power among racks to minimize the hot spot temperature. The temperature rise matrix (TRM) model is used for fast estimation of the thermal environment, and its accuracy is evaluated with computational fluid dynamics (CFD) simulations. Using the TRM model, the optimal distribution of heating power is cast as a linear programming problem, which can be solved by highly efficient algorithms such as the simplex method. Furthermore, with realistic constraints including rack idle power and a power upper limit, an iterative method is proposed to calculate the optimal power distribution along with the optimal on/off states of the racks. The obtained solutions are discussed and validated by comparison with CFD simulations. Results show that the TRM model is acceptable for evaluating temperature rises in forced-convection-dominated scenarios, and the proposed method is able to obtain optimal power distributions under various levels of total power demand.
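A minimal sketch of the linear program described above, assuming a made-up temperature rise matrix R so that rack inlet temperatures are T_supply + R p: the hot spot temperature is minimized subject to a total power demand and per-rack bounds (the paper's iterative on/off extension is not included).

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative temperature rise matrix R (degC per kW): T = T_supply + R @ p.
R = np.array([[0.08, 0.02, 0.01],
              [0.02, 0.09, 0.02],
              [0.01, 0.02, 0.10]])
n = R.shape[0]
T_supply, P_total = 18.0, 60.0            # degC, kW
p_idle, p_max = 5.0, 30.0                 # per-rack power bounds, kW

# Variables: p_1..p_n and t (the hot spot temperature to minimize).
c = np.r_[np.zeros(n), 1.0]                       # minimize t
A_ub = np.c_[R, -np.ones(n)]                      # T_supply + (R p)_i <= t
b_ub = -T_supply * np.ones(n)
A_eq = np.r_[np.ones(n), 0.0].reshape(1, -1)      # sum(p) == P_total
b_eq = [P_total]
bounds = [(p_idle, p_max)] * n + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
print("rack powers (kW):", np.round(res.x[:n], 2))
print("hot spot temperature (degC):", round(res.x[-1], 2))
```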
Abstract: In the cloud age, heterogeneous application modes on large-scale infrastructures bring challenges in resource utilization and manageability to data centers. Many resource and runtime management systems have been developed or have evolved to address these challenges and related problems from different perspectives. This paper identifies the main motivations, key concerns, common features, and representative solutions of such systems through a survey and analysis. A typical kind of these systems is generalized as the consolidated cluster system, whose design goal is identified as reducing overall costs under a quality-of-service premise. A survey of this kind of system is given, and the critical issues they address are summarized as resource consolidation and runtime coordination. These two issues are analyzed and classified according to the design styles and external characteristics abstracted from the surveyed work. Five representative consolidated cluster systems from both academia and industry are illustrated and compared in detail based on this analysis and classification. We hope this survey and analysis will be conducive to both the design and implementation of such systems and to technology selection, in response to the constantly emerging challenges of infrastructure and application management in data centers.
Funding: Supported in part by the 2021 Graduate Research and Innovation Program of Jiangsu, China (No. KYCX21_0473) and the China Scholarship Council (CSC) Program (No. 202106710110).
Abstract: This paper proposes a distribution locational marginal pricing (DLMP) based bi-level Stackelberg game framework between the internet service company (ISC) and the distribution system operator (DSO) in a data center park. To minimize electricity costs, the ISC at the upper level dispatches interactive workloads (IWs) across different data center buildings spatially and schedules the battery energy storage system temporally in response to DLMP. Photovoltaic generation and static var generation provide extra active and reactive power. At the lower level, the DSO calculates the DLMP by minimizing the total electricity cost under the two-part tariff policy and ensures that the distribution network is uncongested and bus voltages stay within limits. The equilibrium solution is obtained by converting the bi-level optimization into a single-level mixed-integer second-order cone program using the strong duality theorem and the binary expansion method. Case studies verify that the proposed method benefits both the DSO and the ISC while preserving the privacy of the ISC. By taking the uncertainties in IWs and photovoltaic generation into account, the flexibility of the distribution network is enhanced, which further facilitates the accommodation of more demand-side resources.
Funding: The authors would like to thank the China Scholarship Council (CSC), the Aeronautics Science Foundation (No. 20163252037), and the Fundamental Research Funds for the Central Universities (No. NP2017202) for their support.
Abstract: Data centers play an increasingly important role in everyday life. As data centers become more powerful, their energy consumption also increases dramatically, with the air conditioning system accounting for at least 50 percent of the total. Careful analysis of the air conditioning system can therefore help reduce energy consumption in data centers. In this paper, a finite volume method with the RNG k-ε turbulence model and a convective heat exchange model is used to study the airflow and temperature distribution of a modular data center under different arrangements. Specifically, the convective heat transfer coefficient correlation for flow over a flat plate is adopted to simplify the analysis, and the fans on the back of the racks are simplified as walls with a prescribed pressure jump. The simulations reveal that when the air conditioners are arranged face-to-face, the temperature distribution on the back of the racks is not uniform, and local high-temperature points emerge near the side walls of the air conditioners. Based on the analysis of the airflow and temperature distributions, the geometric model is optimized by using a diagonal rack arrangement and drilling holes in the side wall. At the same energy consumption, the overall maximum temperature of the optimized model is 2.3℃ lower than that of the original one, and the maximum temperature on the server surface is reduced by 1℃. Based on the optimized model, the effect of the hot aisle width on the temperature distribution is studied. By simulating four cases with hot aisle widths of 100 cm, 120 cm, 130 cm, and 150 cm, it is found that the temperature is generally lower and more evenly distributed with a 120 cm hot aisle. This demonstrates that the hot aisle width affects the temperature distribution.
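The abstract mentions a flat-plate convective heat transfer correlation; the snippet below works through the standard laminar correlation Nu = 0.664 Re^(1/2) Pr^(1/3) with assumed air properties and geometry, which may differ from the exact formula used in the paper.

```python
# Convective heat transfer coefficient for laminar air flow over a flat plate,
# using the classic correlation Nu = 0.664 * Re^0.5 * Pr^(1/3).
# Air properties (around 30 degC) and the plate geometry are assumed values.
k_air = 0.026      # thermal conductivity, W/(m*K)
nu_air = 1.6e-5    # kinematic viscosity, m^2/s
Pr = 0.71          # Prandtl number of air

def flat_plate_h(velocity, length):
    """Average convective coefficient h in W/(m^2*K) for a plate of given length."""
    Re = velocity * length / nu_air          # Reynolds number
    Nu = 0.664 * Re**0.5 * Pr**(1.0 / 3.0)   # average Nusselt number (laminar)
    return Nu * k_air / length

# Example: 2 m/s air flow along a 0.6 m rack panel.
print(f"h = {flat_plate_h(2.0, 0.6):.1f} W/(m^2*K)")
```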
Funding: Supported by the National Natural Science Foundation of China (61622301, 61533002), the Beijing Natural Science Foundation (4172005), and the Major National Science and Technology Project (2017ZX07104).
Abstract: In the wastewater treatment process (WWTP), accurate and real-time monitoring of key variables is crucial for operational strategies. However, most existing methods have difficulty obtaining real-time values of some key process variables. To handle this issue, a data-driven intelligent monitoring system, using the soft sensor technique and data distribution service, is developed to monitor the concentrations of effluent total phosphorus (TP) and ammonia nitrogen (NH4-N). In this intelligent monitoring system, a fuzzy neural network (FNN) is applied to design the soft sensor model, and a principal component analysis (PCA) method is used to select the input variables of the soft sensor model. Moreover, data transfer software is used to integrate the soft sensor technique into the supervisory control and data acquisition (SCADA) system. Finally, the proposed intelligent monitoring system is tested in several real plants to demonstrate the reliability and effectiveness of its monitoring performance.
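A minimal sketch of the PCA-based input selection step with scikit-learn: candidate process variables are ranked by their loadings on the leading principal components and the top-ranked ones feed the soft sensor. The synthetic data, the loading-based ranking rule, and the plain MLP standing in for the fuzzy neural network are all illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
# Illustrative process data: 3 latent drivers behind 10 candidate variables.
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.2 * rng.normal(size=(500, 10))
y = latent[:, 0] + 0.1 * rng.normal(size=500)        # e.g. effluent TP

# Keep enough components to explain 90% of the variance, then rank variables
# by their absolute loadings summed over those components.
pca = PCA(n_components=0.90).fit(X)
scores = np.abs(pca.components_).sum(axis=0)
selected = np.argsort(scores)[::-1][:4]              # top-4 candidate inputs
print("selected variable indices:", selected)

# A plain MLP stands in here for the paper's fuzzy neural network.
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X[:, selected], y)
print("training R^2:", round(model.score(X[:, selected], y), 3))
```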
Funding: Supported in part by the National Key Basic Research and Development (973) Program of China (Nos. 2013CB228206 and 2012CB315801) and the National Natural Science Foundation of China (Nos. 61233016 and 61140320), by the Intel Research Council under the title "Security Vulnerability Analysis based on Cloud Platform with Intel IA Architecture", and by Huawei Corp.
Abstract: A data center is an infrastructure that supports Internet services. Cloud computing is rapidly changing the face of the Internet service infrastructure, enabling even small organizations to quickly build Web and mobile applications for millions of users by taking advantage of the scale and flexibility of shared physical infrastructures provided by cloud computing. In this scenario, multiple tenants store their data and applications in shared data centers, blurring the network boundaries between tenants in the cloud. In addition, different tenants have different security requirements, and different security policies are necessary for different tenants. Network virtualization is used to meet this diverse set of tenant-specific requirements on the underlying physical network, enabling multi-tenant data centers to automatically address a large and diverse set of tenant requirements. In this paper, we present the system implementation of vCNSMS, a collaborative network security prototype system for multi-tenant data centers. We demonstrate vCNSMS with a centralized collaborative scheme and deep packet inspection with an open-source UTM system. A security-level based protection policy is proposed to simplify security rule management in vCNSMS. Different security levels have different packet inspection schemes and are enforced with different security plugins. A smart packet verdict scheme is also integrated into vCNSMS for intelligent flow processing to protect against possible network attacks inside the data center network.
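To make the security-level idea concrete, here is a toy sketch in which each level maps to a chain of inspection plugins and a packet verdict is the result of running that chain; the level names, plugin functions, and verdicts are invented for illustration and are not vCNSMS's actual interface.

```python
from enum import IntEnum

class SecurityLevel(IntEnum):   # illustrative levels, not vCNSMS's own
    LOW = 1
    MEDIUM = 2
    HIGH = 3

# Each inspection plugin returns "drop" to reject a packet or None to pass it on.
def check_blacklist(pkt):
    return "drop" if pkt.get("src") in {"10.0.0.66"} else None

def check_rate(pkt):
    return "drop" if pkt.get("pps", 0) > 10_000 else None

def deep_packet_inspection(pkt):
    return "drop" if b"exploit" in pkt.get("payload", b"") else None

# Higher security levels run progressively heavier inspection chains.
POLICY = {
    SecurityLevel.LOW: [check_blacklist],
    SecurityLevel.MEDIUM: [check_blacklist, check_rate],
    SecurityLevel.HIGH: [check_blacklist, check_rate, deep_packet_inspection],
}

def verdict(pkt, level):
    for plugin in POLICY[level]:
        if plugin(pkt) == "drop":
            return "drop"
    return "accept"

pkt = {"src": "10.0.0.5", "pps": 200, "payload": b"GET /index.html"}
print(verdict(pkt, SecurityLevel.HIGH))   # accept
```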
Funding: Supported by the National Natural Science Foundation of China (Nos. 61662057 and 61672143).
Abstract: Given the complex nature of data center thermal management, which consumes considerable resources, processing time, and energy, thermal awareness and thermal management powered by artificial intelligence (AI) are the subject of this study. In addition to research on AI techniques and models, other strategies have also been introduced in recent years. Data center models, including cooling, thermal, power, and workload models, and the relationships among them, are factors that need to be understood for an optimal thermal management system. Simulation approaches have been proposed to help validate new models or methods used for scheduling and consolidating processes and virtual machines (VMs), hotspot identification, thermal state estimation, and changes in power usage. AI-powered thermal optimization leads to improved process scheduling and VM consolidation and prevents hotspots from occurring. At present, research on AI-powered thermal control is still in its infancy. This paper concludes with four open issues in thermal management, which will be the scope of further research.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61772286 and 61802208), the China Postdoctoral Science Foundation (No. 2019M651923), the Natural Science Foundation of Jiangsu Province of China (No. BK20191381), the Primary Research & Development Plan of Jiangsu Province (No. BE2019742), and the Natural Science Fund for Colleges and Universities in Jiangsu Province (No. 18KJB520036).
Abstract: With the deteriorating effects of global warming in many areas, geographically distributed data centers contribute greatly to carbon emissions, because their major energy supply is fossil fuels. Considering this issue, many geographically distributed data centers are attempting to use clean energy as their energy supply, such as fuel cells and renewable energy sources. However, not all workloads can be powered by a single power source, since different workloads exhibit different characteristics. In this paper, we propose a fine-grained heterogeneous power distribution model with the objective of minimizing the total energy cost and the sum of the energy gaps generated by geographically distributed data centers powered by multiple types of energy resources. To achieve these two goals, we design a two-stage online algorithm to leverage the power supply of each energy source. In each time slot, we also consider a chance-constrained problem and use the Bernstein approximation to solve it. Finally, simulation results based on real-world traces illustrate that the proposed algorithm achieves satisfactory performance.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61320106007, 61572129, 61502097, and 61370207), the National High-Tech Research and Development (863) Program of China (No. 2013AA013503), the International S&T Cooperation Program of China (No. 2015DFA10490), the Jiangsu research prospective joint research project (No. BY2013073-01), the Jiangsu Provincial Key Laboratory of Network and Information Security (No. BM2003201), and the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China (No. 93K-9), and also supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization and the Collaborative Innovation Center of Wireless Communications Technology.
Abstract: Recent developments in cloud computing and big data have spurred the emergence of data-intensive applications for which massive scientific datasets are stored in globally distributed scientific data centers and accessed frequently by scientists worldwide. Multiple associated data items distributed across different scientific data centers may be requested for one data processing task, and data placement decisions must respect the storage capacity limits of the scientific data centers. Therefore, optimizing data access cost when placing data items in globally distributed scientific data centers has become an increasingly important goal. Existing data placement approaches for geo-distributed data items are insufficient because they either cannot cope with the cost incurred by associated data access, or they overlook storage capacity limitations, which are a very practical constraint of scientific data centers. In this paper, inspired by applications in the field of high energy physics, we propose an integer-programming-based data placement model that addresses the above challenges as a Non-deterministic Polynomial-time (NP)-hard problem. In addition, we use a Lagrangian relaxation based heuristic algorithm to obtain ideal data placement solutions. Our simulation results demonstrate that our algorithm is effective and significantly reduces the overall data access cost.
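A minimal sketch of the generic Lagrangian relaxation pattern for capacity-constrained placement: the storage-capacity constraints are dualized, the relaxed problem decomposes per data item, and the multipliers follow a subgradient update. The cost model and sizes are toy values; the paper's actual formulation is richer.

```python
import numpy as np

rng = np.random.default_rng(2)
n_items, n_dcs = 30, 4
cost = rng.uniform(1, 10, size=(n_items, n_dcs))   # access cost of item i at DC j
size = rng.uniform(1, 5, size=n_items)             # item sizes
cap = np.full(n_dcs, size.sum() / n_dcs * 1.2)     # storage capacities

lam = np.zeros(n_dcs)                              # multipliers for capacity constraints
best_bound = -np.inf
for it in range(200):
    # The relaxed problem decomposes per item: pick the DC minimizing penalized cost.
    penalized = cost + size[:, None] * lam[None, :]
    assign = penalized.argmin(axis=1)
    load = np.bincount(assign, weights=size, minlength=n_dcs)
    bound = penalized[np.arange(n_items), assign].sum() - lam @ cap  # Lagrangian dual value
    best_bound = max(best_bound, bound)
    # Subgradient step on the capacity violation, projected back to lam >= 0.
    lam = np.maximum(0.0, lam + (1.0 / (it + 1)) * (load - cap))

print("best Lagrangian lower bound on total access cost:", round(best_bound, 2))
print("final capacity violation per DC:", np.round(load - cap, 2))
```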
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018YFB1003502 and the National Natural Science Foundation of China under Grant Nos. 62072195, 61825202, 61832006, and 61628204.
Abstract: With the rapid growth of real-world graphs, whose size can easily exceed the on-chip (board) storage capacity of an accelerator, processing large-scale graphs on a single Field Programmable Gate Array (FPGA) becomes difficult. Multi-FPGA acceleration is therefore of great necessity and importance. Many cloud providers (e.g., Amazon, Microsoft, and Baidu) now expose FPGAs to users in their data centers, providing opportunities to accelerate large-scale graph processing. In this paper, we present a communication library, called FDGLib, which can easily scale out any existing single-FPGA graph accelerator to a distributed version in a data center, with minimal hardware engineering effort. FDGLib provides six APIs that can be easily used and integrated into any FPGA-based graph accelerator with only a few lines of code modification. Considering the torus-based FPGA interconnection in data centers, FDGLib also improves communication efficiency using simple yet effective torus-friendly graph partition and placement schemes. We interface FDGLib with AccuGraph, a state-of-the-art graph accelerator. Our results on a 32-node Microsoft Catapult-like data center show that the distributed AccuGraph can be 2.32x and 4.77x faster than ForeGraph, a state-of-the-art distributed FPGA-based graph accelerator, and Gemini, a distributed CPU-based graph system, with better scalability.
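The six FDGLib APIs are not named in the abstract, so the sketch below only illustrates the torus-friendly placement idea: heavily communicating graph partitions should land on FPGA nodes that are few torus hops apart. The partition graph, the 2x2 torus, and the brute-force search are illustrative assumptions, not FDGLib's scheme.

```python
import itertools

def torus_hops(a, b, dims=(2, 2)):
    """Hop distance between two nodes of a 2-D wrap-around (torus) mesh."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

# Illustrative partition graph: edge weights are cut-edge counts between partitions.
cut = {(0, 1): 90, (1, 2): 70, (2, 3): 80, (0, 3): 20, (1, 3): 10}
nodes = list(itertools.product(range(2), range(2)))   # the 4 nodes of a 2x2 torus

def comm_cost(placement):
    return sum(w * torus_hops(placement[i], placement[j]) for (i, j), w in cut.items())

# The instance is tiny, so brute-force the placement; a real scheme would use a heuristic.
best = min(itertools.permutations(nodes), key=comm_cost)
print("partition -> torus coordinate:", dict(enumerate(best)))
print("weighted communication hops:", comm_cost(best))
```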
Funding: Supported by the National Key Research and Development Program of China (2018YFB2101300), the National Natural Science Foundation of China (61872147), and the Dean's Fund of the Engineering Research Center of Software/Hardware Co-design Technology and Application, Ministry of Education (East China Normal University).
Abstract: To support the needs of ever-growing cloud-based services, the number of servers and network devices in data centers is increasing exponentially, which in turn results in high complexity and difficulty in network optimization. Machine learning (ML) provides an effective way to deal with these challenges by enabling network intelligence. To this end, numerous creative ML-based approaches have been put forward in recent years. Nevertheless, the intelligent optimization of data center networks (DCN) still faces enormous challenges. To the best of our knowledge, there is a lack of systematic and original investigations with in-depth analysis of intelligent DCN. Therefore, in this paper, we investigate the application of ML to DCN optimization and provide a general overview and in-depth analysis of recent works, covering flow prediction, flow classification, and resource management. Moreover, we give unique insights into the technology evolution of the fusion of DCN and ML, together with some challenges and future research opportunities.
Funding: Supported by the National Key R&D Program of China, Industrial Internet Application Demonstration, sub-topic Intelligent Network Operation and Security Protection (2018YFB1802400).
Abstract: The key issues in TCP congestion control are how to compute the link price according to the link status and how to regulate the data sending rate based on congestion pricing feedback. However, it is difficult to predict the congestion state of the link accurately at the source. In this paper, we present an improved NUMFabric algorithm for calculating the overall congestion price. In the proposed scheme, the whole network structure is obtained by the central control server in a Software Defined Network, and a dual-hierarchy algorithm for calculating the overall network congestion price is introduced. The first-hierarchy algorithm runs in a central control server such as OpenDaylight, where the guiding parameter B is derived from global link state information; based on historical data, the congestion state of the network and the guiding parameter B are predicted by a machine learning algorithm. The second-hierarchy algorithm runs in the OpenFlow links, where the link price is calculated based on the guiding parameter B provided by the first algorithm. We evaluate the evolved NUMFabric algorithm in NS3, demonstrating that it can efficiently increase the link bandwidth utilization of cloud computing IoT data centers.
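As background for the pricing idea, here is a generic sketch of a dual (price) update for a single link: the price grows when the offered load exceeds capacity and shrinks otherwise, and sources pick rates in response. The step size, weights, and rate response are illustrative choices, not the paper's exact NUMFabric update.

```python
# Generic congestion-pricing loop for one link: the price acts as a dual
# variable that grows when demand exceeds capacity and shrinks otherwise.
# The step size, capacity, and source utility (log) are illustrative choices.
capacity = 10.0             # link capacity (Gbps)
step = 0.05                 # price update step size
weights = [1.0, 2.0, 3.0]   # per-source weights (rate = w / price maximizes
                            # w*log(rate) - price*rate)

price = 1.0
for _ in range(500):
    rates = [w / price for w in weights]              # sources react to the price
    excess = sum(rates) - capacity
    price = max(1e-6, price + step * excess)          # dual/subgradient update

print(f"price: {price:.3f}, rates: {[round(r, 2) for r in rates]}")
# At equilibrium the rates sum to the capacity and split in proportion to the weights.
```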
Funding: Supported by the National Science Foundation (NSF) of the United States under Grant Nos. CNS 1757533, CNS 1629746, CNS 1564128, CNS 1449860, CNS 1461932, CNS 1460971, IIP 1439672, and CSC 20163100.
Abstract: With the growing popularity of cloud-based data center networks (DCNs), task resource allocation has become increasingly important for the efficient use of resources in DCNs. This paper considers provisioning the maximum admissible load (MAL) of virtual machines (VMs) on physical machines (PMs) with underlying tree-structured DCNs, using the hose model for communication. The limitation of static load distribution is that it assigns tasks to nodes in a once-and-for-all manner and thus requires a priori knowledge of program behavior. To avoid load redistribution at runtime when the load grows, we introduce maximum elasticity scheduling, which has the maximum growth potential subject to the node and link capacities. This paper aims to find the schedule with the maximum elasticity across nodes and links. We first propose a distributed linear solution based on message passing and discuss several properties and extensions of the model. Based on these assumptions and conclusions, we extend it to the multiple-path case with a fat-tree DCN and discuss the optimal solution for computing the MAL under both computation and communication constraints. We then present the provisioning scheme with maximum elasticity for the VMs, which comes with a provable optimality guarantee for a fixed flow scheduling strategy in a fat-tree DCN. We conduct evaluations on our testbed and present simulation results comparing the proposed maximum elastic scheduling schemes with other methods. Extensive simulations validate the effectiveness of the proposed policies, and the results are presented from different perspectives to support the conclusions of our study.