The 6th generation mobile networks(6G)network is a kind of multi-network interconnection and multi-scenario coexistence network,where multiple network domains break the original fixed boundaries to form connections an...The 6th generation mobile networks(6G)network is a kind of multi-network interconnection and multi-scenario coexistence network,where multiple network domains break the original fixed boundaries to form connections and convergence.In this paper,with the optimization objective of maximizing network utility while ensuring flows performance-centric weighted fairness,this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration.Due to the conflict between the utility of different flows,the bandwidth fairness allocation problem for various types of flows is formulated by considering different defined reward functions.Regarding the tradeoff between fairness and utility,this paper deals with the corresponding reward functions for the cases where the flows undergo abrupt changes and smooth changes in the flows.In addition,to accommodate the Quality of Service(QoS)requirements for multiple types of flows,this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG.Introducing a Long Short-Term Memory(LSTM)layer in the actor and critic networks,more information about temporal continuity is added,further enhancing the adaptive ability changes in the dynamic network environment.The LSTM+MADDPG algorithm is compared with the latest reinforcement learning algorithm by conducting experiments on real network topology and traffic traces,and the experimental results show that LSTM+MADDPG improves the delay convergence speed by 14.6%and delays the start moment of packet loss by 18.2%compared with other algorithms.展开更多
Cloud Datacenter Network(CDN)providers usually have the option to scale their network structures to allow for far more resource capacities,though such scaling options may come with exponential costs that contradict th...Cloud Datacenter Network(CDN)providers usually have the option to scale their network structures to allow for far more resource capacities,though such scaling options may come with exponential costs that contradict their utility objectives.Yet,besides the cost of the physical assets and network resources,such scaling may also imposemore loads on the electricity power grids to feed the added nodes with the required energy to run and cool,which comes with extra costs too.Thus,those CDNproviders who utilize their resources better can certainly afford their services at lower price-units when compared to others who simply choose the scaling solutions.Resource utilization is a quite challenging process;indeed,clients of CDNs usually tend to exaggerate their true resource requirements when they lease their resources.Service providers are committed to their clients with Service Level Agreements(SLAs).Therefore,any amendment to the resource allocations needs to be approved by the clients first.In this work,we propose deploying a Stackelberg leadership framework to formulate a negotiation game between the cloud service providers and their client tenants.Through this,the providers seek to retrieve those leased unused resources from their clients.Cooperation is not expected from the clients,and they may ask high price units to return their extra resources to the provider’s premises.Hence,to motivate cooperation in such a non-cooperative game,as an extension to theVickery auctions,we developed an incentive-compatible pricingmodel for the returned resources.Moreover,we also proposed building a behavior belief function that shapes the way of negotiation and compensation for each client.Compared to other benchmark models,the assessment results showthat our proposed models provide for timely negotiation schemes,allowing for better resource utilization rates,higher utilities,and grid-friend CDNs.展开更多
With the continuous expansion of the data center network scale, changing network requirements, and increasing pressure on network bandwidth, the traditional network architecture can no longer meet people’s needs. The...With the continuous expansion of the data center network scale, changing network requirements, and increasing pressure on network bandwidth, the traditional network architecture can no longer meet people’s needs. The development of software defined networks has brought new opportunities and challenges to future networks. The data and control separation characteristics of SDN improve the performance of the entire network. Researchers have integrated SDN architecture into data centers to improve network resource utilization and performance. This paper first introduces the basic concepts of SDN and data center networks. Then it discusses SDN-based load balancing mechanisms for data centers from different perspectives. Finally, it summarizes and looks forward to the study on SDN-based load balancing mechanisms and its development trend.展开更多
As the amount of data continues to grow rapidly,the variety of data produced by applications is becoming more affluent than ever.Cloud computing is the best technology evolving today to provide multi-services for the ...As the amount of data continues to grow rapidly,the variety of data produced by applications is becoming more affluent than ever.Cloud computing is the best technology evolving today to provide multi-services for the mass and variety of data.The cloud computing features are capable of processing,managing,and storing all sorts of data.Although data is stored in many high-end nodes,either in the same data centers or across many data centers in cloud,performance issues are still inevitable.The cloud replication strategy is one of best solutions to address risk of performance degradation in the cloud environment.The real challenge here is developing the right data replication strategy with minimal data movement that guarantees efficient network usage,low fault tolerance,and minimal replication frequency.The key problem discussed in this research is inefficient network usage discovered during selecting a suitable data center to store replica copies induced by inadequate data center selection criteria.Hence,to mitigate the issue,we proposed Replication Strategy with a comprehensive Data Center Selection Method(RS-DCSM),which can determine the appropriate data center to place replicas by considering three key factors:Popularity,space availability,and centrality.The proposed RS-DCSM was simulated using CloudSim and the results proved that data movement between data centers is significantly reduced by 14%reduction in overall replication frequency and 20%decrement in network usage,which outperformed the current replication strategy,known as Dynamic Popularity aware Replication Strategy(DPRS)algorithm.展开更多
According to Cisco’s Internet Report 2020 white paper,there will be 29.3 billion connected devices worldwide by 2023,up from 18.4 billion in 2018.5G connections will generate nearly three times more traffic than 4G c...According to Cisco’s Internet Report 2020 white paper,there will be 29.3 billion connected devices worldwide by 2023,up from 18.4 billion in 2018.5G connections will generate nearly three times more traffic than 4G connections.While bringing a boom to the network,it also presents unprecedented challenges in terms of flow forwarding decisions.The path assignment mechanism used in traditional traffic schedulingmethods tends to cause local network congestion caused by the concentration of elephant flows,resulting in unbalanced network load and degraded quality of service.Using the centralized control of software-defined networks,this study proposes a data center traffic scheduling strategy for minimization congestion and quality of service guaranteeing(MCQG).The ideal transmission path is selected for data flows while considering the network congestion rate and quality of service.Different traffic scheduling strategies are used according to the characteristics of different service types in data centers.Reroute scheduling for elephant flows that tend to cause local congestion.The path evaluation function is formed by the maximum link utilization on the path,the number of elephant flows and the time delay,and the fast merit-seeking capability of the sparrow search algorithm is used to find the path with the lowest actual link overhead as the rerouting path for the elephant flows.It is used to reduce the possibility of local network congestion occurrence.Equal cost multi-path(ECMP)protocols with faster response time are used to schedulemouse flows with shorter duration.Used to guarantee the quality of service of the network.To achieve isolated transmission of various types of data streams.The experimental results show that the proposed strategy has higher throughput,better network load balancing,and better robustness compared to ECMP under different traffic models.In addition,because it can fully utilize the resources in the network,MCQG also outperforms another traffic scheduling strategy that does rerouting for elephant flows(namely Hedera).Compared withECMPandHedera,MCQGimproves average throughput by 11.73%and 4.29%,and normalized total throughput by 6.74%and 2.64%,respectively;MCQG improves link utilization by 23.25%and 15.07%;in addition,the average round-trip delay and packet loss rate fluctuate significantly less than the two compared strategies.展开更多
Data centers are being distributed worldwide by cloud service providers(CSPs)to save energy costs through efficient workload alloca-tion strategies.Many CSPs are challenged by the significant rise in user demands due ...Data centers are being distributed worldwide by cloud service providers(CSPs)to save energy costs through efficient workload alloca-tion strategies.Many CSPs are challenged by the significant rise in user demands due to their extensive energy consumption during workload pro-cessing.Numerous research studies have examined distinct operating cost mitigation techniques for geo-distributed data centers(DCs).However,oper-ating cost savings during workload processing,which also considers string-matching techniques in geo-distributed DCs,remains unexplored.In this research,we propose a novel string matching-based geographical load balanc-ing(SMGLB)technique to mitigate the operating cost of the geo-distributed DC.The primary goal of this study is to use a string-matching algorithm(i.e.,Boyer Moore)to compare the contents of incoming workloads to those of documents that have already been processed in a data center.A successful match prevents the global load balancer from sending the user’s request to a data center for processing and displaying the results of the previously processed workload to the user to save energy.On the contrary,if no match can be discovered,the global load balancer will allocate the incoming workload to a specific DC for processing considering variable energy prices,the number of active servers,on-site green energy,and traces of incoming workload.The results of numerical evaluations show that the SMGLB can minimize the operating expenses of the geo-distributed data centers more than the existing workload distribution techniques.展开更多
In the Ethernet lossless Data Center Networks (DCNs) deployedwith Priority-based Flow Control (PFC), the head-of-line blocking problemis still difficult to prevent due to PFC triggering under burst trafficscenarios ev...In the Ethernet lossless Data Center Networks (DCNs) deployedwith Priority-based Flow Control (PFC), the head-of-line blocking problemis still difficult to prevent due to PFC triggering under burst trafficscenarios even with the existing congestion control solutions. To addressthe head-of-line blocking problem of PFC, we propose a new congestioncontrol mechanism. The key point of Congestion Control Using In-NetworkTelemetry for Lossless Datacenters (ICC) is to use In-Network Telemetry(INT) technology to obtain comprehensive congestion information, which isthen fed back to the sender to adjust the sending rate timely and accurately.It is possible to control congestion in time, converge to the target rate quickly,and maintain a near-zero queue length at the switch when using ICC. Weconducted Network Simulator-3 (NS-3) simulation experiments to test theICC’s performance. When compared to Congestion Control for Large-ScaleRDMA Deployments (DCQCN), TIMELY: RTT-based Congestion Controlfor the Datacenter (TIMELY), and Re-architecting Congestion Managementin Lossless Ethernet (PCN), ICC effectively reduces PFC pause messages andFlow Completion Time (FCT) by 47%, 56%, 34%, and 15.3×, 14.8×, and11.2×, respectively.展开更多
As a critical infrastructure of cloud computing,data center networks(DCNs)directly determine the service performance of data centers,which provide computing services for various applications such as big data processin...As a critical infrastructure of cloud computing,data center networks(DCNs)directly determine the service performance of data centers,which provide computing services for various applications such as big data processing and artificial intelligence.However,current architectures of data center networks suffer from a long routing path and a low fault tolerance between source and destination servers,which is hard to satisfy the requirements of high-performance data center networks.Based on dual-port servers and Clos network structure,this paper proposed a novel architecture RClos to construct high-performance data center networks.Logically,the proposed architecture is constructed by inserting a dual-port server into each pair of adjacent switches in the fabric of switches,where switches are connected in the form of a ring Clos structure.We describe the structural properties of RClos in terms of network scale,bisection bandwidth,and network diameter.RClos architecture inherits characteristics of its embedded Clos network,which can accommodate a large number of servers with a small average path length.The proposed architecture embraces a high fault tolerance,which adapts to the construction of various data center networks.For example,the average path length between servers is 3.44,and the standardized bisection bandwidth is 0.8 in RClos(32,5).The result of numerical experiments shows that RClos enjoys a small average path length and a high network fault tolerance,which is essential in the construction of high-performance data center networks.展开更多
Data center networks may comprise tens or hundreds of thousands of nodes,and,naturally,suffer from frequent software and hardware failures as well as link congestions.Packets are routed along the shortest paths with s...Data center networks may comprise tens or hundreds of thousands of nodes,and,naturally,suffer from frequent software and hardware failures as well as link congestions.Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays.In such dynamic networks,links frequently fail or get congested,making the recalculation of the shortest paths a computationally intensive problem.Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed.Surprisingly,the design of fast shortest-path algorithms for data centers was largely neglected,though they are universal components of routing protocols.Moreover,parallelization techniques were mostly deployed for random network topologies,and not for regular topologies that are often found in data centers.The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware.We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.展开更多
Globally,digital technology and the digital economy have propelled technological revolution and industrial change,and it has become one of the main grounds of international industrial competition.It was estimated that...Globally,digital technology and the digital economy have propelled technological revolution and industrial change,and it has become one of the main grounds of international industrial competition.It was estimated that the scale of China’s digital economy would reach 50 trillion yuan in 2022,accounting for more than 40%of GDP,presenting great market potential and room for the growth of the digital economy.With the rapid development of the digital economy,the state attaches great importance to the construction of digital infrastructure and has introduced a series of policies to promote the systematic development and large-scale deployment of digital infrastructure.In 2022 the Chinese government planned to build 8 arithmetic hubs and 10 national data center clusters nationwide.To proactively address the future demand for AI across various scenarios,there is a need for a well-structured computing power infrastructure.The data center,serving as the pivotal hub for computing power,has evolved from the conventional cloud center to a more intelligent computing center,allowing for a diversified convergence of computing power supply.Besides,the data center accommodates a diverse array of arithmetic business forms from customers,reflecting the multi-industry developmental trend.The arithmetic service platform is consistently broadening its scope,with ongoing optimization and innovation in the design scheme of machine room processes.The widespread application of submerged phase-change liquid cooling technology and cold plate cooling technology introduces a series of new challenges to the construction of digital infrastructure.This paper delves into the design objectives,industry considerations,layout,and other dimensions of a smart computing center and proposes a new-generation data center solution that is“flexible,resilient,green,and low-carbon.”展开更多
How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data cente...How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data centers. The winner tree is introduced to make the data nodes as the leaf nodes of the tree and the final winner on the purpose of reducing energy consumption is selected. The complexity of large-scale cloud data centers is fully consider, and the task comparson coefficient is defined to make task scheduling strategy more reasonable. Experiments and performance analysis show that the proposed algorithm can effectively improve the node utilization, and reduce the overall power consumption of the cloud data center.展开更多
In recent years,dual-homed topologies have appeared in data centers in order to offer higher aggregate bandwidth by using multiple paths simultaneously.Multipath TCP(MPTCP) has been proposed as a replacement for TCP i...In recent years,dual-homed topologies have appeared in data centers in order to offer higher aggregate bandwidth by using multiple paths simultaneously.Multipath TCP(MPTCP) has been proposed as a replacement for TCP in those topologies as it can efficiently offer improved throughput and better fairness.However,we have found that MPTCP has a problem in terms of incast collapse where the receiver suffers a drastic goodput drop when it simultaneously requests data over multiple servers.In this paper,we investigate why the goodput collapses even if MPTCP is able to actively relieve hot spots.In order to address the problem,we propose an equally-weighted congestion control algorithm for MPTCP,namely EW-MPTCP,without need for centralized control,additional infrastructure and a hardware upgrade.In our scheme,in addition to the coupled congestion control performed on each subflow of an MPTCP connection,we allow each subflow to perform an additional congestion control operation by weighting the congestion window in reverse proportion to the number of servers.The goal is to mitigate incast collapse by allowing multiple MPTCP subflows to compete fairly with a single-TCP flow at the shared bottleneck.The simulation results show that our solution mitigates the incast problem and noticeably improves goodput in data centers.展开更多
With the rapid development of technologies such as big data and cloud computing,data communication and data computing in the form of exponential growth have led to a large amount of energy consumption in data centers....With the rapid development of technologies such as big data and cloud computing,data communication and data computing in the form of exponential growth have led to a large amount of energy consumption in data centers.Globally,data centers will become the world’s largest users of energy consumption,with the ratio rising from 3%in 2017 to 4.5%in 2025.Due to its unique climate and energy-saving advantages,the high-latitude area in the Pan-Arctic region has gradually become a hotspot for data center site selection in recent years.In order to predict and analyze the future energy consumption and carbon emissions of global data centers,this paper presents a new method based on global data center traffic and power usage effectiveness(PUE)for energy consumption prediction.Firstly,global data center traffic growth is predicted based on the Cisco’s research.Secondly,the dynamic global average PUE and the high latitude PUE based on Romonet simulation model are obtained,and then global data center energy consumption with two different scenarios,the decentralized scenario and the centralized scenario,is analyzed quantitatively via the polynomial fitting method.The simulation results show that,in 2030,the global data center energy consumption and carbon emissions are reduced by about 301 billion kWh and 720 million tons CO2 in the centralized scenario compared with that of the decentralized scenario,which confirms that the establishment of data centers in the Pan-Arctic region in the future can effectively relief the climate change and energy problems.This study provides support for global energy consumption prediction,and guidance for the layout of future global data centers from the perspective of energy consumption.Moreover,it provides support of the feasibility of the integration of energy and information networks under the Global Energy Interconnection conception.展开更多
With the emerging diverse applications in data centers,the demands on quality of service in data centers also become diverse,such as high throughput of elephant flows and low latency of deadline-sensitive flows.Howeve...With the emerging diverse applications in data centers,the demands on quality of service in data centers also become diverse,such as high throughput of elephant flows and low latency of deadline-sensitive flows.However,traditional TCPs are ill-suited to such situations and always result in the inefficiency(e.g.missing the flow deadline,inevitable throughput collapse)of data transfers.This further degrades the user-perceived quality of service(QoS)in data centers.To reduce the flow completion time of mice and deadline-sensitive flows along with promoting the throughput of elephant flows,an efficient and deadline-aware priority-driven congestion control(PCC)protocol,which grants mice and deadline-sensitive flows the highest priority,is proposed in this paper.Specifically,PCC computes the priority of different flows according to the size of transmitted data,the remaining data volume,and the flows’deadline.Then PCC adjusts the congestion window according to the flow priority and the degree of network congestion.Furthermore,switches in data centers control the input/output of packets based on the flow priority and the queue length.Different from existing TCPs,to speed up the data transfers of mice and deadline-sensitive flows,PCC provides an effective method to compute and encode the flow priority explicitly.According to the flow priority,switches can manage packets efficiently and ensure the data transfers of high priority flows through a weighted priority scheduling with minor modification.The experimental results prove that PCC can improve the data transfer performance of mice and deadline-sensitive flows while guaranting the throughput of elephant flows.展开更多
Virtual data center is a new form of cloud computing concept applied to data center. As one of the most important challenges, virtual data center embedding problem has attracted much attention from researchers. In dat...Virtual data center is a new form of cloud computing concept applied to data center. As one of the most important challenges, virtual data center embedding problem has attracted much attention from researchers. In data centers, energy issue is very important for the reality that data center energy consumption has increased by dozens of times in the last decade. In this paper, we are concerned about the cost-aware multi-domain virtual data center embedding problem. In order to solve this problem, this paper first addresses the energy consumption model. The model includes the energy consumption model of the virtual machine node and the virtual switch node, to quantify the energy consumption in the virtual data center embedding process. Based on the energy consumption model above, this paper presents a heuristic algorithm for cost-aware multi-domain virtual data center embedding. The algorithm consists of two steps: inter-domain embedding and intra-domain embedding. Inter-domain virtual data center embedding refers to dividing virtual data center requests into several slices to select the appropriate single data center. Intra-domain virtual data center refers to embedding virtual data center requests in each data center. We first propose an inter-domain virtual data center embedding algorithm based on label propagation to select the appropriate single data center. We then propose a cost-aware virtual data center embedding algorithm to perform the intra-domain data center embedding. Extensive simulation results show that our proposed algorithm in this paper can effectively reduce the energy consumption while ensuring the success ratio of embedding.展开更多
Virtualization is a common technology for resource sharing in data center. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as virtual data center, VDC) to the ph...Virtualization is a common technology for resource sharing in data center. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as virtual data center, VDC) to the physical data center effectively. In this paper, we focus on this problem. Distinct with previous works, our study of VDC embedding problem is under the assumption that switch resource is the bottleneck of data center networks (DCNs). To this end, we not only propose relative cost to evaluate embedding strategy, decouple embedding problem into VM placement with marginal resource assignment and virtual link mapping with decided source-destination based on the property of fat-tree, but also design the traffic aware embedding algorithm (TAE) and first fit virtual link mapping (FFLM) to map virtual data center requests to a physical data center. Simulation results show that TAE+FFLM could increase acceptance rate and reduce network cost (about 49% in the case) at the same time. The traffie aware embedding algorithm reduces the load of core-link traffic and brings the optimization opportunity for data center network energy conservation.展开更多
We consider differentiated timecritical task scheduling in a N×N input queued optical packet s w itch to ens ure 100% throughput and meet different delay requirements among various modules of data center. Existin...We consider differentiated timecritical task scheduling in a N×N input queued optical packet s w itch to ens ure 100% throughput and meet different delay requirements among various modules of data center. Existing schemes either consider slot-by-slot scheduling with queue depth serving as the delay metric or assume that each input-output connection has the same delay bound in the batch scheduling mode. The former scheme neglects the effect of reconfiguration overhead, which may result in crippled system performance, while the latter cannot satisfy users' differentiated Quality of Service(Qo S) requirements. To make up these deficiencies, we propose a new batch scheduling scheme to meet the various portto-port delay requirements in a best-effort manner. Moreover, a speedup is considered to compensate for both the reconfiguration overhead and the unavoidable slots wastage in the switch fabric. With traffic matrix and delay constraint matrix given, this paper proposes two heuristic algorithms Stringent Delay First(SDF) and m-order SDF(m-SDF) to realize the 100% packet switching, while maximizing the delay constraints satisfaction ratio. The performance of our scheme is verified by extensive numerical simulations.展开更多
The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which...The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service.Failure prediction is an important means of ensuring service availability.Predicting node failure in cloud-based data centers is challenging because the failure symptoms reflected have complex characteristics,and the distribution imbalance between the failure sample and the normal sample is widespread,resulting in inaccurate failure prediction.Targeting these challenges,this paper proposes a novel failure prediction method FP-STE(Failure Prediction based on Spatio-temporal Feature Extraction).Firstly,an improved recurrent neural network HW-GRU(Improved GRU based on HighWay network)and a convolutional neural network CNN are used to extract the temporal features and spatial features of multivariate data respectively to increase the discrimination of different types of failure symptoms which improves the accuracy of prediction.Then the intermediate results of the two models are added as features into SCSXGBoost to predict the possibility and the precise type of node failure in the future.SCS-XGBoost is an ensemble learning model that is improved by the integrated strategy of oversampling and cost-sensitive learning.Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.展开更多
Resource Scheduling is crucial to data centers. However, most previous works focus only on one-dimensional resource models which ignoring the fact that multiple resources simultaneously utilized, including CPU, memory...Resource Scheduling is crucial to data centers. However, most previous works focus only on one-dimensional resource models which ignoring the fact that multiple resources simultaneously utilized, including CPU, memory and network bandwidth. As cloud computing allows uncoordinated and heterogeneous users to share a data center, competition for multiple resources has become increasingly severe. Motivated by the differences on integrated utilization obtained from different packing schemes, in this paper we take the scheduling problem as a multi-dimensional combinatorial optimization problem with constraint satisfaction. With NP hardness, we present Multiple attribute decision based Integrated Resource Scheduling (MIRS), and a novel heuristic algorithm to gain the approximate optimal solution. Refers to simulation results, in face of various workload sets, our algorithm has significant superiorities in terms of efficiency and performance compared with previous methods.展开更多
1 Introduction The history of data centers can be traced back to the 1960s. Early data centers were deployed on main- frames that were time-shared by users via remote terminals. The boom in data centers came duringthe...1 Introduction The history of data centers can be traced back to the 1960s. Early data centers were deployed on main- frames that were time-shared by users via remote terminals. The boom in data centers came duringthe internet era. Many companies started building large inter- net-connected facililies,展开更多
文摘The 6th generation mobile networks(6G)network is a kind of multi-network interconnection and multi-scenario coexistence network,where multiple network domains break the original fixed boundaries to form connections and convergence.In this paper,with the optimization objective of maximizing network utility while ensuring flows performance-centric weighted fairness,this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration.Due to the conflict between the utility of different flows,the bandwidth fairness allocation problem for various types of flows is formulated by considering different defined reward functions.Regarding the tradeoff between fairness and utility,this paper deals with the corresponding reward functions for the cases where the flows undergo abrupt changes and smooth changes in the flows.In addition,to accommodate the Quality of Service(QoS)requirements for multiple types of flows,this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG.Introducing a Long Short-Term Memory(LSTM)layer in the actor and critic networks,more information about temporal continuity is added,further enhancing the adaptive ability changes in the dynamic network environment.The LSTM+MADDPG algorithm is compared with the latest reinforcement learning algorithm by conducting experiments on real network topology and traffic traces,and the experimental results show that LSTM+MADDPG improves the delay convergence speed by 14.6%and delays the start moment of packet loss by 18.2%compared with other algorithms.
基金The Deanship of Scientific Research at Hashemite University partially funds this workDeanship of Scientific Research at the Northern Border University,Arar,KSA for funding this research work through the project number“NBU-FFR-2024-1580-08”.
文摘Cloud Datacenter Network(CDN)providers usually have the option to scale their network structures to allow for far more resource capacities,though such scaling options may come with exponential costs that contradict their utility objectives.Yet,besides the cost of the physical assets and network resources,such scaling may also imposemore loads on the electricity power grids to feed the added nodes with the required energy to run and cool,which comes with extra costs too.Thus,those CDNproviders who utilize their resources better can certainly afford their services at lower price-units when compared to others who simply choose the scaling solutions.Resource utilization is a quite challenging process;indeed,clients of CDNs usually tend to exaggerate their true resource requirements when they lease their resources.Service providers are committed to their clients with Service Level Agreements(SLAs).Therefore,any amendment to the resource allocations needs to be approved by the clients first.In this work,we propose deploying a Stackelberg leadership framework to formulate a negotiation game between the cloud service providers and their client tenants.Through this,the providers seek to retrieve those leased unused resources from their clients.Cooperation is not expected from the clients,and they may ask high price units to return their extra resources to the provider’s premises.Hence,to motivate cooperation in such a non-cooperative game,as an extension to theVickery auctions,we developed an incentive-compatible pricingmodel for the returned resources.Moreover,we also proposed building a behavior belief function that shapes the way of negotiation and compensation for each client.Compared to other benchmark models,the assessment results showthat our proposed models provide for timely negotiation schemes,allowing for better resource utilization rates,higher utilities,and grid-friend CDNs.
文摘With the continuous expansion of the data center network scale, changing network requirements, and increasing pressure on network bandwidth, the traditional network architecture can no longer meet people’s needs. The development of software defined networks has brought new opportunities and challenges to future networks. The data and control separation characteristics of SDN improve the performance of the entire network. Researchers have integrated SDN architecture into data centers to improve network resource utilization and performance. This paper first introduces the basic concepts of SDN and data center networks. Then it discusses SDN-based load balancing mechanisms for data centers from different perspectives. Finally, it summarizes and looks forward to the study on SDN-based load balancing mechanisms and its development trend.
基金supported by Universiti Putra Malaysia and the Ministry of Education(MOE).
文摘As the amount of data continues to grow rapidly,the variety of data produced by applications is becoming more affluent than ever.Cloud computing is the best technology evolving today to provide multi-services for the mass and variety of data.The cloud computing features are capable of processing,managing,and storing all sorts of data.Although data is stored in many high-end nodes,either in the same data centers or across many data centers in cloud,performance issues are still inevitable.The cloud replication strategy is one of best solutions to address risk of performance degradation in the cloud environment.The real challenge here is developing the right data replication strategy with minimal data movement that guarantees efficient network usage,low fault tolerance,and minimal replication frequency.The key problem discussed in this research is inefficient network usage discovered during selecting a suitable data center to store replica copies induced by inadequate data center selection criteria.Hence,to mitigate the issue,we proposed Replication Strategy with a comprehensive Data Center Selection Method(RS-DCSM),which can determine the appropriate data center to place replicas by considering three key factors:Popularity,space availability,and centrality.The proposed RS-DCSM was simulated using CloudSim and the results proved that data movement between data centers is significantly reduced by 14%reduction in overall replication frequency and 20%decrement in network usage,which outperformed the current replication strategy,known as Dynamic Popularity aware Replication Strategy(DPRS)algorithm.
基金This work is funded by the National Natural Science Foundation of China under Grant No.61772180the Key R&D plan of Hubei Province(2020BHB004,2020BAB012).
文摘According to Cisco’s Internet Report 2020 white paper,there will be 29.3 billion connected devices worldwide by 2023,up from 18.4 billion in 2018.5G connections will generate nearly three times more traffic than 4G connections.While bringing a boom to the network,it also presents unprecedented challenges in terms of flow forwarding decisions.The path assignment mechanism used in traditional traffic schedulingmethods tends to cause local network congestion caused by the concentration of elephant flows,resulting in unbalanced network load and degraded quality of service.Using the centralized control of software-defined networks,this study proposes a data center traffic scheduling strategy for minimization congestion and quality of service guaranteeing(MCQG).The ideal transmission path is selected for data flows while considering the network congestion rate and quality of service.Different traffic scheduling strategies are used according to the characteristics of different service types in data centers.Reroute scheduling for elephant flows that tend to cause local congestion.The path evaluation function is formed by the maximum link utilization on the path,the number of elephant flows and the time delay,and the fast merit-seeking capability of the sparrow search algorithm is used to find the path with the lowest actual link overhead as the rerouting path for the elephant flows.It is used to reduce the possibility of local network congestion occurrence.Equal cost multi-path(ECMP)protocols with faster response time are used to schedulemouse flows with shorter duration.Used to guarantee the quality of service of the network.To achieve isolated transmission of various types of data streams.The experimental results show that the proposed strategy has higher throughput,better network load balancing,and better robustness compared to ECMP under different traffic models.In addition,because it can fully utilize the resources in the network,MCQG also outperforms another traffic scheduling strategy that does rerouting for elephant flows(namely Hedera).Compared withECMPandHedera,MCQGimproves average throughput by 11.73%and 4.29%,and normalized total throughput by 6.74%and 2.64%,respectively;MCQG improves link utilization by 23.25%and 15.07%;in addition,the average round-trip delay and packet loss rate fluctuate significantly less than the two compared strategies.
文摘Data centers are being distributed worldwide by cloud service providers(CSPs)to save energy costs through efficient workload alloca-tion strategies.Many CSPs are challenged by the significant rise in user demands due to their extensive energy consumption during workload pro-cessing.Numerous research studies have examined distinct operating cost mitigation techniques for geo-distributed data centers(DCs).However,oper-ating cost savings during workload processing,which also considers string-matching techniques in geo-distributed DCs,remains unexplored.In this research,we propose a novel string matching-based geographical load balanc-ing(SMGLB)technique to mitigate the operating cost of the geo-distributed DC.The primary goal of this study is to use a string-matching algorithm(i.e.,Boyer Moore)to compare the contents of incoming workloads to those of documents that have already been processed in a data center.A successful match prevents the global load balancer from sending the user’s request to a data center for processing and displaying the results of the previously processed workload to the user to save energy.On the contrary,if no match can be discovered,the global load balancer will allocate the incoming workload to a specific DC for processing considering variable energy prices,the number of active servers,on-site green energy,and traces of incoming workload.The results of numerical evaluations show that the SMGLB can minimize the operating expenses of the geo-distributed data centers more than the existing workload distribution techniques.
基金supported by the National Natural Science Foundation of China (No.62102046,62072249,62072056)JinWang,YongjunRen,and Jinbin Hu receive the grant,and the URLs to the sponsors’websites are https://www.nsfc.gov.cn/.This work is also funded by the National Science Foundation of Hunan Province (No.2022JJ30618,2020JJ2029).
文摘In the Ethernet lossless Data Center Networks (DCNs) deployedwith Priority-based Flow Control (PFC), the head-of-line blocking problemis still difficult to prevent due to PFC triggering under burst trafficscenarios even with the existing congestion control solutions. To addressthe head-of-line blocking problem of PFC, we propose a new congestioncontrol mechanism. The key point of Congestion Control Using In-NetworkTelemetry for Lossless Datacenters (ICC) is to use In-Network Telemetry(INT) technology to obtain comprehensive congestion information, which isthen fed back to the sender to adjust the sending rate timely and accurately.It is possible to control congestion in time, converge to the target rate quickly,and maintain a near-zero queue length at the switch when using ICC. Weconducted Network Simulator-3 (NS-3) simulation experiments to test theICC’s performance. When compared to Congestion Control for Large-ScaleRDMA Deployments (DCQCN), TIMELY: RTT-based Congestion Controlfor the Datacenter (TIMELY), and Re-architecting Congestion Managementin Lossless Ethernet (PCN), ICC effectively reduces PFC pause messages andFlow Completion Time (FCT) by 47%, 56%, 34%, and 15.3×, 14.8×, and11.2×, respectively.
基金This work was supported by the Hainan Provincial Natural Science Foundation of China(620RC560,2019RC096,620RC562)the Scientific Research Setup Fund of Hainan University(KYQD(ZR)1877)+2 种基金the National Natural Science Foundation of China(62162021,82160345,61802092)the key research and development program of Hainan province(ZDYF2020199,ZDYF2021GXJS017)the key science and technology plan project of Haikou(2011-016).
文摘As a critical infrastructure of cloud computing,data center networks(DCNs)directly determine the service performance of data centers,which provide computing services for various applications such as big data processing and artificial intelligence.However,current architectures of data center networks suffer from a long routing path and a low fault tolerance between source and destination servers,which is hard to satisfy the requirements of high-performance data center networks.Based on dual-port servers and Clos network structure,this paper proposed a novel architecture RClos to construct high-performance data center networks.Logically,the proposed architecture is constructed by inserting a dual-port server into each pair of adjacent switches in the fabric of switches,where switches are connected in the form of a ring Clos structure.We describe the structural properties of RClos in terms of network scale,bisection bandwidth,and network diameter.RClos architecture inherits characteristics of its embedded Clos network,which can accommodate a large number of servers with a small average path length.The proposed architecture embraces a high fault tolerance,which adapts to the construction of various data center networks.For example,the average path length between servers is 3.44,and the standardized bisection bandwidth is 0.8 in RClos(32,5).The result of numerical experiments shows that RClos enjoys a small average path length and a high network fault tolerance,which is essential in the construction of high-performance data center networks.
基金This work was supported by the Serbian Ministry of Science and Education(project TR-32022)by companies Telekom Srbija and Informatika.
文摘Data center networks may comprise tens or hundreds of thousands of nodes,and,naturally,suffer from frequent software and hardware failures as well as link congestions.Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays.In such dynamic networks,links frequently fail or get congested,making the recalculation of the shortest paths a computationally intensive problem.Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed.Surprisingly,the design of fast shortest-path algorithms for data centers was largely neglected,though they are universal components of routing protocols.Moreover,parallelization techniques were mostly deployed for random network topologies,and not for regular topologies that are often found in data centers.The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware.We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
文摘Globally,digital technology and the digital economy have propelled technological revolution and industrial change,and it has become one of the main grounds of international industrial competition.It was estimated that the scale of China’s digital economy would reach 50 trillion yuan in 2022,accounting for more than 40%of GDP,presenting great market potential and room for the growth of the digital economy.With the rapid development of the digital economy,the state attaches great importance to the construction of digital infrastructure and has introduced a series of policies to promote the systematic development and large-scale deployment of digital infrastructure.In 2022 the Chinese government planned to build 8 arithmetic hubs and 10 national data center clusters nationwide.To proactively address the future demand for AI across various scenarios,there is a need for a well-structured computing power infrastructure.The data center,serving as the pivotal hub for computing power,has evolved from the conventional cloud center to a more intelligent computing center,allowing for a diversified convergence of computing power supply.Besides,the data center accommodates a diverse array of arithmetic business forms from customers,reflecting the multi-industry developmental trend.The arithmetic service platform is consistently broadening its scope,with ongoing optimization and innovation in the design scheme of machine room processes.The widespread application of submerged phase-change liquid cooling technology and cold plate cooling technology introduces a series of new challenges to the construction of digital infrastructure.This paper delves into the design objectives,industry considerations,layout,and other dimensions of a smart computing center and proposes a new-generation data center solution that is“flexible,resilient,green,and low-carbon.”
基金supported by the National Natural Science Foundation of China(6120200461272084)+9 种基金the National Key Basic Research Program of China(973 Program)(2011CB302903)the Specialized Research Fund for the Doctoral Program of Higher Education(2009322312000120113223110003)the China Postdoctoral Science Foundation Funded Project(2011M5000952012T50514)the Natural Science Foundation of Jiangsu Province(BK2011754BK2009426)the Jiangsu Postdoctoral Science Foundation Funded Project(1102103C)the Natural Science Fund of Higher Education of Jiangsu Province(12KJB520007)the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions(yx002001)
文摘How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data centers. The winner tree is introduced to make the data nodes as the leaf nodes of the tree and the final winner on the purpose of reducing energy consumption is selected. The complexity of large-scale cloud data centers is fully consider, and the task comparson coefficient is defined to make task scheduling strategy more reasonable. Experiments and performance analysis show that the proposed algorithm can effectively improve the node utilization, and reduce the overall power consumption of the cloud data center.
基金supported in part by the HUT Distributed and Mobile Cloud Systems research project and Tekes within the ITEA2 project 10014 EASI-CLOUDS
文摘In recent years,dual-homed topologies have appeared in data centers in order to offer higher aggregate bandwidth by using multiple paths simultaneously.Multipath TCP(MPTCP) has been proposed as a replacement for TCP in those topologies as it can efficiently offer improved throughput and better fairness.However,we have found that MPTCP has a problem in terms of incast collapse where the receiver suffers a drastic goodput drop when it simultaneously requests data over multiple servers.In this paper,we investigate why the goodput collapses even if MPTCP is able to actively relieve hot spots.In order to address the problem,we propose an equally-weighted congestion control algorithm for MPTCP,namely EW-MPTCP,without need for centralized control,additional infrastructure and a hardware upgrade.In our scheme,in addition to the coupled congestion control performed on each subflow of an MPTCP connection,we allow each subflow to perform an additional congestion control operation by weighting the congestion window in reverse proportion to the number of servers.The goal is to mitigate incast collapse by allowing multiple MPTCP subflows to compete fairly with a single-TCP flow at the shared bottleneck.The simulation results show that our solution mitigates the incast problem and noticeably improves goodput in data centers.
基金supported by National Natural Science Foundation of China(61472042)Corporation Science and Technology Program of Global Energy Interconnection Group Ltd.(GEIGC-D-[2018]024)
文摘With the rapid development of technologies such as big data and cloud computing,data communication and data computing in the form of exponential growth have led to a large amount of energy consumption in data centers.Globally,data centers will become the world’s largest users of energy consumption,with the ratio rising from 3%in 2017 to 4.5%in 2025.Due to its unique climate and energy-saving advantages,the high-latitude area in the Pan-Arctic region has gradually become a hotspot for data center site selection in recent years.In order to predict and analyze the future energy consumption and carbon emissions of global data centers,this paper presents a new method based on global data center traffic and power usage effectiveness(PUE)for energy consumption prediction.Firstly,global data center traffic growth is predicted based on the Cisco’s research.Secondly,the dynamic global average PUE and the high latitude PUE based on Romonet simulation model are obtained,and then global data center energy consumption with two different scenarios,the decentralized scenario and the centralized scenario,is analyzed quantitatively via the polynomial fitting method.The simulation results show that,in 2030,the global data center energy consumption and carbon emissions are reduced by about 301 billion kWh and 720 million tons CO2 in the centralized scenario compared with that of the decentralized scenario,which confirms that the establishment of data centers in the Pan-Arctic region in the future can effectively relief the climate change and energy problems.This study provides support for global energy consumption prediction,and guidance for the layout of future global data centers from the perspective of energy consumption.Moreover,it provides support of the feasibility of the integration of energy and information networks under the Global Energy Interconnection conception.
基金supported part by the National Natural Science Foundation of China(61601252,61801254)Public Technology Projects of Zhejiang Province(LG-G18F020007)+1 种基金Zhejiang Provincial Natural Science Foundation of China(LY20F020008,LY18F020011,LY20F010004)K.C.Wong Magna Fund in Ningbo University。
文摘With the emerging diverse applications in data centers,the demands on quality of service in data centers also become diverse,such as high throughput of elephant flows and low latency of deadline-sensitive flows.However,traditional TCPs are ill-suited to such situations and always result in the inefficiency(e.g.missing the flow deadline,inevitable throughput collapse)of data transfers.This further degrades the user-perceived quality of service(QoS)in data centers.To reduce the flow completion time of mice and deadline-sensitive flows along with promoting the throughput of elephant flows,an efficient and deadline-aware priority-driven congestion control(PCC)protocol,which grants mice and deadline-sensitive flows the highest priority,is proposed in this paper.Specifically,PCC computes the priority of different flows according to the size of transmitted data,the remaining data volume,and the flows’deadline.Then PCC adjusts the congestion window according to the flow priority and the degree of network congestion.Furthermore,switches in data centers control the input/output of packets based on the flow priority and the queue length.Different from existing TCPs,to speed up the data transfers of mice and deadline-sensitive flows,PCC provides an effective method to compute and encode the flow priority explicitly.According to the flow priority,switches can manage packets efficiently and ensure the data transfers of high priority flows through a weighted priority scheduling with minor modification.The experimental results prove that PCC can improve the data transfer performance of mice and deadline-sensitive flows while guaranting the throughput of elephant flows.
基金supported in part by the following funding agencies of China:National Natural Science Foundation under Grant 61602050 and U1534201National Key Research and Development Program of China under Grant 2016QY01W0200
文摘Virtual data center is a new form of cloud computing concept applied to data center. As one of the most important challenges, virtual data center embedding problem has attracted much attention from researchers. In data centers, energy issue is very important for the reality that data center energy consumption has increased by dozens of times in the last decade. In this paper, we are concerned about the cost-aware multi-domain virtual data center embedding problem. In order to solve this problem, this paper first addresses the energy consumption model. The model includes the energy consumption model of the virtual machine node and the virtual switch node, to quantify the energy consumption in the virtual data center embedding process. Based on the energy consumption model above, this paper presents a heuristic algorithm for cost-aware multi-domain virtual data center embedding. The algorithm consists of two steps: inter-domain embedding and intra-domain embedding. Inter-domain virtual data center embedding refers to dividing virtual data center requests into several slices to select the appropriate single data center. Intra-domain virtual data center refers to embedding virtual data center requests in each data center. We first propose an inter-domain virtual data center embedding algorithm based on label propagation to select the appropriate single data center. We then propose a cost-aware virtual data center embedding algorithm to perform the intra-domain data center embedding. Extensive simulation results show that our proposed algorithm in this paper can effectively reduce the energy consumption while ensuring the success ratio of embedding.
基金This research was partially supported by the National Grand Fundamental Research 973 Program of China under Grant (No. 2013CB329103), Natural Science Foundation of China grant (No. 61271171), the Fundamental Research Funds for the Central Universities (ZYGX2013J002, ZYGX2012J004, ZYGX2010J002, ZYGX2010J009), Guangdong Science and Technology Project (2012B090500003, 2012B091000163, 2012556031).
文摘Virtualization is a common technology for resource sharing in data center. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as virtual data center, VDC) to the physical data center effectively. In this paper, we focus on this problem. Distinct with previous works, our study of VDC embedding problem is under the assumption that switch resource is the bottleneck of data center networks (DCNs). To this end, we not only propose relative cost to evaluate embedding strategy, decouple embedding problem into VM placement with marginal resource assignment and virtual link mapping with decided source-destination based on the property of fat-tree, but also design the traffic aware embedding algorithm (TAE) and first fit virtual link mapping (FFLM) to map virtual data center requests to a physical data center. Simulation results show that TAE+FFLM could increase acceptance rate and reduce network cost (about 49% in the case) at the same time. The traffie aware embedding algorithm reduces the load of core-link traffic and brings the optimization opportunity for data center network energy conservation.
基金supported by the Major State Basic Research Program of China (973 project No. 2013CB329301 and 2010CB327806)the Natural Science Fund of China (NSFC project No. 61372085, 61032003, 61271165 and 61202379)+1 种基金the Research Fund for the Doctoral Program of Higher Education of China (RFDP project No. 20120185110025, 20120185110030 and 20120032120041)supported by Tianjin Key Laboratory of Cognitive Computing and Application, School of Computer Science and Technology, Tianjin University, Tianjin, P. R. China
文摘We consider differentiated timecritical task scheduling in a N×N input queued optical packet s w itch to ens ure 100% throughput and meet different delay requirements among various modules of data center. Existing schemes either consider slot-by-slot scheduling with queue depth serving as the delay metric or assume that each input-output connection has the same delay bound in the batch scheduling mode. The former scheme neglects the effect of reconfiguration overhead, which may result in crippled system performance, while the latter cannot satisfy users' differentiated Quality of Service(Qo S) requirements. To make up these deficiencies, we propose a new batch scheduling scheme to meet the various portto-port delay requirements in a best-effort manner. Moreover, a speedup is considered to compensate for both the reconfiguration overhead and the unavoidable slots wastage in the switch fabric. With traffic matrix and delay constraint matrix given, this paper proposes two heuristic algorithms Stringent Delay First(SDF) and m-order SDF(m-SDF) to realize the 100% packet switching, while maximizing the delay constraints satisfaction ratio. The performance of our scheme is verified by extensive numerical simulations.
基金supported in part by National Key Research and Development Program of China(2019YFB2103200)NSFC(61672108),Open Subject Funds of Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory(SKX182010049)+1 种基金the Fundamental Research Funds for the Central Universities(5004193192019PTB-019)the Industrial Internet Innovation and Development Project 2018 of China.
文摘The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service.Failure prediction is an important means of ensuring service availability.Predicting node failure in cloud-based data centers is challenging because the failure symptoms reflected have complex characteristics,and the distribution imbalance between the failure sample and the normal sample is widespread,resulting in inaccurate failure prediction.Targeting these challenges,this paper proposes a novel failure prediction method FP-STE(Failure Prediction based on Spatio-temporal Feature Extraction).Firstly,an improved recurrent neural network HW-GRU(Improved GRU based on HighWay network)and a convolutional neural network CNN are used to extract the temporal features and spatial features of multivariate data respectively to increase the discrimination of different types of failure symptoms which improves the accuracy of prediction.Then the intermediate results of the two models are added as features into SCSXGBoost to predict the possibility and the precise type of node failure in the future.SCS-XGBoost is an ensemble learning model that is improved by the integrated strategy of oversampling and cost-sensitive learning.Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.
基金supported in part by National Key Basic Research Program of China (973 program) under Grant No.2011CB302506Important National Science & Technology Specific Projects: Next-Generation Broadband Wireless Mobile Communications Network under Grant No.2011ZX03002-001-01Innovative Research Groups of the National Natural Science Foundation of China under Grant No.60821001
文摘Resource Scheduling is crucial to data centers. However, most previous works focus only on one-dimensional resource models which ignoring the fact that multiple resources simultaneously utilized, including CPU, memory and network bandwidth. As cloud computing allows uncoordinated and heterogeneous users to share a data center, competition for multiple resources has become increasingly severe. Motivated by the differences on integrated utilization obtained from different packing schemes, in this paper we take the scheduling problem as a multi-dimensional combinatorial optimization problem with constraint satisfaction. With NP hardness, we present Multiple attribute decision based Integrated Resource Scheduling (MIRS), and a novel heuristic algorithm to gain the approximate optimal solution. Refers to simulation results, in face of various workload sets, our algorithm has significant superiorities in terms of efficiency and performance compared with previous methods.
基金supported by the ZTE-BJTU Collaborative Research Program under Grant No. K11L00190the Fundamental Research Funds for the Central Universities under Grant No. K12JB00060
文摘1 Introduction The history of data centers can be traced back to the 1960s. Early data centers were deployed on main- frames that were time-shared by users via remote terminals. The boom in data centers came duringthe internet era. Many companies started building large inter- net-connected facililies,