The sixth-generation (6G) mobile network is a multi-network, multi-scenario network in which multiple network domains break their original fixed boundaries to interconnect and converge. With the optimization objective of maximizing network utility while ensuring performance-centric weighted fairness among flows, this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration. Because the utilities of different flows conflict, the fair bandwidth allocation problem for various types of flows is formulated using differently defined reward functions. Regarding the tradeoff between fairness and utility, this paper handles the corresponding reward functions for the cases where flows change abruptly and where they change smoothly. In addition, to accommodate the Quality of Service (QoS) requirements of multiple types of flows, this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG. Introducing a Long Short-Term Memory (LSTM) layer into the actor and critic networks adds information about temporal continuity, further enhancing the ability to adapt to changes in the dynamic network environment. LSTM+MADDPG is compared with the latest reinforcement learning algorithm in experiments on a real network topology and traffic traces; the results show that LSTM+MADDPG improves the delay convergence speed by 14.6% and delays the onset of packet loss by 18.2% compared with other algorithms.
Cloud Datacenter Network (CDN) providers usually have the option to scale their network structures to provide far more resource capacity, though such scaling may come with exponential costs that contradict their utility objectives. Besides the cost of the physical assets and network resources, such scaling may also impose more load on the electricity grid to power and cool the added nodes, which brings extra costs too. Thus, CDN providers who utilize their resources better can offer their services at lower unit prices than those who simply choose to scale. Resource utilization is quite challenging; indeed, CDN clients usually tend to exaggerate their true resource requirements when they lease resources. Service providers are committed to their clients through Service Level Agreements (SLAs); therefore, any amendment to the resource allocations must first be approved by the clients. In this work, we propose a Stackelberg leadership framework to formulate a negotiation game between cloud service providers and their client tenants. Through this game, the providers seek to retrieve leased but unused resources from their clients. Cooperation is not expected from the clients, who may ask high unit prices to return their extra resources to the provider. Hence, to motivate cooperation in such a non-cooperative game, we developed an incentive-compatible pricing model for the returned resources as an extension of Vickrey auctions. Moreover, we propose a behavior belief function that shapes the negotiation and compensation for each client. Compared with other benchmark models, the assessment results show that our proposed models provide timely negotiation schemes, allowing for better resource utilization rates, higher utilities, and grid-friendly CDNs.
In a data center network (DCN), load balancing is required when servers transfer data on the same path; this is necessary to avoid congestion. Load balancing is challenged by dynamically changing transfer demands and complex routing control. Because of the distributed nature of a traditional network, previous research on load balancing has mostly focused on improving the performance of the local network; thus, the load has not been optimally balanced across the entire network. In this paper, we propose a novel dynamic load-balancing algorithm for fat-tree. This algorithm avoids congestion to the greatest possible extent by searching for non-conflicting paths in a centralized way. We implement the algorithm in the popular software-defined networking architecture and evaluate its performance on the Mininet platform. The results show that our algorithm achieves higher bisection bandwidth than the traditional equal-cost multi-path load-balancing algorithm and thus avoids congestion more effectively.
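To make the contrast in the abstract above concrete, the following minimal sketch compares hash-based equal-cost multi-path (ECMP) selection with a centralized greedy assignment that prefers unused paths. It is an illustration only, not the paper's algorithm: the function names, the five-tuple layout, the path labels, and the use of CRC32 as the hash are all assumptions.

```python
import zlib

def ecmp_pick_path(flow, paths):
    """Hash the flow's 5-tuple onto one of the equal-cost paths (illustrative)."""
    key = "|".join(str(field) for field in flow).encode()
    return paths[zlib.crc32(key) % len(paths)]

def assign_least_loaded(flows, paths):
    """Centralized greedy stand-in: place each flow on the currently least-loaded path."""
    load = {p: 0 for p in paths}
    assignment = {}
    for flow in flows:
        best = min(paths, key=lambda p: load[p])  # ties go to the first path listed
        assignment[flow] = best
        load[best] += 1
    return assignment

# Hypothetical core paths and flows (src, dst, protocol, sport, dport)
paths = ["core-1", "core-2", "core-3", "core-4"]
flow_a = ("10.0.0.1", "10.0.1.1", 6, 40000, 80)
flow_b = ("10.0.0.2", "10.0.1.1", 6, 40001, 80)

# ECMP is deterministic per flow, but two large flows can hash to the same path
print(ecmp_pick_path(flow_a, paths), ecmp_pick_path(flow_b, paths))

# With global knowledge, the two flows are placed on different paths
central = assign_least_loaded([flow_a, flow_b], paths)
print(central[flow_a], central[flow_b])
```

The design point is that ECMP decides locally and statelessly, while a controller with a global view can avoid placing two flows on the same path when a free one exists.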
With the continuous expansion of data center network scale, changing network requirements, and increasing pressure on network bandwidth, the traditional network architecture can no longer meet people's needs. The development of software-defined networking (SDN) has brought new opportunities and challenges to future networks. The separation of the data and control planes in SDN improves the performance of the entire network. Researchers have integrated the SDN architecture into data centers to improve network resource utilization and performance. This paper first introduces the basic concepts of SDN and data center networks. It then discusses SDN-based load balancing mechanisms for data centers from different perspectives. Finally, it summarizes research on SDN-based load balancing mechanisms and looks ahead to its development trends.
Data center networks may comprise tens or hundreds of thousands of nodes and naturally suffer from frequent software and hardware failures as well as link congestion. Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays. In such dynamic networks, links frequently fail or become congested, making the recalculation of shortest paths a computationally intensive problem. Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed. Surprisingly, the design of fast shortest-path algorithms for data centers has been largely neglected, though they are universal components of routing protocols. Moreover, parallelization techniques have mostly been deployed for random network topologies, not for the regular topologies often found in data centers. The aim of this paper is to improve scalability and reduce the time required for shortest-path calculation in data center networks through parallelization on general-purpose hardware. We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
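As background for the phrase "parallelizes edge relaxations", the sketch below shows a round-based Bellman-Ford variant in which every relaxation in a round reads from a snapshot of the previous distances, so the relaxations within a round do not depend on each other. This is a generic textbook-style illustration written sequentially, not the paper's algorithm; the function name and the sample graph are assumptions.

```python
import math

def shortest_paths_round_based(num_nodes, edges, source):
    """Bellman-Ford variant whose per-round relaxations are mutually independent."""
    dist = [math.inf] * num_nodes
    dist[source] = 0
    for _ in range(num_nodes - 1):
        snapshot = dist[:]        # read-only distances from the previous round
        changed = False
        # Each relaxation reads only the snapshot, so the loop body could be
        # distributed across workers; concurrent writes to dist[v] would then
        # need an atomic minimum.
        for u, v, w in edges:
            candidate = snapshot[u] + w
            if candidate < dist[v]:
                dist[v] = candidate
                changed = True
        if not changed:           # converged early
            break
    return dist

# Hypothetical 4-node topology; each undirected link is listed in both directions
edges = [
    (0, 1, 1), (1, 0, 1),
    (1, 2, 1), (2, 1, 1),
    (0, 3, 1), (3, 0, 1),
    (3, 2, 5), (2, 3, 5),
]
print(shortest_paths_round_based(4, edges, source=0))  # [0, 1, 2, 1]
```

After k rounds, each distance is no larger than the best path using at most k edges, which is why at most `num_nodes - 1` rounds are needed.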
In lossless Ethernet Data Center Networks (DCNs) deployed with Priority-based Flow Control (PFC), head-of-line blocking remains difficult to prevent because PFC is triggered under burst traffic scenarios even with existing congestion control solutions. To address the head-of-line blocking problem of PFC, we propose a new congestion control mechanism, Congestion Control Using In-Network Telemetry for Lossless Datacenters (ICC). The key idea of ICC is to use In-Network Telemetry (INT) to obtain comprehensive congestion information, which is then fed back to the sender to adjust the sending rate in a timely and accurate manner. With ICC, it is possible to control congestion in time, converge quickly to the target rate, and maintain a near-zero queue length at the switch. We conducted Network Simulator-3 (NS-3) simulation experiments to test ICC's performance. Compared with Congestion Control for Large-Scale RDMA Deployments (DCQCN), TIMELY: RTT-based Congestion Control for the Datacenter (TIMELY), and Re-architecting Congestion Management in Lossless Ethernet (PCN), ICC reduces PFC pause messages by 47%, 56%, and 34%, and Flow Completion Time (FCT) by 15.3×, 14.8×, and 11.2×, respectively.
As a critical infrastructure of cloud computing, data center networks (DCNs) directly determine the service performance of data centers, which provide computing services for applications such as big data processing and artificial intelligence. However, current data center network architectures suffer from long routing paths and low fault tolerance between source and destination servers, making it hard to satisfy the requirements of high-performance data center networks. Based on dual-port servers and the Clos network structure, this paper proposes a novel architecture, RClos, for constructing high-performance data center networks. Logically, the proposed architecture is constructed by inserting a dual-port server between each pair of adjacent switches in the switch fabric, where the switches are connected in a ring Clos structure. We describe the structural properties of RClos in terms of network scale, bisection bandwidth, and network diameter. RClos inherits the characteristics of its embedded Clos network and can accommodate a large number of servers with a small average path length. The proposed architecture also offers high fault tolerance and is suitable for constructing various data center networks. For example, in RClos(32,5), the average path length between servers is 3.44 and the standardized bisection bandwidth is 0.8. Numerical experiments show that RClos has a small average path length and high network fault tolerance, both of which are essential in the construction of high-performance data center networks.
According to Cisco's Internet Report 2020 white paper, there will be 29.3 billion connected devices worldwide by 2023, up from 18.4 billion in 2018, and 5G connections will generate nearly three times more traffic than 4G connections. While this growth brings a boom to the network, it also presents unprecedented challenges for flow forwarding decisions. The path assignment mechanism used in traditional traffic scheduling methods tends to cause local network congestion through the concentration of elephant flows, resulting in unbalanced network load and degraded quality of service. Using the centralized control of software-defined networks, this study proposes a data center traffic scheduling strategy for minimizing congestion and guaranteeing quality of service (MCQG). The ideal transmission path is selected for data flows while considering the network congestion rate and quality of service. Different traffic scheduling strategies are used according to the characteristics of the different service types in data centers. Elephant flows, which tend to cause local congestion, are rerouted: a path evaluation function is formed from the maximum link utilization on the path, the number of elephant flows, and the time delay, and the fast optimum-seeking capability of the sparrow search algorithm is used to find the path with the lowest actual link overhead as the rerouting path, reducing the likelihood of local network congestion. Mouse flows, which have shorter durations, are scheduled with equal-cost multi-path (ECMP) protocols, which respond faster, to guarantee the network's quality of service. In this way, the various types of data streams are transmitted in isolation. The experimental results show that the proposed strategy achieves higher throughput, better network load balancing, and better robustness than ECMP under different traffic models. In addition, because it can fully utilize the resources in the network, MCQG also outperforms Hedera, another traffic scheduling strategy that reroutes elephant flows. Compared with ECMP and Hedera, MCQG improves average throughput by 11.73% and 4.29% and normalized total throughput by 6.74% and 2.64%, respectively; it improves link utilization by 23.25% and 15.07%; and its average round-trip delay and packet loss rate fluctuate significantly less than those of the two compared strategies.
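The abstract above names the three inputs of the path evaluation function (maximum link utilization, number of elephant flows, and delay) but not its exact form. The sketch below assumes a normalized weighted sum purely for illustration; the weights, the normalization caps, and the candidate paths are hypothetical, and an exhaustive minimum over a small candidate set stands in for the sparrow search algorithm used in the paper.

```python
def path_cost(link_utilizations, elephant_flows, delay_ms,
              w_util=0.5, w_flows=0.3, w_delay=0.2,
              max_flows=10, max_delay_ms=10.0):
    """Assumed weighted-sum cost; lower is better. All weights are hypothetical."""
    util_term = max(link_utilizations)                 # bottleneck link on the path
    flow_term = min(elephant_flows / max_flows, 1.0)   # normalized elephant-flow count
    delay_term = min(delay_ms / max_delay_ms, 1.0)     # normalized path delay
    return w_util * util_term + w_flows * flow_term + w_delay * delay_term

# Hypothetical candidates: (per-link utilizations, elephant flows on path, delay in ms)
candidates = {
    "path_a": ([0.9, 0.4], 3, 2.0),
    "path_b": ([0.5, 0.6], 1, 3.0),
}

# Exhaustive search over candidates, standing in for the sparrow search step
best = min(candidates, key=lambda name: path_cost(*candidates[name]))
print(best)  # path_b
```

Here `path_a` is penalized mainly by its 0.9 bottleneck utilization, so the slightly slower but less loaded `path_b` is chosen for rerouting.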
With the emergence of diverse applications in data centers, the demands on quality of service have also become diverse, such as high throughput for elephant flows and low latency for deadline-sensitive flows. However, traditional TCP variants are ill-suited to such situations and always result in inefficient data transfers (e.g., missed flow deadlines and inevitable throughput collapse). This further degrades the user-perceived quality of service (QoS) in data centers. To reduce the flow completion time of mice and deadline-sensitive flows while promoting the throughput of elephant flows, this paper proposes an efficient, deadline-aware priority-driven congestion control (PCC) protocol, which grants mice and deadline-sensitive flows the highest priority. Specifically, PCC computes the priority of each flow according to the size of the transmitted data, the remaining data volume, and the flow's deadline. PCC then adjusts the congestion window according to the flow priority and the degree of network congestion. Furthermore, switches in data centers control the input and output of packets based on the flow priority and the queue length. Unlike existing TCP variants, PCC provides an effective method to compute and encode the flow priority explicitly in order to speed up the data transfers of mice and deadline-sensitive flows. Using the flow priority, switches can manage packets efficiently and ensure the data transfers of high-priority flows through weighted priority scheduling with minor modification. The experimental results demonstrate that PCC improves the data transfer performance of mice and deadline-sensitive flows while guaranteeing the throughput of elephant flows.
New and emerging use cases, such as the interconnection of geographically distributed data centers (DCs), are drawing attention to the requirement for dynamic end-to-end service provisioning spanning multiple heterogeneous optical network domains. This heterogeneity is due not only to the diverse data transmission and switching technologies but also to the different options for control plane techniques. In light of this, the problem of heterogeneous control plane interworking needs to be solved; in particular, the solution must address the specific issues of multi-domain networks, such as limited domain topology visibility, given the scalability and confidentiality constraints. In this article, some recent activities on Software-Defined Networking (SDN) orchestration are reviewed to address this multi-domain control plane interworking problem. Specifically, three models (a single SDN controller, multiple SDN controllers in a mesh, and multiple SDN controllers in a hierarchical setting) are presented for the DC interconnection network with multiple SDN/OpenFlow domains or multiple OpenFlow/Generalized Multi-Protocol Label Switching (GMPLS) heterogeneous domains. In addition, two concrete implementations of the orchestration architectures are detailed, showing the overall feasibility and procedures of SDN orchestration for end-to-end service provisioning in multi-domain data center optical networks.
1 Introduction. The history of data centers can be traced back to the 1960s. Early data centers were deployed on mainframes that were time-shared by users via remote terminals. The boom in data centers came during the internet era. Many companies started building large internet-connected facilities,
Incremental updating, which better keeps navigational maps up to date in real time, is the direction in which navigational road network updating is developing. The data center of a vehicle navigation system is in charge of storing incremental data, and the spatio-temporal data model used to store these data affects how efficiently the data center responds to incremental data requests from vehicle terminals. Based on an analysis of the shortcomings of several typical spatio-temporal data models used in the data center, and building on the base map with overlay model, the reverse map with overlay model (RMOM) is put forward so that the data center can respond rapidly to incremental data requests. RMOM lets the data center store not only the current complete road network data but also the overlays of incremental data from the time each road network change occurred to the current moment. Moreover, the storage mechanism and index structure of the incremental data were designed, and the implementation algorithm of RMOM was developed. Taking the navigational road network of Guangzhou City as an example, a simulation test was conducted to validate the efficiency of RMOM. The results show that, with RMOM, the navigation database in the data center can respond to an incremental data request with only one query and in less time. Compared with the base map with overlay model, the data center does not need to overlay incremental data on the fly with RMOM, so the response time is significantly reduced. RMOM greatly improves response efficiency and provides strong support for keeping the navigational road network up to date in real time.
High application latency brings revenue loss to cloud infrastructure providers in cloud data centers. Existing software-defined networking controllers can fetch and process only traffic information in the network; therefore, they can optimize only the network latency of applications. However, the serving latency of applications is also an important factor in the user experience delivered for arriving requests. Unintelligent request routing causes large serving latency if arriving requests are allocated to overloaded virtual machines (VMs). To deal with the request routing problem, this paper proposes a workload-aware software-defined networking controller architecture. Request routing algorithms are then proposed to minimize the total round-trip time for every type of request by considering both congestion in the network and the workload of the VMs. Finally, the paper evaluates the proposed algorithms in a simulated prototype. The simulation results show that the proposed methodology is efficient compared with existing approaches.
To address the high operating costs and substantial energy waste of current data center network architectures, we propose a trusted flow preemption scheduling scheme combined with an energy-saving routing mechanism, based on a typical data center network architecture. The mechanism lets each network flow use its own exclusive link bandwidth and transmission path, which improves link utilization and network energy efficiency. Meanwhile, we apply trusted computing to guarantee a highly secure, high-performance, and highly fault-tolerant routing and forwarding service, which helps improve the average completion time of network flows.
In data centers, transmission control protocol (TCP) incast causes catastrophic goodput degradation for applications with a many-to-one traffic pattern. In this paper, we aim to tame incast at the receiver-side application. To this end, we first develop an analytical model that formulates the incast probability as a function of connection variables and network environment settings. We combine the model with optimization theory and derive insights into minimizing the incast probability by tuning application-related connection variables. Then, guided by the analytical results, we propose an adaptive application-layer solution to TCP incast. The solution allocates advertised windows equally to concurrent connections and dynamically adapts the number of concurrent connections to varying conditions. Simulation results show that our solution consistently avoids incast and achieves high goodput in various scenarios, including ones with multiple bottleneck links and background TCP traffic.
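The idea of splitting an advertised-window budget equally across concurrent connections, and capping how many connections run at once, can be sketched as follows. The abstract does not give the actual adaptation rule, so the policy here (every active connection gets the same whole number of segments, and concurrency is capped so each gets at least one) is an assumption, as are the function name, the 64 KiB budget, and the 1460-byte segment size.

```python
def allocate_windows(total_window_bytes, num_connections, mss=1460):
    """Return (active_connections, per-connection advertised windows in bytes).

    Assumed policy: cap concurrency so that every active connection can be
    advertised at least one full segment, then split the budget equally.
    """
    if num_connections <= 0:
        return 0, []
    max_concurrent = max(1, total_window_bytes // mss)
    active = min(num_connections, max_concurrent)
    segments_each = total_window_bytes // (active * mss)  # whole segments per connection
    return active, [segments_each * mss] * active

# Hypothetical 64 KiB budget shared by 8 senders: 5 segments (7300 bytes) each
active, windows = allocate_windows(64 * 1024, 8)
print(active, windows[0])  # 8 7300

# With 100 senders, only 44 run concurrently, each with a one-segment window
active_many, windows_many = allocate_windows(64 * 1024, 100)
print(active_many, windows_many[0])  # 44 1460
```

Bounding the sum of advertised windows by a fixed budget limits how much data can arrive at the bottleneck in one burst, which is the intuition behind receiver-side incast control.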
Many "rich-connected" topologies with multiple parallel paths between servers have recently been proposed for data center networks to provide high bisection bandwidth, but it remains challenging to fully utilize the high network capacity with appropriate multi-path routing algorithms. As flow-level path splitting may lead to traffic imbalance between paths due to differences in flow size, packet-level path splitting, which spreads the packets of a flow across multiple available paths and significantly improves link utilization, has attracted more attention lately. However, it may cause packet reordering, confusing the TCP congestion control algorithm and lowering flow throughput. In this paper, we design a novel packet-level multi-path routing scheme called SOPA, which leverages OpenFlow to perform packet-level path splitting in a round-robin fashion and hence significantly mitigates the packet reordering problem and improves network throughput. Moreover, SOPA leverages the topological features of data center networks to encode a very small number of switches along the path into the packet header, resulting in very light overhead. Our simulations demonstrate that, compared with random packet spraying (RPS), Hedera, and equal-cost multi-path routing (ECMP), SOPA achieves 29.87%, 50.41%, and 77.74% higher network throughput, respectively, under a permutation workload, and reduces average data transfer completion time by 53.65%, 343.31%, and 348.25%, respectively, under a production workload.
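Round-robin packet-level path splitting, as described in the abstract above, can be illustrated with a few lines of Python. This is only a sketch of the general idea: it does not model OpenFlow or SOPA's header encoding, and the function name, packet identifiers, and path labels are hypothetical.

```python
from collections import Counter
from itertools import cycle

def spray_round_robin(packets, paths):
    """Assign consecutive packets of one flow to the paths in strict rotation."""
    path_cycle = cycle(paths)
    return [(pkt, next(path_cycle)) for pkt in packets]

# Hypothetical flow of six packets over three equal-cost paths
assignment = spray_round_robin(range(6), ["p0", "p1", "p2"])
per_path = Counter(path for _, path in assignment)
print(assignment)
print(per_path)  # every path carries exactly two packets
```

Because the rotation is deterministic, each path carries an equal share of the flow, whereas random spraying balances only on average; more even queue build-up across paths is what limits reordering at the receiver.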
Virtualization is a common technology for resource sharing in data centers. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as virtual data centers, VDCs) onto the physical data center effectively. In this paper, we focus on this problem. Unlike previous work, our study of the VDC embedding problem assumes that switch resources are the bottleneck of data center networks (DCNs). To this end, we propose relative cost as a metric for evaluating embedding strategies and, based on the properties of fat-tree, decouple the embedding problem into VM placement with marginal resource assignment and virtual link mapping with a decided source and destination. We also design the traffic-aware embedding algorithm (TAE) and first-fit virtual link mapping (FFLM) to map virtual data center requests onto a physical data center. Simulation results show that TAE+FFLM can increase the acceptance rate and reduce network cost (by about 49% in the case studied) at the same time. The traffic-aware embedding algorithm reduces core-link traffic load and creates optimization opportunities for data center network energy conservation.
The primary focus of this paper is to design a progressive restoration plan for an enterprise data center environment following a partial or full disruption. Repairing and restoring disrupted components in an enterprise data center requires a significant amount of time and human effort. Following a major disruption, the recovery process involves multiple stages, and during each stage the partially recovered infrastructure can provide limited services to users at some degraded service level. However, how quickly and efficiently an enterprise infrastructure can be recovered depends on how the recovery mechanism restores the disrupted components, considering the inter-dependencies between services along with the limitations of expert human operators. The entire problem turns out to be NP-hard and rather complex, and we devise an efficient meta-heuristic to solve it. Using some real-world examples, we show that the proposed meta-heuristic provides very accurate results and still runs 600-2800 times faster than the optimal solution obtained from a general-purpose mathematical solver [1].
Funding (CDN resource negotiation paper): The Deanship of Scientific Research at Hashemite University partially funds this work. The work is also funded by the Deanship of Scientific Research at Northern Border University, Arar, KSA, through project number "NBU-FFR-2024-1580-08".
Funding (fat-tree load-balancing paper): Supported by the National Basic Research Program of China (973 Program) (2012CB315903); the Key Science and Technology Innovation Team Project of Zhejiang Province (2011R50010-05); the National Science and Technology Support Program (2014BAH24F01); the 863 Program of China (2012AA01A507); and the National Natural Science Foundation of China (61379118 and 61103200). Sponsored by the Research Fund of ZTE Corporation.
Abstract: With the continuous expansion of data center network scale, changing network requirements, and increasing pressure on network bandwidth, the traditional network architecture can no longer meet people’s needs. The development of software-defined networking (SDN) has brought new opportunities and challenges to future networks. The separation of the data and control planes in SDN improves the performance of the entire network. Researchers have integrated the SDN architecture into data centers to improve network resource utilization and performance. This paper first introduces the basic concepts of SDN and data center networks. Then it discusses SDN-based load balancing mechanisms for data centers from different perspectives. Finally, it summarizes the research on SDN-based load balancing mechanisms and looks ahead to its development trends.
Funding: This work was supported by the Serbian Ministry of Science and Education (project TR-32022) and by the companies Telekom Srbija and Informatika.
Abstract: Data center networks may comprise tens or hundreds of thousands of nodes and, naturally, suffer from frequent software and hardware failures as well as link congestion. Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays. In such dynamic networks, links frequently fail or get congested, making the recalculation of the shortest paths a computationally intensive problem. Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed. Surprisingly, the design of fast shortest-path algorithms for data centers was largely neglected, though they are universal components of routing protocols. Moreover, parallelization techniques were mostly deployed for random network topologies, and not for regular topologies that are often found in data centers. The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware. We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
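To make "parallelizing edge relaxations" concrete, here is a minimal sketch of round-based relaxation in the Bellman-Ford style. It is a generic textbook formulation, not the algorithm proposed in the paper. Within a round, every edge reads only the distances from the previous round, so the relaxations are independent of one another and could be spread over threads or SIMD lanes. The function name `relax_rounds` and the edge-list format are assumptions.

```python
import math


def relax_rounds(num_nodes, edges, source):
    """Single-source shortest paths by rounds of independent edge relaxations.

    edges: list of (u, v, weight) directed edges with non-negative weights
           (hypothetical input format).
    Each round reads the old `dist` and writes minima into `new_dist`, so the
    per-edge work inside a round has no ordering dependency; the loop here
    is sequential only for simplicity.
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0.0
    for _ in range(num_nodes - 1):
        new_dist = list(dist)
        for u, v, weight in edges:
            candidate = dist[u] + weight
            if candidate < new_dist[v]:
                new_dist[v] = candidate
        if new_dist == dist:  # no edge improved anything: converged early
            break
        dist = new_dist
    return dist


if __name__ == "__main__":
    demo_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 5.0), (2, 3, 1.0)]
    print(relax_rounds(4, demo_edges, source=0))
```

In regular topologies such as fat-trees the hop diameter is small, so the number of rounds before convergence stays low; that is one reason round-based relaxation can suit data center graphs.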
Funding: Supported by the National Natural Science Foundation of China (No. 62102046, 62072249, 62072056). Jin Wang, Yongjun Ren, and Jinbin Hu received the grant; the sponsor’s website is https://www.nsfc.gov.cn/. This work is also funded by the National Science Foundation of Hunan Province (No. 2022JJ30618, 2020JJ2029).
Abstract: In lossless Ethernet Data Center Networks (DCNs) deployed with Priority-based Flow Control (PFC), the head-of-line blocking problem is still difficult to prevent because PFC is triggered under burst traffic scenarios, even with existing congestion control solutions. To address the head-of-line blocking problem of PFC, we propose a new congestion control mechanism. The key point of Congestion Control Using In-Network Telemetry for Lossless Datacenters (ICC) is to use In-Network Telemetry (INT) technology to obtain comprehensive congestion information, which is then fed back to the sender to adjust the sending rate in a timely and accurate manner. With ICC, it is possible to control congestion in time, converge to the target rate quickly, and maintain a near-zero queue length at the switch. We conducted Network Simulator-3 (NS-3) simulation experiments to test ICC’s performance. When compared to Congestion Control for Large-Scale RDMA Deployments (DCQCN), TIMELY: RTT-based Congestion Control for the Datacenter (TIMELY), and Re-architecting Congestion Management in Lossless Ethernet (PCN), ICC effectively reduces PFC pause messages and Flow Completion Time (FCT) by 47%, 56%, 34%, and 15.3×, 14.8×, and 11.2×, respectively.
Funding: This work was supported by the Hainan Provincial Natural Science Foundation of China (620RC560, 2019RC096, 620RC562), the Scientific Research Setup Fund of Hainan University (KYQD(ZR)1877), the National Natural Science Foundation of China (62162021, 82160345, 61802092), the Key Research and Development Program of Hainan Province (ZDYF2020199, ZDYF2021GXJS017), and the Key Science and Technology Plan Project of Haikou (2011-016).
Abstract: As a critical infrastructure of cloud computing, data center networks (DCNs) directly determine the service performance of data centers, which provide computing services for various applications such as big data processing and artificial intelligence. However, current data center network architectures suffer from long routing paths and low fault tolerance between source and destination servers, which makes it hard to satisfy the requirements of high-performance data center networks. Based on dual-port servers and the Clos network structure, this paper proposes a novel architecture, RClos, to construct high-performance data center networks. Logically, the proposed architecture is constructed by inserting a dual-port server into each pair of adjacent switches in the fabric of switches, where switches are connected in the form of a ring Clos structure. We describe the structural properties of RClos in terms of network scale, bisection bandwidth, and network diameter. The RClos architecture inherits the characteristics of its embedded Clos network, which can accommodate a large number of servers with a small average path length. The proposed architecture offers high fault tolerance, which suits the construction of various data center networks. For example, the average path length between servers is 3.44, and the standardized bisection bandwidth is 0.8 in RClos(32,5). The results of numerical experiments show that RClos has a small average path length and a high network fault tolerance, which is essential in the construction of high-performance data center networks.
Funding: This work is funded by the National Natural Science Foundation of China under Grant No. 61772180 and the Key R&D Plan of Hubei Province (2020BHB004, 2020BAB012).
Abstract: According to Cisco’s Internet Report 2020 white paper, there will be 29.3 billion connected devices worldwide by 2023, up from 18.4 billion in 2018. 5G connections will generate nearly three times more traffic than 4G connections. While bringing a boom to the network, this also presents unprecedented challenges in terms of flow forwarding decisions. The path assignment mechanism used in traditional traffic scheduling methods tends to cause local network congestion through the concentration of elephant flows, resulting in unbalanced network load and degraded quality of service. Using the centralized control of software-defined networks, this study proposes a data center traffic scheduling strategy for minimizing congestion and guaranteeing quality of service (MCQG). The ideal transmission path is selected for data flows while considering the network congestion rate and quality of service. Different traffic scheduling strategies are used according to the characteristics of different service types in data centers. Elephant flows, which tend to cause local congestion, are rerouted. The path evaluation function is formed from the maximum link utilization on the path, the number of elephant flows, and the time delay, and the fast optimum-seeking capability of the sparrow search algorithm is used to find the path with the lowest actual link overhead as the rerouting path for the elephant flows, which reduces the possibility of local network congestion. Equal-cost multi-path (ECMP) protocols with faster response time are used to schedule mouse flows of shorter duration, which guarantees the quality of service of the network and achieves isolated transmission of the various types of data streams. The experimental results show that the proposed strategy has higher throughput, better network load balancing, and better robustness compared to ECMP under different traffic models. In addition, because it can fully utilize the resources in the network, MCQG also outperforms another traffic scheduling strategy that reroutes elephant flows (namely Hedera). Compared with ECMP and Hedera, MCQG improves average throughput by 11.73% and 4.29%, and normalized total throughput by 6.74% and 2.64%, respectively; MCQG improves link utilization by 23.25% and 15.07%; in addition, the average round-trip delay and packet loss rate fluctuate significantly less than with the two compared strategies.
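The abstract names three inputs to the path evaluation function (maximum link utilization on the path, number of elephant flows, and delay) but does not give its form. The sketch below assumes a simple weighted sum with made-up weights, and it replaces the sparrow search algorithm with a plain minimum over a small candidate list. All names, weights, and the input format are hypothetical.

```python
def path_cost(path, w_util=0.5, w_elephants=0.3, w_delay=0.2):
    """Score a candidate path; lower is better.

    path: dict with keys 'max_util' (0..1), 'elephants' (count) and
          'delay_ms'. The weighted-sum form and the weights are assumptions;
          a real system would first normalize the three terms to one scale.
    """
    return (w_util * path["max_util"]
            + w_elephants * path["elephants"]
            + w_delay * path["delay_ms"])


def pick_reroute_path(candidates):
    """Exhaustive stand-in for the search step: return the cheapest path."""
    if not candidates:
        raise ValueError("no candidate paths")
    return min(candidates, key=path_cost)


if __name__ == "__main__":
    candidates = [
        {"name": "p1", "max_util": 0.9, "elephants": 3, "delay_ms": 2.0},
        {"name": "p2", "max_util": 0.4, "elephants": 1, "delay_ms": 3.0},
    ]
    print(pick_reroute_path(candidates)["name"])
```

Exhaustive comparison is only practical when the candidate set is small; a metaheuristic such as the one the abstract mentions becomes relevant when the number of candidate paths is too large to enumerate.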
Funding: Supported in part by the National Natural Science Foundation of China (61601252, 61801254), the Public Technology Projects of Zhejiang Province (LG-G18F020007), the Zhejiang Provincial Natural Science Foundation of China (LY20F020008, LY18F020011, LY20F010004), and the K.C. Wong Magna Fund in Ningbo University.
Abstract: With the emergence of diverse applications in data centers, the demands on quality of service in data centers have also become diverse, such as high throughput for elephant flows and low latency for deadline-sensitive flows. However, traditional TCPs are ill-suited to such situations and often result in inefficient data transfers (e.g., missed flow deadlines, inevitable throughput collapse). This further degrades the user-perceived quality of service (QoS) in data centers. To reduce the flow completion time of mice and deadline-sensitive flows while promoting the throughput of elephant flows, this paper proposes an efficient and deadline-aware priority-driven congestion control (PCC) protocol, which grants mice and deadline-sensitive flows the highest priority. Specifically, PCC computes the priority of different flows according to the size of the transmitted data, the remaining data volume, and the flows’ deadlines. PCC then adjusts the congestion window according to the flow priority and the degree of network congestion. Furthermore, switches in data centers control the input and output of packets based on the flow priority and the queue length. Unlike existing TCPs, PCC provides an effective method to compute and encode the flow priority explicitly in order to speed up the data transfers of mice and deadline-sensitive flows. According to the flow priority, switches can manage packets efficiently and ensure the data transfers of high-priority flows through weighted priority scheduling with minor modification. The experimental results show that PCC can improve the data transfer performance of mice and deadline-sensitive flows while guaranteeing the throughput of elephant flows.
Abstract: New and emerging use cases, such as the interconnection of geographically distributed data centers (DCs), are drawing attention to the requirement for dynamic end-to-end service provisioning spanning multiple heterogeneous optical network domains. This heterogeneity is due not only to the diverse data transmission and switching technologies, but also to the different options for control plane techniques. In light of this, the problem of heterogeneous control plane interworking needs to be solved, and in particular, the solution must address the specific issues of multi-domain networks, such as limited domain topology visibility, given the scalability and confidentiality constraints. In this article, some of the recent activities regarding Software-Defined Networking (SDN) orchestration are reviewed to address this multi-domain control plane interworking problem. Specifically, three different models, namely the single SDN controller model, multiple SDN controllers in mesh, and multiple SDN controllers in a hierarchical setting, are presented for the DC interconnection network with multiple SDN/OpenFlow domains or multiple OpenFlow/Generalized Multi-Protocol Label Switching (GMPLS) heterogeneous domains. In addition, two concrete implementations of the orchestration architectures are detailed, showing the overall feasibility and procedures of SDN orchestration for end-to-end service provisioning in multi-domain data center optical networks.
Funding: Supported by the ZTE-BJTU Collaborative Research Program under Grant No. K11L00190 and the Fundamental Research Funds for the Central Universities under Grant No. K12JB00060.
Abstract: 1 Introduction The history of data centers can be traced back to the 1960s. Early data centers were deployed on mainframes that were time-shared by users via remote terminals. The boom in data centers came during the internet era. Many companies started building large internet-connected facilities,
Funding: Under the auspices of the National High Technology Research and Development Program of China (No. 2007AA12Z242).
Abstract: Incremental updating, which can better keep the navigational map up to date in real time, is the direction in which navigational road network updating is developing. The data center of a vehicle navigation system is in charge of storing incremental data, and the spatio-temporal data model used to store incremental data affects how efficiently the data center responds to incremental data requests from the vehicle terminal. Based on an analysis of the shortcomings of several typical spatio-temporal data models used in the data center, and building on the base map with overlay model, the reverse map with overlay model (RMOM) was put forward so that the data center can respond rapidly to incremental data requests. RMOM allows the data center to store not only the current complete road network data, but also the overlays of incremental data from the time when each road network changed to the current moment. Moreover, the storage mechanism and index structure of the incremental data were designed, and the implementation algorithm of RMOM was developed. Taking the navigational road network in Guangzhou City as an example, a simulation test was conducted to validate the efficiency of RMOM. Results show that, with RMOM, the navigation database in the data center can respond to incremental data requests with only one query and takes less time. Compared with the base map with overlay model, the data center does not need to temporarily overlay incremental data with RMOM, so the response time is significantly reduced. RMOM greatly improves the efficiency of response and provides strong support for keeping the navigational road network up to date in real time.
Funding: Supported by the National Postdoctoral Science Foundation of China (2014M550068).
Abstract: High application latency causes revenue loss for cloud infrastructure providers in the cloud data center. The existing controllers of the software-defined networking architecture can fetch and process traffic information in the network. Therefore, the controllers can only optimize the network latency of applications. However, the serving latency of applications is also an important factor in the user experience delivered for arriving requests. Unintelligent request routing will cause high serving latency if arriving requests are allocated to overloaded virtual machines. To deal with the request routing problem, this paper proposes a workload-aware software-defined networking controller architecture. Then, request routing algorithms are proposed to minimize the total round-trip time for every type of request by considering the congestion in the network and the workload in virtual machines (VMs). Finally, this paper evaluates the proposed algorithms in a simulated prototype. The simulation results show that the proposed methodology is efficient compared with the existing approaches.
Funding: Supported by the National Natural Science Foundation of China (The Key Trusted Running Technologies for the Sensing Nodes in Internet of Things: 61501007) and the Outstanding Personnel Training Program of the Beijing Municipal Party Committee Organization Department (The Research of Trusted Computing Environment for Internet of Things in Smart City: 2014000020124G041).
Abstract: In view of the high operating costs and large amount of wasted energy in current data center network architectures, we propose a trusted flow preemption scheduling scheme combined with an energy-saving routing mechanism, based on a typical data center network architecture. The mechanism lets a network flow use its own exclusive link bandwidth and transmission path, which improves link utilization and network energy efficiency. Meanwhile, we apply trusted computing to guarantee a highly secure, high-performance, and highly fault-tolerant routing and forwarding service, which helps improve the average completion time of network flows.
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grant No. ZYGX2015J009 and the Sichuan Province Scientific and Technological Support Project under Grants No. 2014GZ0017 and No. 2016GZ0093.
Abstract: In data centers, transmission control protocol (TCP) incast causes catastrophic goodput degradation for applications with a many-to-one traffic pattern. In this paper, we aim to tame incast at the receiver-side application. Towards this goal, we first develop an analytical model that formulates the incast probability as a function of connection variables and network environment settings. We combine the model with optimization theory and derive some insights into minimizing the incast probability by tuning application-related connection variables. Then, guided by the analytical results, we propose an adaptive application-layer solution to TCP incast. The solution allocates advertised windows equally among concurrent connections and dynamically adapts the number of concurrent connections to the varying conditions. Simulation results show that our solution consistently avoids incast and achieves high goodput in various scenarios, including those with multiple bottleneck links and background TCP traffic.
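The two mechanisms described above, equal advertised windows and an adaptive number of concurrent connections, can be illustrated with a small sketch. The budgeting rule below (cap concurrency so that each connection gets at least one maximum segment size, then split the budget evenly) is an assumption chosen for illustration. It is not the policy derived from the paper's analytical model, and `plan_connections` is a hypothetical name.

```python
def plan_connections(window_budget_bytes, mss_bytes, pending_senders):
    """Choose how many senders to serve at once and the window for each.

    window_budget_bytes: total bytes the receiver is willing to have in
                         flight (hypothetical knob, e.g. tied to the
                         bottleneck switch buffer).
    mss_bytes:           maximum segment size; no window goes below one MSS.
    pending_senders:     number of senders waiting to transmit.
    Returns (concurrent, window_bytes), with the same window for every
    concurrent connection.
    """
    if pending_senders <= 0:
        return 0, 0
    max_concurrent = max(1, window_budget_bytes // mss_bytes)
    concurrent = min(pending_senders, max_concurrent)
    window_bytes = max(mss_bytes, window_budget_bytes // concurrent)
    return concurrent, window_bytes


if __name__ == "__main__":
    print(plan_connections(64000, 1460, 100))
    print(plan_connections(64000, 1460, 4))
```

Re-running the planner whenever the budget or the number of pending senders changes gives a crude version of the adaptive behavior; the remaining senders wait for a later batch.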
Funding: Supported by the National Basic Research Program of China (973 Program) under Grants No. 2014CB347800 and No. 2012CB315803, the National High-Tech R&D Program of China (863 Program) under Grant No. 2013AA013303, the Natural Science Foundation of China under Grants No. 61170291, No. 61133006, and No. 61161140454, and the ZTE Industry-Academia-Research Cooperation Funds.
Abstract: Many "rich-connected" topologies with multiple parallel paths between servers have been proposed for data center networks recently to provide high bisection bandwidth, but it remains challenging to fully utilize the high network capacity with appropriate multi-path routing algorithms. As flow-level path splitting may lead to traffic imbalance between paths due to flow-size differences, packet-level path splitting, which spreads packets from flows over multiple available paths and significantly improves link utilization, has attracted more attention lately. However, it may cause packet reordering, confusing the TCP congestion control algorithm and lowering the throughput of flows. In this paper, we design a novel packet-level multi-path routing scheme called SOPA, which leverages OpenFlow to perform packet-level path splitting in a round-robin fashion, and hence significantly mitigates the packet reordering problem and improves the network throughput. Moreover, SOPA leverages the topological features of data center networks to encode a very small number of switches along the path into the packet header, resulting in very light overhead. Compared with random packet spraying (RPS), Hedera and equal-cost multi-path routing (ECMP), our simulations demonstrate that SOPA achieves 29.87%, 50.41% and 77.74% higher network throughput respectively under permutation workload, and reduces average data transfer completion time by 53.65%, 343.31% and 348.25% respectively under production workload.
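As a rough illustration of packet-level path splitting in a round-robin fashion, the sketch below assigns consecutive packets of one flow to the available paths in strict rotation. It only models the assignment order. It does not model OpenFlow rules, the path encoding in packet headers, or anything else specific to SOPA, and the function name and path labels are hypothetical.

```python
from itertools import cycle


def spray_round_robin(packets, paths):
    """Pair each packet with a path, rotating through the paths in order.

    packets: iterable of packet identifiers (e.g. sequence numbers).
    paths:   non-empty list of path labels (hypothetical names).
    Strict rotation keeps the per-path packet counts within one of each
    other, unlike random spraying, where the counts can drift apart.
    """
    if not paths:
        raise ValueError("at least one path is required")
    chooser = cycle(paths)
    return [(packet, next(chooser)) for packet in packets]


if __name__ == "__main__":
    for packet, path in spray_round_robin(range(5), ["path0", "path1"]):
        print(packet, path)
```

Even loading across equal-length paths tends to keep queueing delays similar, which is the intuition behind rotation causing less reordering than random selection.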
Funding: This research was partially supported by the National Grand Fundamental Research 973 Program of China under Grant No. 2013CB329103, the Natural Science Foundation of China under Grant No. 61271171, the Fundamental Research Funds for the Central Universities (ZYGX2013J002, ZYGX2012J004, ZYGX2010J002, ZYGX2010J009), and the Guangdong Science and Technology Project (2012B090500003, 2012B091000163, 2012556031).
Abstract: Virtualization is a common technology for resource sharing in data centers. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as virtual data centers, VDCs) to the physical data center effectively. In this paper, we focus on this problem. Unlike previous works, our study of the VDC embedding problem assumes that switch resources are the bottleneck of data center networks (DCNs). To this end, we propose relative cost to evaluate embedding strategies and, based on the properties of fat-tree, decouple the embedding problem into VM placement with marginal resource assignment and virtual link mapping with a decided source and destination. We also design the traffic aware embedding algorithm (TAE) and first fit virtual link mapping (FFLM) to map virtual data center requests to a physical data center. Simulation results show that TAE+FFLM can increase the acceptance rate and reduce network cost (by about 49% in the case studied) at the same time. The traffic aware embedding algorithm reduces the load of core-link traffic and creates an optimization opportunity for data center network energy conservation.
Abstract: The primary focus of this paper is to design a progressive restoration plan for an enterprise data center environment following a partial or full disruption. Repairing and restoring disrupted components in an enterprise data center requires a significant amount of time and human effort. Following a major disruption, the recovery process involves multiple stages, and during each stage, the partially recovered infrastructures can provide limited services to users at some degraded service level. However, how fast and efficiently an enterprise infrastructure can be recovered depends on how the recovery mechanism restores the disrupted components, considering the inter-dependencies between services, along with the limitations of expert human operators. The entire problem turns out to be NP-hard and rather complex, and we devise an efficient meta-heuristic to solve the problem. By considering some real-world examples, we show that the proposed meta-heuristic provides very accurate results, and still runs 600-2800 times faster than the optimal solution obtained from a general purpose mathematical solver [1].