An important theoretic interest is to study the relations between different interconnection networks, and to compare the capability and performance of the network structures. The most popular way to do the investigati...An important theoretic interest is to study the relations between different interconnection networks, and to compare the capability and performance of the network structures. The most popular way to do the investigation is network emulation. Based on the classical voltage graph theory, the authors develop a new representation scheme for interconnection network structures. The new approach is a combination of algebraic methods and combinatorial methods. The results demonstrate that the voltage graph theory is a powerful tool for representing well known interconnection networks and in implementing optimal network emulation algorithms, and in particular, show that all popular interconnection networks have very simple and intuitive representations under the new scheme. The new representation scheme also offers powerful tools for the study of network routings and emulations. For example, we present very simple constructions for optimal network emulations from the cube connected cycles networks to the butterfly networks, and from the butterfly networks to the hypercube networks. Compared with the most popular way of network emulation, this new scheme is intuitive and easy to realize, and easy to apply to other network structures.展开更多
All-to-all personalized communication,or complete exchange,is at the heart of numerous applications in paral-lel computing.It is one of the most dense communication patterns.In this paper,we consider this problem in a...All-to-all personalized communication,or complete exchange,is at the heart of numerous applications in paral-lel computing.It is one of the most dense communication patterns.In this paper,we consider this problem in a2D/3D mesh and a multidimensional interconnection network with the wormhole-routing capability.We propose complete ex-change algorithms for them respectively.We propose O(mn 2 )phase algorithm for2D mesh P m ×P n and O(mn 2 l 2 )phase algo-rithm for3D mesh P m ×P n ×P l ,where m,n,l are any positive integers.Also O(ph(G 1 )n 2 )phase algorithm is proposed for a multidimensional interconnection network G 1 ×G 2 ,where ph(G 1 )stands for complete exchange phases of G 1 and|G 2 |=n.展开更多
Based on Petersen graph, a new interconnection network, the RP(k) network, is devel-oped and the properties of the RP(k) network are investigated. The diameter of the RP(k) network is [ k/2] + 2 and its degree is 5. W...Based on Petersen graph, a new interconnection network, the RP(k) network, is devel-oped and the properties of the RP(k) network are investigated. The diameter of the RP(k) network is [ k/2] + 2 and its degree is 5. We prove that the diameter of the RP(k) network is much smaller than that of the 2-D Torus network when the number of nodes in interconnection networks is less than or equal to 300. In order to analyze the communication performance in a group of nodes, we propose the concepts of the optimal node groups and the diameter of the optimal node groups. We also show that the diameter of the optimal node groups in the RP(k) network is less than that in the 2-D Torus net-work. Especially when the number of nodes in an optimal node group is between 6 and 100, the diam-eter of the optimal node groups in the RP(k) network is half of that in the 2-D Torus network. Further-more based on the RP(k) network we design a set of routing algorithms which are point-to-point rout-ing, permutation routing, one-to-all routing and all-to-all routing. Their communication efficiencies are [ k/2] +2, k + 5, [k/2] + 2, and k + 5 respectively. The RP(k) network and the routing algorithms can provide efficient communication means for parallel and distributed computer system.展开更多
To efficiently complete a complex computation task,the complex task should be decomposed into subcomputation tasks that run parallel in edge computing.Wireless Sensor Network(WSN)is a typical application of parallel c...To efficiently complete a complex computation task,the complex task should be decomposed into subcomputation tasks that run parallel in edge computing.Wireless Sensor Network(WSN)is a typical application of parallel computation.To achieve highly reliable parallel computation for wireless sensor network,the network's lifetime needs to be extended.Therefore,a proper task allocation strategy is needed to reduce the energy consumption and balance the load of the network.This paper proposes a task model and a cluster-based WSN model in edge computing.In our model,different tasks require different types of resources and different sensors provide different types of resources,so our model is heterogeneous,which makes the model more practical.Then we propose a task allocation algorithm that combines the Genetic Algorithm(GA)and the Ant Colony Optimization(ACO)algorithm.The algorithm concentrates on energy conservation and load balancing so that the lifetime of the network can be extended.The experimental result shows the algorithm's effectiveness and advantages in energy conservation and load balancing.展开更多
A routing algorithm for distributed optimal double loop computer networks is proposed and analyzed. In this paper, the routing algorithm rule is described, and the procedures realizing the algorithm are given. The pr...A routing algorithm for distributed optimal double loop computer networks is proposed and analyzed. In this paper, the routing algorithm rule is described, and the procedures realizing the algorithm are given. The proposed algorithm is shown to be optimal and robust for optimal double loop. In the absence of failures,the algorithm can send a packet along the shortest path to destination; when there are failures,the packet can bypasss failed nodes and links.展开更多
A methodology is proposed to handle problem that under equiproble address of packet traffic at the input port, Generalized Shuffle-Exchange Network (GSEN) routes traffic unevenly because of the unbalanced routing tags...A methodology is proposed to handle problem that under equiproble address of packet traffic at the input port, Generalized Shuffle-Exchange Network (GSEN) routes traffic unevenly because of the unbalanced routing tags. The idea is to use routing tag according to probability, which can be evaluated by using Moore-Penrose inverse in matrix analysis. An instance is used to illustrate the idea, and the simulation is done to show the improvement in performance issues.展开更多
Data center networks may comprise tens or hundreds of thousands of nodes,and,naturally,suffer from frequent software and hardware failures as well as link congestions.Packets are routed along the shortest paths with s...Data center networks may comprise tens or hundreds of thousands of nodes,and,naturally,suffer from frequent software and hardware failures as well as link congestions.Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays.In such dynamic networks,links frequently fail or get congested,making the recalculation of the shortest paths a computationally intensive problem.Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed.Surprisingly,the design of fast shortest-path algorithms for data centers was largely neglected,though they are universal components of routing protocols.Moreover,parallelization techniques were mostly deployed for random network topologies,and not for regular topologies that are often found in data centers.The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware.We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.展开更多
A neurocomputing model for Genetic Algorithm (GA) to break the speed bottleneck of GA was proposed. With all genetic operations parallel implemented by NN-based sub-modules, the model integrates both the strongpoint o...A neurocomputing model for Genetic Algorithm (GA) to break the speed bottleneck of GA was proposed. With all genetic operations parallel implemented by NN-based sub-modules, the model integrates both the strongpoint of parallel GA (PGA) and those of hardware GA (HGA). Moreover a new crossover operator named universe crossover was also proposed to suit the NN-based realization. This model was tested with a benchmark function set, and the experimental results validated the potential of the neurocomputing model. The significance of this model means that HGA and PGA can be integrated and the inherent parallelism of GA can be explicitly and farthest realized, as a result, the optimization speed of GA will be accelerated by one or two magnitudes compered to the serial implementation with same speed hardware, and GA will be turned from an algorithm into a machine.展开更多
To design a Banyan network with an arbitrary even-sized port number, the PN2I network is proposed. The PN2I network can be divided into two classes: the complete and incomplete versions. A simple routing algorithm is ...To design a Banyan network with an arbitrary even-sized port number, the PN2I network is proposed. The PN2I network can be divided into two classes: the complete and incomplete versions. A simple routing algorithm is given, but in the incomplete PN2I networks,this routing algorithm fails to make the traffic in links even, which deteriorates the performance badly. Thus a new routing algorithm is proposed, which makes the incomplete PN2I networks behave almost the same as the PN2I networks with respect to the performance issues.展开更多
In this paper, an optimum tactic of multi-grid parallel algorithmwith virtual boundary forecast method is disscussed, and a two-stage implementationis presented. The numerical results of solving a non-linear heat tran...In this paper, an optimum tactic of multi-grid parallel algorithmwith virtual boundary forecast method is disscussed, and a two-stage implementationis presented. The numerical results of solving a non-linear heat transfer equationshow that the optimum implementation is much better than the non-optimum one.展开更多
In order to improve the network performance furthermore, a routing algorithm for 2D-Torus is investigated from the standpoint of load balance for virtual channels. The 2D-Torus network is divided into two virtual netw...In order to improve the network performance furthermore, a routing algorithm for 2D-Torus is investigated from the standpoint of load balance for virtual channels. The 2D-Torus network is divided into two virtual networks and each physical channel is split into three virtual channels. A novel virtual channel allocation policy and a routing algorithm are proposed, in which traffic load is distributed to those three virtual channels in a more load-balanced manner by introducing a random parameter. Simulations of the proposed algorithm are developed with a SystemC-based test bench. The results show that compared with the negative first for Torus networks (NF-T) algorithm, the proposed algorithm can achieve better performance in terms of network latency and throughput under different traffic patterns. It also shows that a routing algorithm with load balance for virtual channels can significantly improve the network performance furthermore.展开更多
To improve the scalability and reduce the implementation complexity of Mesh and Mesh-like networks, the semi-diagonal Torus (SD-Torus) network, a regular and symmetrical intercormection network is proposed. The SD-T...To improve the scalability and reduce the implementation complexity of Mesh and Mesh-like networks, the semi-diagonal Torus (SD-Torus) network, a regular and symmetrical intercormection network is proposed. The SD-Torus network is a combination of a typical 2D-Torus network with two extra diagonal links from northwest to southeast direction for each node. The topological properties of SD-Torus networks are discussed, and a load balanced routing algorithm for SD-Torus is presented. System-C based simulation result shows that, compared with diagonal Mesh (DMesh), diagonal Torus (DTorus) and XMesh networks, the SD-Torus network can achieve high performance with a lower network cost. It makes the SD-Torus network a powerful candidate for the high performance interconnection networks.展开更多
如何提供不同的服务质量(quality of service,简称QoS)是互联网络面临的一个重要问题,而服务质量路由(quality-of-service routing,简称QoSR)则是其中的核心技术和热点问题.QoSR的主要作用是为QoS业务请求寻找可行路径,这体现了QoSR的...如何提供不同的服务质量(quality of service,简称QoS)是互联网络面临的一个重要问题,而服务质量路由(quality-of-service routing,简称QoSR)则是其中的核心技术和热点问题.QoSR的主要作用是为QoS业务请求寻找可行路径,这体现了QoSR的两个目标:(1) 满足业务QoS需求;(2) 最大限度地提高网络利用率.由于QoSR是NP完全问题,研究者们设计了很多启发式算法进行了广泛深入的研究.在有权图和QoS度量的基础上介绍了QoSR的基本概念,详细分析了面向单播应用的QoSR算法中的热点问题,并按照所求解的问题类型和求解方法,将这些算法分成以下几类:多项式非启发类、伪多项式非启发类、探测类、限定QoS度量类、路径子空间搜索类、QoS度量相关类、花费函数类和概率求解类.在分析每类中典型算法的基础上,总结和对比了各类的特点,进而详细剖析了算法的有效性,并基于此总结了基于概率模型求解QoSR问题的方法.最后指出了该领域中需要进一步研究的热点问题.展开更多
基金TheNationalScienceFundforOverseasDistinguishedYoungScholars (No .6 992 82 0 1) ,FoundationforUniversityKeyTeacherbytheMinistryofEducationandChangjiangScholarRewardProject.
文摘An important theoretic interest is to study the relations between different interconnection networks, and to compare the capability and performance of the network structures. The most popular way to do the investigation is network emulation. Based on the classical voltage graph theory, the authors develop a new representation scheme for interconnection network structures. The new approach is a combination of algebraic methods and combinatorial methods. The results demonstrate that the voltage graph theory is a powerful tool for representing well known interconnection networks and in implementing optimal network emulation algorithms, and in particular, show that all popular interconnection networks have very simple and intuitive representations under the new scheme. The new representation scheme also offers powerful tools for the study of network routings and emulations. For example, we present very simple constructions for optimal network emulations from the cube connected cycles networks to the butterfly networks, and from the butterfly networks to the hypercube networks. Compared with the most popular way of network emulation, this new scheme is intuitive and easy to realize, and easy to apply to other network structures.
文摘All-to-all personalized communication,or complete exchange,is at the heart of numerous applications in paral-lel computing.It is one of the most dense communication patterns.In this paper,we consider this problem in a2D/3D mesh and a multidimensional interconnection network with the wormhole-routing capability.We propose complete ex-change algorithms for them respectively.We propose O(mn 2 )phase algorithm for2D mesh P m ×P n and O(mn 2 l 2 )phase algo-rithm for3D mesh P m ×P n ×P l ,where m,n,l are any positive integers.Also O(ph(G 1 )n 2 )phase algorithm is proposed for a multidimensional interconnection network G 1 ×G 2 ,where ph(G 1 )stands for complete exchange phases of G 1 and|G 2 |=n.
基金This work was supported by the National Natural Science Foundation of China (Grant No. 69933020) National High Performance Computing Fund.
文摘Based on Petersen graph, a new interconnection network, the RP(k) network, is devel-oped and the properties of the RP(k) network are investigated. The diameter of the RP(k) network is [ k/2] + 2 and its degree is 5. We prove that the diameter of the RP(k) network is much smaller than that of the 2-D Torus network when the number of nodes in interconnection networks is less than or equal to 300. In order to analyze the communication performance in a group of nodes, we propose the concepts of the optimal node groups and the diameter of the optimal node groups. We also show that the diameter of the optimal node groups in the RP(k) network is less than that in the 2-D Torus net-work. Especially when the number of nodes in an optimal node group is between 6 and 100, the diam-eter of the optimal node groups in the RP(k) network is half of that in the 2-D Torus network. Further-more based on the RP(k) network we design a set of routing algorithms which are point-to-point rout-ing, permutation routing, one-to-all routing and all-to-all routing. Their communication efficiencies are [ k/2] +2, k + 5, [k/2] + 2, and k + 5 respectively. The RP(k) network and the routing algorithms can provide efficient communication means for parallel and distributed computer system.
基金supported by Postdoctoral Science Foundation of China(No.2021M702441)National Natural Science Foundation of China(No.61871283)。
文摘To efficiently complete a complex computation task,the complex task should be decomposed into subcomputation tasks that run parallel in edge computing.Wireless Sensor Network(WSN)is a typical application of parallel computation.To achieve highly reliable parallel computation for wireless sensor network,the network's lifetime needs to be extended.Therefore,a proper task allocation strategy is needed to reduce the energy consumption and balance the load of the network.This paper proposes a task model and a cluster-based WSN model in edge computing.In our model,different tasks require different types of resources and different sensors provide different types of resources,so our model is heterogeneous,which makes the model more practical.Then we propose a task allocation algorithm that combines the Genetic Algorithm(GA)and the Ant Colony Optimization(ACO)algorithm.The algorithm concentrates on energy conservation and load balancing so that the lifetime of the network can be extended.The experimental result shows the algorithm's effectiveness and advantages in energy conservation and load balancing.
文摘A routing algorithm for distributed optimal double loop computer networks is proposed and analyzed. In this paper, the routing algorithm rule is described, and the procedures realizing the algorithm are given. The proposed algorithm is shown to be optimal and robust for optimal double loop. In the absence of failures,the algorithm can send a packet along the shortest path to destination; when there are failures,the packet can bypasss failed nodes and links.
基金Supported by the National High-Tech Programs(No.2002AA103062, No.2002AA121061 and No.2003AA103520) the Huawei Technologies Co. under contract number YBCN2002001.
文摘A methodology is proposed to handle problem that under equiproble address of packet traffic at the input port, Generalized Shuffle-Exchange Network (GSEN) routes traffic unevenly because of the unbalanced routing tags. The idea is to use routing tag according to probability, which can be evaluated by using Moore-Penrose inverse in matrix analysis. An instance is used to illustrate the idea, and the simulation is done to show the improvement in performance issues.
基金This work was supported by the Serbian Ministry of Science and Education(project TR-32022)by companies Telekom Srbija and Informatika.
文摘Data center networks may comprise tens or hundreds of thousands of nodes,and,naturally,suffer from frequent software and hardware failures as well as link congestions.Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays.In such dynamic networks,links frequently fail or get congested,making the recalculation of the shortest paths a computationally intensive problem.Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed.Surprisingly,the design of fast shortest-path algorithms for data centers was largely neglected,though they are universal components of routing protocols.Moreover,parallelization techniques were mostly deployed for random network topologies,and not for regular topologies that are often found in data centers.The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware.We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
基金NationalNaturalScienceFoundationofChina (No .60 2 3 40 2 0 )
文摘A neurocomputing model for Genetic Algorithm (GA) to break the speed bottleneck of GA was proposed. With all genetic operations parallel implemented by NN-based sub-modules, the model integrates both the strongpoint of parallel GA (PGA) and those of hardware GA (HGA). Moreover a new crossover operator named universe crossover was also proposed to suit the NN-based realization. This model was tested with a benchmark function set, and the experimental results validated the potential of the neurocomputing model. The significance of this model means that HGA and PGA can be integrated and the inherent parallelism of GA can be explicitly and farthest realized, as a result, the optimization speed of GA will be accelerated by one or two magnitudes compered to the serial implementation with same speed hardware, and GA will be turned from an algorithm into a machine.
基金Supported by the National High-Tech Programs(No.2002AAl03062, No.2002AA121061 and No.2003AA103520)the Huawei Technologies Co. (No.YBCN2002001).
文摘To design a Banyan network with an arbitrary even-sized port number, the PN2I network is proposed. The PN2I network can be divided into two classes: the complete and incomplete versions. A simple routing algorithm is given, but in the incomplete PN2I networks,this routing algorithm fails to make the traffic in links even, which deteriorates the performance badly. Thus a new routing algorithm is proposed, which makes the incomplete PN2I networks behave almost the same as the PN2I networks with respect to the performance issues.
文摘In this paper, an optimum tactic of multi-grid parallel algorithmwith virtual boundary forecast method is disscussed, and a two-stage implementationis presented. The numerical results of solving a non-linear heat transfer equationshow that the optimum implementation is much better than the non-optimum one.
基金supported by the National Natural Science Foundation of China (60976020)
文摘In order to improve the network performance furthermore, a routing algorithm for 2D-Torus is investigated from the standpoint of load balance for virtual channels. The 2D-Torus network is divided into two virtual networks and each physical channel is split into three virtual channels. A novel virtual channel allocation policy and a routing algorithm are proposed, in which traffic load is distributed to those three virtual channels in a more load-balanced manner by introducing a random parameter. Simulations of the proposed algorithm are developed with a SystemC-based test bench. The results show that compared with the negative first for Torus networks (NF-T) algorithm, the proposed algorithm can achieve better performance in terms of network latency and throughput under different traffic patterns. It also shows that a routing algorithm with load balance for virtual channels can significantly improve the network performance furthermore.
基金supported by the National Natural Science Foundation of China (60976020)
文摘To improve the scalability and reduce the implementation complexity of Mesh and Mesh-like networks, the semi-diagonal Torus (SD-Torus) network, a regular and symmetrical intercormection network is proposed. The SD-Torus network is a combination of a typical 2D-Torus network with two extra diagonal links from northwest to southeast direction for each node. The topological properties of SD-Torus networks are discussed, and a load balanced routing algorithm for SD-Torus is presented. System-C based simulation result shows that, compared with diagonal Mesh (DMesh), diagonal Torus (DTorus) and XMesh networks, the SD-Torus network can achieve high performance with a lower network cost. It makes the SD-Torus network a powerful candidate for the high performance interconnection networks.