Most users are accustomed to using virtual addresses in parallel programs running on scalable high-performance parallel computing systems. A virtual-to-physical address translation mechanism is therefore necessary and crucial to bridge the hardware interface and the software application. In this paper, a new virtual-to-physical address translation mechanism is proposed, which includes an address validity checker, an address translation cache (ATC), a complete refresh scheme, and many reliability designs. The ATC employs a large-capacity embedded dynamic random access memory (eDRAM) to meet the high hit ratio requirement. It can also switch between cache mode and buffer mode to avoid the high latency of accessing the external main memory. Many tests have been conducted on the real chip that implements the address translation mechanism. The results show that the ATC has a high hit ratio while running well-known benchmarks and demonstrate that the new high-performance mechanism is well designed.
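To make the roles of the address validity checker and the ATC concrete, the following is a minimal software sketch, not the chip's design. The class name `TranslationCache`, the 4 KiB page size, the LRU replacement policy, and the dictionary standing in for the page table in main memory are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size; the abstract does not state one


class TranslationCache:
    """Toy virtual-to-physical translator with a validity check and an LRU cache."""

    def __init__(self, page_table, valid_ranges, capacity):
        self.page_table = page_table      # vpn -> ppn, stands in for the table in main memory
        self.valid_ranges = valid_ranges  # list of (start, end) virtual ranges, end exclusive
        self.capacity = capacity          # number of cached translations (assumed LRU policy)
        self.entries = OrderedDict()      # cached vpn -> ppn, oldest first
        self.hits = 0
        self.misses = 0

    def is_valid(self, vaddr):
        """Address validity check: the address must lie in a registered range."""
        return any(start <= vaddr < end for start, end in self.valid_ranges)

    def translate(self, vaddr):
        if not self.is_valid(vaddr):
            raise ValueError(f"invalid virtual address: {vaddr:#x}")
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)            # mark as most recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]  # slow path: fetch from the table
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)      # evict the least recently used entry
        return self.entries[vpn] * PAGE_SIZE + offset

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In this sketch a larger `capacity` raises the hit ratio for a given access pattern, which is the motivation the abstract gives for backing the ATC with large-capacity eDRAM.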
The reliability of a network is an important indicator for maintaining communication and ensuring stable operation. The assessment of reliability in the underlying interconnection networks has therefore become an increasingly important research issue. At present, however, the reliability assessment of many interconnection networks is not yet accurate, which inevitably weakens their fault tolerance and diagnostic capabilities. To improve network reliability, researchers have proposed various methods and strategies for precise assessment. This paper introduces a novel family of interconnection networks called general matching composed networks (gMCNs), which is based on the common characteristics of network topology structures. After analyzing the topological properties of gMCNs, we establish a relationship between the super connectivity and the conditional diagnosability of gMCNs. Furthermore, we assess the reliability of gMCNs and determine the conditional diagnosability of many interconnection networks.
In this paper, we present the Tianhe-2 interconnect network and message passing services. We describe the architecture of the router and network interface chips, and highlight a set of hardware and software features that effectively support high-performance communication, including remote direct memory access, collective optimization, hardware-enabled reliable end-to-end communication, and user-level message passing services. Measured hardware performance results are also presented.
To characterize the performance of today's large-scale communication networks with domain partition and interconnection, a reliability index weighted by normalized capacity is defined. Based on the routing rules of a network with domain partition and interconnection, the interconnection indexes among the nodes within a domain and among the domains are given from several aspects. It is explained that the index can thoroughly represent the effect on the reliability index of both the objective factors and the designer's subjective measures, which obey the routing rules of a network with domain partition and interconnection. The defined index is argued to be rational and compatible with the traditional index.
The Novel Interconnection Network (NIN), based on an inverted-graph topology and crossbar switches, is an interconnection network with lower latency and higher throughput. However, it has a serious disadvantage: high hardware complexity. To reduce the system hardware cost, an improved NIN (ININ) structure is proposed. Like NIN, ININ has a constant network diameter. Besides keeping the advantages of NIN, ININ has a lower hardware cost than NIN. Furthermore, we design a new deadlock-free routing algorithm for the improved NIN. Key words: NIN; ININ; inverted-graph interconnection network; hardware complexity; network bandwidth; network throughput. CLC number: TP 302. Foundation item: Supported by the National Natural Science Foundation of China (69873016). Biography: Li Fei (1974-), male, Ph.D. candidate; research direction: architecture of interconnection networks.
An important theoretical interest is to study the relations between different interconnection networks and to compare the capability and performance of the network structures. The most popular way to do this is network emulation. Based on classical voltage graph theory, the authors develop a new representation scheme for interconnection network structures. The new approach combines algebraic and combinatorial methods. The results demonstrate that voltage graph theory is a powerful tool for representing well-known interconnection networks and for implementing optimal network emulation algorithms, and in particular show that all popular interconnection networks have very simple and intuitive representations under the new scheme. The new representation scheme also offers powerful tools for the study of network routing and emulation. For example, we present very simple constructions for optimal network emulations from the cube-connected cycles networks to the butterfly networks, and from the butterfly networks to the hypercube networks. Compared with the most popular way of network emulation, the new scheme is intuitive, easy to realize, and easy to apply to other network structures.
Some useful layered cross product decompositions are derived both for general bit permutation networks and for (2n-1)-stage multistage interconnection networks. Several issues in related works are clarified, and the rearrangeability of some interesting networks is considered. In particular, the rearrangeability of one class of networks is formulated as a new type of combinatorial design problem.
All-to-all personalized communication, or complete exchange, is at the heart of numerous applications in parallel computing. It is one of the densest communication patterns. In this paper, we consider this problem in 2D/3D meshes and in a multidimensional interconnection network with wormhole-routing capability, and we propose complete exchange algorithms for each of them. We propose an $O(mn^2)$-phase algorithm for the 2D mesh $P_m \times P_n$ and an $O(mn^2l^2)$-phase algorithm for the 3D mesh $P_m \times P_n \times P_l$, where $m$, $n$, $l$ are any positive integers. An $O(ph(G_1)\,n^2)$-phase algorithm is also proposed for a multidimensional interconnection network $G_1 \times G_2$, where $ph(G_1)$ stands for the number of complete exchange phases of $G_1$ and $|G_2| = n$.
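The communication pattern itself can be illustrated with a short sketch. This is not the paper's mesh algorithm: it uses a simple shift schedule (in phase k, node i sends to node (i + k) mod p) on an idealized, fully connected set of p nodes, and it ignores wormhole routing and link contention, which are what the phase counts in the abstract account for. The function name `complete_exchange` and the `outbox`/`inbox` layout are assumptions for illustration.

```python
def complete_exchange(outbox):
    """Simulate all-to-all personalized communication among p nodes.

    outbox[i][j] is the distinct message node i holds for node j.
    Returns inbox, where inbox[j][i] is the message node j received from node i.
    """
    p = len(outbox)
    inbox = [[None] * p for _ in range(p)]
    for i in range(p):
        inbox[i][i] = outbox[i][i]          # a node keeps its own block locally
    for k in range(1, p):                   # p - 1 communication phases
        for src in range(p):
            dst = (src + k) % p             # in each phase every node sends once
            inbox[dst][src] = outbox[src][dst]  # and receives exactly once
    return inbox


if __name__ == "__main__":
    p = 4
    outbox = [[f"m{i}->{j}" for j in range(p)] for i in range(p)]
    for row in complete_exchange(outbox):
        print(row)
```

The result is the transpose of the message matrix, which is why complete exchange is among the densest patterns: every one of the p(p - 1) ordered node pairs carries its own message.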
A sorting algorithm based on Batcher's algorithm is presented, and an 8×8 multistage interconnection network (MIN) is constructed. By applying wavelength division multiplexing (WDM) technology and an integrated control mode, the designed network can realize non-blocking communication. The time delay of the MIN and the number of switches needed are also analyzed theoretically, and the derived result confirms that the designed MIN is feasible. For the same guaranteed communication quality, the MIN uses the fewest switches and completes the communication more efficiently.
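For reference, the textbook form of Batcher's odd-even merge sort can be sketched as a list of compare-exchange elements. The abstract only says the paper's algorithm is "based on" Batcher's algorithm, so this is the classical construction, not the paper's variant, and it says nothing about the WDM or control aspects. The helper names `batcher_network`, `apply_network`, and `sorts_all_binary_inputs` are chosen here for illustration.

```python
from itertools import product


def odd_even_merge(lo, hi, r):
    """Yield comparators merging two sorted halves of positions lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from odd_even_merge(lo, hi, step)      # even-indexed subsequence at this stride
        yield from odd_even_merge(lo + r, hi, step)  # odd-indexed subsequence at this stride
        for i in range(lo + r, hi - r, step):
            yield (i, i + r)                         # final clean-up comparators
    else:
        yield (lo, lo + r)


def batcher_network(n):
    """Return the comparator list for n inputs; n must be a power of two."""
    def sort_range(lo, hi):
        if hi - lo >= 1:
            mid = lo + (hi - lo) // 2
            yield from sort_range(lo, mid)           # sort lower half
            yield from sort_range(mid + 1, hi)       # sort upper half
            yield from odd_even_merge(lo, hi, 1)     # merge the two halves
    return list(sort_range(0, n - 1))


def apply_network(network, values):
    """Run the values through the comparators; each one orders a pair of wires."""
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def sorts_all_binary_inputs(network, n):
    """Zero-one principle: a network sorting every 0/1 input sorts every input."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))
```

For `n = 8`, `batcher_network(8)` yields 19 comparators, each of which would correspond to one 2×2 switching element in a hardware realization of the classical network.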
Scalability is an important issue in the design of interconnection networks for massively parallel systems. In this paper, a scalable class of Hex-Cell interconnection networks for massively parallel systems, called Multilayer Hex-Cell (MLH), is introduced. A node addressing scheme and a routing algorithm are also presented and discussed. An interesting feature of the proposed MLH is that it maintains a constant network degree regardless of the increase in network size, which facilitates modularity in the building blocks of scalable systems. The new node addressing scheme makes the proposed routing algorithm simple and efficient, in that it needs a minimal number of calculations to reach the destination node. Moreover, the diameter of the proposed MLH is smaller than that of the Hex-Cell network.
To meet the pressing demand for wide-area communication required by the Global Energy Interconnection (GEI), accelerating the construction of satellite-terrestrial integrated networks that can achieve network extension and seamless global coverage has become the focus of power communication technology development. In this study, we propose a satellite-terrestrial integrated network model that supports interconnection and interoperation on the IP layer between the satellite system and the terrestrial segment of the existing power communication system. First, the composition and function of the satellite-terrestrial collaborative network are explained. Then, the IP-based protocol stack is described, and a typical application experiment is conducted to illustrate the particular process of this protocol stack. Finally, a use case of IP interconnection that depends on GEO satellite communication is detailed. The experimental study has shown that the satellite-terrestrial collaborative network can efficiently support various IP applications for the GEI.
To extend the application scope of NDN and realize transmission between different NDNs across IP networks, a method for interconnecting NDN networks distributed in different areas through IP networks is proposed. First, the NDN data resource is located by means of the DNS mechanism, and the gateway IP address of the NDN network where the data resource resides is found. Then, the transmission between different NDNs across the IP network is implemented using tunnel technology. In addition, to achieve efficient and fast NDN data forwarding, we add a small number of NDN service nodes to the IP network and propose an adaptive probabilistic forwarding strategy and a link cost function-based forwarding strategy, so that NDN data obtains the cache service provided by the NDN service nodes as much as possible. The results of analysis and simulation experiments show that the proposed method for interconnecting NDN across IP networks is generally effective and feasible, and that the link cost function forwarding strategy performs better than the adaptive probabilistic forwarding strategy.
Recent multi-core architectures may have a relatively large number of cores, typically ranging from tens to hundreds, and are therefore called many-core systems. Such systems require an efficient interconnection network that addresses two major problems: first, the overhead of power and area cost and its effect on scalability; second, the high access latency caused by multiple cores simultaneously accessing the same shared module. This paper presents an interconnection scheme called N-conjugate Shuffle Clusters (NCSC), based on a multi-core multi-cluster architecture, to reduce the overhead of these problems. NCSC eliminates the need for router devices and their complexity, and hence reduces the power and area costs. It also redesigns and distributes the shared caches across the interconnection network to increase the capacity for simultaneous access and hence reduce the access latency. For intra-cluster communication, Multi-port Content Addressable Memory (MPCAM) is used. The experimental results, using four clusters with four cores each, indicate that the average access latency for a write is 1.14785±0.04532 ns, which is nearly equal to the latency of a write operation in MPCAM. Moreover, the average read latency is 1.26226±0.090591 ns within a cluster and around 1.92738±0.139588 ns for read access between cores in different clusters.
To reduce the complexity of neural network connectivity, a dynamical model for a not fully interconnected neural network, including its energy function, local area field, and learning rule, is presented. The basic idea is to decompose a Hopfield network into several subnetworks and set up some interconnections between them. The statistical analysis of the associative memory process shows that the number of interconnections after the first decomposition is reduced by 25% compared with that of the Hopfield network, while the storage capacity and the associative ability of the network remain unchanged. As the decomposition continues, the number of interconnections is considerably reduced. Despite the reduction in storage capacity and associative ability with continued decomposition, the average information capacity per interconnection increases by nearly 100%. Finally, the relationship between high-order interconnection and multilayer network architecture is discussed.
Funding (address translation mechanism paper): Supported by the National Natural Science Foundation of China (61103083, 61133007) and the National High Technology Research and Development Program of China (863 Program) (2012AA01A301, 2015AA01A301).
Funding (gMCN reliability paper): Supported by the National Natural Science Foundation of China (No. 62362005).
Funding (Tianhe-2 interconnect paper): This work was partially supported by the National High Technology Research and Development 863 Program of China under Grant No. 2012AA01A301 and the National Natural Science Foundation of China under Grant No. 61120106005. Acknowledgements: The Tianhe-2 project is a great team effort and benefits from the cooperation of many individuals at NUDT. We would like to thank the entire Tianhe-2 development, applications, and benchmarking teams, and all the people who have contributed to the system in a variety of ways.
Funding (voltage graph representation paper): The National Science Fund for Overseas Distinguished Young Scholars (No. 69928201), the Foundation for University Key Teacher by the Ministry of Education, and the Changjiang Scholar Reward Project.
Funding (Batcher-based 8×8 MIN paper): Information Industry Bureau of Chongqing (200113010 and 200216006).
Funding (satellite-terrestrial integrated network paper): Supported by the State Grid Science and Technology Project (No. 5455HT160004).
Funding (NDN over IP interconnection paper): Supported by the Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University.