The Interconnection Network and Message Mechanism of the Sunway Exascale Prototype System (神威E级原型机互连网络和消息机制)

Cited by: 8
Abstract The high-performance interconnection network is one of the main components of a high-performance computing (HPC) system. It connects the computing nodes, storage nodes, and I/O devices, and it carries all communication among them. Many parallel applications on HPC systems exchange data between nodes (between computing nodes, between computing nodes and I/O nodes, and between computing nodes and storage nodes), which places high demands on the communication latency and bandwidth of the interconnection network. Many HPC systems have therefore adopted customized interconnection networks. A customized network can be optimized for communication latency and bandwidth to meet the varied communication requirements of the system, improving communication performance and thereby the actual running performance of parallel applications.

Interconnection network design is an important means of improving communication performance, and the message mechanism also has a large influence on it: even with the same topology and routers, different message mechanisms can produce large differences in communication performance. The customization of a customized network is largely reflected in its ability to tailor the message mechanism. Each customized network has its own message mechanism and defines its own message protocol to meet its particular communication needs. The interconnection network and message mechanism are studied here with the aim of independently controllable technology. On the road to exascale, communication performance must keep pace with rapidly growing computing capability. The world's top supercomputers mainly use Mellanox InfiniBand, Cray Aries, or Intel Omni-Path, and employ 25 Gbps transmission technology in their interconnection networks. The networks of the top domestic supercomputers, such as Sunway TaihuLight and Tianhe-2, are built on 14 Gbps transmission.

This paper describes the interconnection network and message mechanism of the Sunway exascale prototype system. The prototype is the third-generation machine of the Sunway supercomputer family, after Sunway BlueLight and Sunway TaihuLight. As a pre-research project for the exascale system, it has a peak performance of 3.13 PFlops. One of its most distinctive features is an interconnection network based on 28 Gbps transmission technology and two newly designed Sunway chips: the Sunway high-radix router chip and the Sunway high-performance network interface chip. Building on the traditional fat tree, a dual-rail generalized fat-tree topology is designed. A dual-rail out-of-order message mechanism is implemented that switches dynamically between the two rails at packet-level granularity, and new Sunway message primitives (verbs) and a message library are defined and implemented. The communication performance of the prototype improves 4-fold over the Sunway TaihuLight interconnection network, laying a solid technical foundation for the interconnection network of the Sunway exascale system.

The Sunway exascale prototype system achieves breakthroughs in the key technologies of 28 Gbps transmission, the high-radix router, the high-performance network interface, and an efficient and reliable network architecture. In addition, a new generation of the Sunway network chipset is designed, and the network of the prototype system is built with it. These results all contribute to the design of the domestic exascale supercomputer. The research meets its goal of innovative design for the exascale system by constructing a large-scale verification system, mastering the techniques of the new interconnection network architecture, and testing on domestic components and parts.
Authors GAO Jian-Gang (高剑刚); LU Hong-Sheng (卢宏生); HE Wang-Quan (何王全); REN Xiu-Jiang (任秀江); CHEN Shu-Ping (陈淑平); SI Tian-Hao (斯添浩); ZHOU Zhou (周舟); HU Shu-Kai (胡舒凯); YU Kang (于康); WEI Di (魏迪) (National Research Center of Parallel Computer Engineering and Technology, Beijing 100190)
Source Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2021, Issue 1, pp. 222-234 (13 pages)
Funding Supported by the National Key Research and Development Program of China (2016YFB0200500)
Keywords multi-rail network; generalized fat-tree topology; high-radix router; routing algorithm; network interface; message engine; message library