Research on Technology and New Capabilities of Communication Networks in the Age of Large Models (original title: 大模型时代通信网络技术及新能力研究)
Abstract: With the rapid development of large models, demand for intelligent computing is growing far faster than chip performance is improving. Computing-cluster solutions and the concept of "DC as a Computer" have emerged as a result, making the data center network particularly important. During large-model training and inference, clusters place extremely high demands on the stability of the network. Based on the characteristics of large-model workloads and drawing on mainstream cluster networking technologies, this paper studies a new generation of intelligent computing center network technology for training scenarios, characterized by ultra-large-scale networking, ultra-high throughput, and ultra-high stability. For inference scenarios, it studies how to build a high-quality network for accessing computing resources by using SDN+SRv6 programmable, integrated compute-network intelligent scheduling and slicing technology. It also examines the technical difficulties of collaborative training across DCs and the corresponding countermeasures.
Authors: Chen Bin (陈斌), Pei Pei (裴培), Xu Peng (许鹏). Affiliations: Intelligent Network & Innovation Center of China Unicom, Beijing 100048, China; China Information Technology Designing & Consulting Institute Co., Ltd., Beijing 100048, China
Source: Designing Techniques of Posts and Telecommunications (《邮电设计技术》), 2024, No. 9, pp. 1-6 (6 pages)
Keywords: WAN; intelligent computing center network; bandwidth pooling; cross-DC collaborative training