Abstract
The computational complexity of AI training has risen sharply year over year: the intelligent computing power required has grown from tens of millions of operations per second to tens of billions, and is now reaching the hundreds of billions, which has driven the construction of large-scale intelligent computing centers. These centers primarily serve the demand for intelligent computing power, and their layout, construction, and maintenance solutions differ significantly from those of traditional cloud resource pools. Operators currently lack unified recommendations and references for the layout and detailed design of intelligent computing centers. This paper analyzes the computing power, storage, and networking challenges brought by the development of large models, and proposes corresponding strategies and solutions for operators' intelligent computing layout, as well as for computing power, storage, networking, and maintenance management.
Authors
童俊杰
申佳
赫罡
张奎
Tong Junjie; Shen Jia; He Gang; Zhang Kui (China United Network Communications Group Co., Ltd., Beijing 100033, China; China Information Technology Designing & Consulting Institute Co., Ltd. Zhengzhou Branch, Zhengzhou 450007, China)
Source
《邮电设计技术》
2024, No. 9, pp. 68-73 (6 pages)
Designing Techniques of Posts and Telecommunications
Keywords
Artificial intelligence
Intelligent computing center
Infrastructure
Construction ideas