Abstract
With the advent of ChatGPT, products and services built on large AI models have grown explosively, and domestic Internet companies, telecom operators, and others have joined the "hundred-model war". The intelligent computing servers in existing cloud platforms use conventional GPU configurations, which cannot meet the high-performance computing power and inter-card interconnect bandwidth that large-model training requires. This paper studies intelligent computing chip technology in the mezzanine-card module form factor, together with cooling technology for complete servers, to provide a selection basis for the subsequent construction of intelligent computing centers.
Authors
张鹏飞
田雯
武振宇
贾凡
刘长瑞
施子墨
ZHANG Peng-fei; TIAN Wen; WU Zhen-yu; JIA Fan; LIU Chang-rui; SHI Zi-mo (China Mobile Group Co., Ltd., Beijing 100033, China; China Mobile Group Design Institute Co., Ltd., Beijing 100080, China)
Source
《电信工程技术与标准化》
2024, No. 1, pp. 3-7 (5 pages)
Telecom Engineering Technics and Standardization
Keywords
intelligent computing center
model training
mezzanine-card module form factor
liquid cooling