Abstract
As the era of artificial intelligence accelerates, diversified computing power such as high-performance computing and AI computing has become the core productive force of today's digital information society, and building a computing power network makes it possible to fully mobilize the resources of distributed computing centers. Huazhong University of Science and Technology has integrated multiple on-campus computing centers to achieve unified scheduling and management of computing resources. This paper describes the hardware and software environment of the cross-domain computing cluster and compares the AI performance of each computing center using the PyTorch framework. The experimental results show that the platform supports cross-domain invocation of AI computing power and that, with certain optimizations applied, the performance bottleneck caused by cross-domain data transmission can be completely resolved.
Authors
ZHANG Ce (张策)
ZHANG Kaizhen (张凯祯)
LONG Tao (龙涛)
Network and Computing Center, Huazhong University of Science and Technology, Wuhan 430074, China
Source
Modern Information Technology (《现代信息科技》)
2024, No. 14, pp. 43-48 (6 pages)
Keywords
artificial intelligence
high-performance computing
cross-domain computing power
PyTorch