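Abstract: Large AI models are becoming the leading development focus of the information and communications technology (ICT) industry for the next decade. The intelligent computing center network is the communication foundation that supports distributed training of large AI models, and it is one of the key factors determining AI cluster efficiency. The continuing growth in the data volume and parameter count of large AI models poses severe challenges for intelligent computing center networks, while also creating opportunities for generational innovation in key network technologies. During the training and inference of large AI models, high-performance and highly secure data transmission are the two core requirements that AI workloads place on the intelligent computing center network. Efficient load balancing, congestion control, and network security protocols are the key network technologies involved. To address the severe challenges posed by large-scale AI workloads, global scheduled Ethernet (GSE) is proposed as a solution, and a real test environment is built to compare the performance of GSE and RoCE (remote direct memory access over converged Ethernet) networks. The test results show that GSE significantly improves job completion time (JCT) compared with the RoCE network.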
Funding: the National Science Foundation of China (No. U2066601 and No. 51725703).
Abstract: As massive distributed energy resources (DERs) are integrated into distribution networks (DNs) and distribution automation facilities are widely deployed, DNs are evolving into active distribution networks (ADNs). This paper introduces the architecture and main function modules of an integrated distribution management system (IDMS) and its applications in China. The system consists of three subsystems: the real-time operation and control system (OCS), the outage management system (OMS), and the operator training simulator (OTS). The OCS has a three-level hierarchical architecture comprising the local controller for DER clusters, the optimization of DNs incorporating multiple clusters, and the coordinated operation of integrated transmission and distribution (T&D) networks. The OMS is built on the geographical information system (GIS) and coordinated with the OCS. In the OTS, both the ADN and its host transmission network (TN) are simulated to make the simulation results more credible. The main functions of the three subsystems and their interaction data flows are described, and some typical application scenarios are presented.