
Adaptive Load Balancing for Parameter Servers in Distributed Machine Learning over Heterogeneous Networks (Cited by: 1)

Abstract: In distributed machine learning (DML) based on the parameter server (PS) architecture, an unbalanced communication load distribution across PSs leads to a significant slowdown of model synchronization in heterogeneous networks due to low bandwidth utilization. To address this problem, a network-aware adaptive PS load distribution scheme is proposed, which accelerates model synchronization by proactively adjusting the communication load on PSs according to network states. We evaluate the proposed scheme on MXNet, a real-world distributed training platform, and the results show that our scheme achieves up to a 2.68-times speed-up of model training in dynamic and heterogeneous network environments.
Source: ZTE Communications (中兴通讯技术(英文版)), 2023, Issue 1, pp. 72-80 (9 pages)
Funding: Partially supported by the Computing Power Networks and New Communication Primitives Project under Grant No. HC-CN-2020120001, the National Natural Science Foundation of China under Grant No. 62102066, and the Open Research Projects of Zhejiang Lab under Grant No. 2022QA0AB02.
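
The abstract only describes the scheme at a high level. As an illustration of what bandwidth-proportional PS load assignment could look like, the following is a minimal Python sketch: parameter shards are placed greedily on the server with the lowest estimated transfer time given measured bandwidth. The function names, the greedy heuristic, and the example numbers are assumptions for illustration, not the paper's actual algorithm.

```python
# Hypothetical sketch: assign parameter shards to parameter servers (PSs)
# in proportion to each server's measured bandwidth, so faster links carry
# more of the model-synchronization traffic. This is an illustrative
# heuristic, not the algorithm proposed in the paper.

from typing import Dict, List


def assign_shards(shard_sizes: List[int],
                  bandwidth_mbps: Dict[str, float]) -> Dict[str, List[int]]:
    """Greedily place each shard on the PS with the lowest estimated
    completion time, given what that PS has already been assigned."""
    load = {ps: 0.0 for ps in bandwidth_mbps}       # total size assigned so far
    placement = {ps: [] for ps in bandwidth_mbps}   # shard ids per PS

    # Place the largest shards first so smaller ones can fill the gaps.
    for shard_id in sorted(range(len(shard_sizes)),
                           key=lambda i: shard_sizes[i], reverse=True):
        size = shard_sizes[shard_id]
        # Estimated completion time = (existing load + new shard) / bandwidth.
        best_ps = min(bandwidth_mbps,
                      key=lambda ps: (load[ps] + size) / bandwidth_mbps[ps])
        placement[best_ps].append(shard_id)
        load[best_ps] += size
    return placement


if __name__ == "__main__":
    # Example: three PSs with heterogeneous bandwidth, ten equal-size shards.
    shards = [64] * 10                               # relative shard sizes
    bw = {"ps0": 10_000.0, "ps1": 2_500.0, "ps2": 2_500.0}
    print(assign_shards(shards, bw))                 # ps0 receives most shards
```

Re-running such an assignment whenever the measured bandwidth changes is one simple way to make the load distribution adaptive to network dynamics, which is the behavior the abstract attributes to the proposed scheme.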