Abstract: The federated self-supervised framework is a distributed machine learning approach that combines federated learning with self-supervised learning, effectively addressing the difficulty traditional federated learning faces in processing large-scale unlabeled data. Existing federated self-supervised frameworks, however, suffer from low communication efficiency and high communication latency between clients and the central server. We therefore add edge servers to the framework to relieve the pressure that frequent communication between the two ends places on the central server. We further propose a communication compression scheme based on gradient quantization and sparsification to optimize communication across the whole framework, and improve the algorithm of the sparse communication compression module. Experiments show that the learning-rate changes of the improved sparse communication compression module are smoother and more stable, and that our compression scheme effectively reduces the overall communication overhead.
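The abstract names gradient quantization and sparsification but does not give the exact algorithms. A minimal sketch of the general idea follows: each client keeps only the top-k gradient entries and uniformly quantizes the surviving values before transmission. All function names and parameters here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def compress_gradient(grad, k_ratio=0.01, levels=256):
    """Sparsify a gradient to its top-k entries by magnitude, then uniformly
    quantize the surviving values. Returns (indices, int16 codes, scale) so a
    server can rebuild an approximate gradient from far fewer bits."""
    flat = grad.ravel()
    k = max(1, int(flat.size * k_ratio))
    # Top-k sparsification: keep only the k largest-magnitude entries.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    vals = flat[idx]
    # Uniform quantization of the kept values into `levels` buckets.
    scale = float(np.abs(vals).max()) or 1.0
    codes = np.round(vals / scale * (levels // 2 - 1)).astype(np.int16)
    return idx, codes, scale

def decompress_gradient(idx, codes, scale, shape, levels=256):
    """Server-side reconstruction of the sparse, quantized gradient."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = codes.astype(np.float64) / (levels // 2 - 1) * scale
    return flat.reshape(shape)
```

With `k_ratio=0.01` and 8-bit codes, each update shrinks to roughly 1% of its entries at a quarter of the bits each; the per-entry quantization error is bounded by `scale / (levels - 2)`.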
Funding: Funded by the State Grid Corporation of China project "Cooperative Simulation of Power Grid and Communication Grid", the National Natural Science Foundation of China (Grant No. 51407030), and the China Postdoctoral Science Foundation (Grant No. 121809).
Abstract: With the large-scale development of distributed generation (DG) and its potential role in microgrids, the microgrid cluster (MGC) has become a useful control model to assist the integration of DG. Because microgrids in an MGC may plug in or plug out, power dispatch optimization across the cluster is difficult to achieve. In this paper, a hybrid interactive communication optimization solution (HICOS) based on flexible communication is proposed to handle plug-in and plug-out operating states of microgrids in MGC power dispatch optimization. HICOS has a hierarchical architecture: the upper layer uses distributed control among multiple microgrids, with no central controller for the MGC, while the lower layer uses a central controller within each microgrid. Over the flexible communication links among microgrids, optimal iteration information is exchanged, so HICOS gradually converges to the globally optimal solution. When some microgrids plug in or out, the communication links change, which can prevent convergence to the optimal solution. Unlike the fixed links of traditional communication networks, HICOS redefines the topology of the flexible communication links to preserve convergence to the global optimum. Simulation studies show that HICOS effectively reaches the globally optimal dispatch solution without an MGC center; in particular, under microgrid plug-in or plug-out states, HICOS still reaches the global optimum based on the refined communication link topology.
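The abstract does not specify HICOS's iteration, but distributed dispatch without a cluster-level controller is commonly built on consensus updates over whatever links currently exist. The sketch below is a generic Metropolis-weighted consensus over a time-varying topology, standing in for the idea that information exchanged over flexible links drives all agents to a common (here, average) value; it is an illustration of the architecture, not the paper's algorithm.

```python
import numpy as np

def consensus_step(x, adjacency):
    """One Metropolis-weighted consensus update over the current link topology.
    The weights keep the update doubly stochastic, so the network average is
    preserved while local values move toward agreement."""
    n = len(x)
    deg = adjacency.sum(axis=1)
    x_new = x.copy()
    for i in range(n):
        for j in range(n):
            if adjacency[i, j]:
                w = 1.0 / (1 + max(deg[i], deg[j]))
                x_new[i] += w * (x[j] - x[i])
    return x_new

def run_consensus(x0, topology_fn, steps=200):
    """Iterate consensus; topology_fn(t) returns the adjacency matrix at step t,
    so links may appear or disappear as microgrids plug in or out."""
    x = np.asarray(x0, dtype=float)
    for t in range(steps):
        x = consensus_step(x, topology_fn(t))
    return x
```

As long as the changing link graphs stay jointly connected over time, every node converges to the network-wide average without any central coordinator, which mirrors the abstract's claim that redefining the link topology preserves convergence.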
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018YFB1003502, and the National Natural Science Foundation of China under Grant Nos. 62072195, 61825202, 61832006, and 61628204.
Abstract: With the rapid growth of real-world graphs, whose size can easily exceed the on-chip (on-board) storage capacity of an accelerator, processing large-scale graphs on a single Field Programmable Gate Array (FPGA) becomes difficult, making multi-FPGA acceleration necessary. Many cloud providers (e.g., Amazon, Microsoft, and Baidu) now expose FPGAs to users in their data centers, providing opportunities to accelerate large-scale graph processing. In this paper, we present a communication library, called FDGLib, which can easily scale out any existing single-FPGA graph accelerator to a distributed version in a data center, with minimal hardware engineering effort. FDGLib provides six APIs that can be easily used and integrated into any FPGA-based graph accelerator with only a few lines of code modification. Considering the torus-based FPGA interconnection in data centers, FDGLib also improves communication efficiency using simple yet effective torus-friendly graph partition and placement schemes. We interface FDGLib with AccuGraph, a state-of-the-art graph accelerator. Our results on a 32-node Microsoft Catapult-like data center show that the distributed AccuGraph can be 2.32x and 4.77x faster than ForeGraph, a state-of-the-art distributed FPGA-based graph accelerator, and Gemini, a distributed CPU-based graph system, with better scalability.
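The abstract's torus-friendly placement rests on a simple property: on a torus, links wrap around at the edges, so the hop distance between two FPGAs accounts for wraparound in each dimension. The sketch below shows that distance and the placement cost a torus-aware scheme would minimize; the function names are illustrative and FDGLib's actual APIs are not reproduced here.

```python
def torus_hops(a, b, rows, cols):
    """Minimum hop count between FPGA nodes a and b on a rows x cols 2D torus,
    where node k sits at grid position divmod(k, cols) and every row/column
    link wraps around at the edge."""
    ra, ca = divmod(a, cols)
    rb, cb = divmod(b, cols)
    dr = abs(ra - rb)
    dc = abs(ca - cb)
    return min(dr, rows - dr) + min(dc, cols - dc)

def placement_cost(assign, edges, rows, cols):
    """Total inter-FPGA hop count of a partition-to-node assignment: the sum,
    over cut edges, of torus distance between the endpoints' FPGAs. This is
    the quantity a torus-friendly partition and placement tries to minimize."""
    return sum(torus_hops(assign[u], assign[v], rows, cols)
               for u, v in edges if assign[u] != assign[v])
```

Because opposite edges of the grid are one hop apart on a torus, a placement that treats the topology as a flat mesh systematically overestimates some distances and can map heavily communicating partitions several hops apart unnecessarily.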
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 69933020) and the "973" Program (Grant No. G1999032702).
Abstract: This paper discusses compile-time task scheduling of parallel programs running on a cluster of SMP workstations. The problem is first stated formally, transformed into a graph partition problem, and proved to be NP-complete. A heuristic algorithm, MMP-Solver, is then proposed to solve it. Experimental results show that the task scheduling greatly reduces the communication overhead of parallel applications and that MMP-Solver outperforms existing algorithms.
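In the graph-partition formulation above, tasks are vertices, communication volumes are edge weights, and a schedule is an assignment of tasks to processors that keeps heavily communicating tasks together. MMP-Solver's actual heuristic is not reproduced here; the sketch below is a deliberately simple greedy stand-in for that formulation, assuming tasks fit (n_tasks <= n_procs * capacity).

```python
def greedy_partition(n_tasks, comm, n_procs, capacity):
    """Greedy heuristic for the graph-partition view of task scheduling:
    place each task on the processor where it communicates most with
    already-placed tasks, subject to a per-processor task capacity.
    comm is a dict {(u, v): volume}; unassigned tasks are marked -1.
    (Illustrative only; not the MMP-Solver algorithm.)"""
    assign = [-1] * n_tasks
    load = [0] * n_procs
    for t in range(n_tasks):
        best_p, best_gain = -1, -1
        for p in range(n_procs):
            if load[p] >= capacity:
                continue  # processor full
            # Communication volume between task t and tasks already on p.
            gain = sum(vol for (u, v), vol in comm.items()
                       if (u == t and assign[v] == p)
                       or (v == t and assign[u] == p))
            if gain > best_gain:
                best_p, best_gain = p, gain
        assign[t] = best_p
        load[best_p] += 1
    return assign
```

Edges kept inside one processor cost nothing at run time; only cut edges become inter-node messages, which is exactly the overhead the paper's scheduling aims to reduce.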