8 articles found
Edge-Federated Self-Supervised Communication Optimization Framework Based on Sparsification and Quantization Compression
1
Author: Yifei Ding. Journal of Computer and Communications, 2024, No. 5, pp. 140-150, 11 pages.
The federated self-supervised framework is a distributed machine learning method that combines federated learning and self-supervised learning, addressing the difficulty traditional federated learning has in processing large-scale unlabeled data. Existing federated self-supervised frameworks suffer from low communication efficiency and high communication delay between clients and the central server. We therefore add edge servers to the framework to relieve the pressure that frequent communication between the two ends places on the central server. A communication compression scheme using gradient quantization and sparsification is proposed to optimize communication across the whole framework, and the algorithm of the sparse communication compression module is improved. Experiments show that learning rate changes under the improved sparse communication compression module are smoother and more stable, and that the compression scheme effectively reduces overall communication overhead.
Keywords: communication optimization; federated self-supervision; sparsification; gradient compression; edge computing
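The abstract above names gradient sparsification and quantization but does not give the algorithm, so the following is only a generic sketch of the two ideas, not the paper's improved module. It assumes top-k magnitude sparsification followed by uniform symmetric quantization; the function names, the choice of `k`, and the 8-bit width are all illustrative assumptions.

```python
def sparsify_top_k(grad, k):
    """Keep the k largest-magnitude entries; return (indices, values)."""
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    idx.sort()  # ascending indices make the message easier to decode
    return idx, [grad[i] for i in idx]


def quantize(values, bits=8):
    """Uniform symmetric quantization to signed integers with one shared scale."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def decompress(idx, codes, scale, size):
    """Rebuild a dense gradient; entries that were dropped stay at zero."""
    out = [0.0] * size
    for i, c in zip(idx, codes):
        out[i] = c * scale
    return out


# Hypothetical gradient vector, for illustration only.
grad = [0.02, -1.5, 0.3, 0.0, 2.0, -0.01]
idx, vals = sparsify_top_k(grad, k=2)       # what a client would send
codes, scale = quantize(vals, bits=8)
restored = decompress(idx, codes, scale, len(grad))  # what a server would rebuild
```

In this sketch a client transmits only `idx`, `codes`, and `scale` instead of the dense gradient, which is where the communication saving would come from; the reconstruction error per kept entry is bounded by the quantization step `scale`.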
Parallel communication optimization based on least square method for LBM
2
Authors: 曹啸鹏, 武频, 尚伟烈, 郑德群. Journal of Shanghai University (English Edition) (indexed in CAS), 2011, No. 5, pp. 415-419, 5 pages.
Efficient communication is important to every parallel algorithm. A parallel communication optimization is introduced into the lattice Boltzmann method (LBM). It relies on a simplified communication strategy implemented with the least squares method. Tests on a parallel platform show that, compared with the standard parallel lattice Boltzmann algorithm, the improved algorithm provides better stability and higher performance while maintaining the same accuracy.
Keywords: lattice Boltzmann method (LBM); parallel computing; communication optimization
Plug-in and plug-out dispatch optimization in microgrid clusters based on flexible communication (cited by: 6)
3
Authors: Jie YU, Ming NI, Yiping JIAO, Xiaolong WANG. Journal of Modern Power Systems and Clean Energy (indexed in SCIE, EI), 2017, No. 4, pp. 663-670, 8 pages.
With the large-scale development of distributed generation (DG) and its potential role in microgrids, the microgrid cluster (MGC) becomes a useful control model to assist the integration of DG. Considering that microgrids in a MGC, power dispatch optimization in a MGC is difficult to achieve. In this paper, a hybrid interactive communication optimization solution (HICOS) based on flexible communication is proposed, which can handle the plug-in and plug-out operation states of microgrids in MGC power dispatch optimization. HICOS has a hierarchical architecture: the upper layer uses distributed control among multiple microgrids, with no central controller for the MGC, and the lower layer uses a central controller for each microgrid. Iterative optimization information is exchanged among microgrids over flexible communication links, so that HICOS gradually converges to the global optimal solution. When some microgrids plug in or plug out, the communication links change, which can prevent the optimal solution from being reached. Unlike the fixed communication links of traditional communication networks, HICOS redefines the topology of the flexible communication links so that the global optimal solution can still be reached. Simulation studies show that HICOS effectively reaches the global optimal dispatch solution without an MGC center. In particular, when microgrids plug in or plug out, HICOS still reaches the global optimal solution based on the redefined communication link topology.
Keywords: microgrid cluster (MGC); plug-in/plug-out; dispatch optimization; hybrid interactive communication optimization solution (HICOS); flexible communication
FDGLib: A Communication Library for Efficient Large-Scale Graph Processing in FPGA-Accelerated Data Centers
4
Authors: Yu-Wei Wu, Qing-Gang Wang, Long Zheng, Xiao-Fei Liao, Hai Jin, Wen-Bin Jiang, Ran Zheng, Kan Hu. Journal of Computer Science & Technology (indexed in SCIE, EI, CSCD), 2021, No. 5, pp. 1051-1070, 20 pages.
With the rapid growth of real-world graphs, whose size can easily exceed the on-chip (board) storage capacity of an accelerator, processing large-scale graphs on a single Field Programmable Gate Array (FPGA) becomes difficult, which makes multi-FPGA acceleration necessary. Many cloud providers (e.g., Amazon, Microsoft, and Baidu) now expose FPGAs to users in their data centers, providing opportunities to accelerate large-scale graph processing. In this paper, we present a communication library, called FDGLib, which can easily scale out any existing single FPGA-based graph accelerator to a distributed version in a data center with minimal hardware engineering effort. FDGLib provides six APIs that can be integrated into any FPGA-based graph accelerator with only a few lines of code modification. Considering the torus-based FPGA interconnection in data centers, FDGLib also improves communication efficiency using simple yet effective torus-friendly graph partition and placement schemes. We interface FDGLib into AccuGraph, a state-of-the-art graph accelerator. Our results on a 32-node Microsoft Catapult-like data center show that the distributed AccuGraph can be 2.32x and 4.77x faster than ForeGraph, a state-of-the-art distributed FPGA-based graph accelerator, and Gemini, a distributed CPU-based graph system, respectively, with better scalability.
Keywords: data center; accelerator; graph processing; distributed architecture; communication optimization
Interpolation oriented parallel communication to optimize coupling in earth system modeling
5
Authors: Yingsheng JI, Yingzhuo ZHANG, Guangwen YANG. Frontiers of Computer Science (indexed in SCIE, EI, CSCD), 2014, No. 4, pp. 693-708, 16 pages.
Complicated global climate problems lead researchers from different scientific disciplines to link multiphysics simulations, called models, for integrated modeling of climate change using a software framework called earth system modeling (ESM). As a critical component of ESM, the coupler is in charge of connections and interactions among models. With the advance of next-generation models, greater data transfer volume and higher coupling frequency are expected to put a heavy performance burden on the coupler, so highly efficient coupling techniques are required. In this paper, we propose the sub-domain mapping method to improve parallel coupling, which consists of data transfer and data transformation. By using one specific interpolation-oriented communication routing, the communication operations that were originally scattered across various steps can be combined for execution. This reduces redundant communication and the synchronization costs it entails. Tests on the Tianhe-1A (TH-1A) supercomputer show that our method achieves 1.1- to 4.9-fold performance improvements. We also present a further optimization for multi-interpolation cases. The test results show that our method achieves up to 3.4-fold speedup over the original coupling execution of the current climate system.
Keywords: coupler; communication optimization; coupling performance; ESM
Communication Optimization for SMP Clusters
6
Authors: 林伟坚, 陈文光, 李志光, 郑纬民. Tsinghua Science and Technology (indexed in SCIE, EI, CAS), 2001, No. 1, pp. 18-23, 41, 7 pages.
Shared Memory Processors (SMP) workstation clusters are becoming more and more popular. To optimize communication between the workstations, a new graph partition problem was formulated for scheduling tasks in SMP clusters. The problem is NP-complete, and a heuristic algorithm was developed based on Lee, Kim and Park's algorithm. Experimental results indicate that our algorithm outperforms theirs, especially when the number of partitions is large. The algorithm can be integrated into a parallelizing compiler as a back-end optimizer for the distributed code generator.
Keywords: SMP cluster; communication optimization; task scheduling
Task scheduling of parallel programs to optimize communications for cluster of SMPs
7
Authors: 郑纬民, 杨博, 林伟坚, 李志光. Science in China (Series F), 2001, No. 3, pp. 213-225, 13 pages.
This paper discusses compile-time task scheduling of parallel programs running on clusters of SMP workstations. First, the problem is stated formally, transformed into a graph partition problem, and proved to be NP-complete. A heuristic algorithm, MMP-Solver, is then proposed to solve the problem. Experimental results show that the task scheduling greatly reduces the communication overhead of parallel applications and that MMP-Solver outperforms existing algorithms.
Keywords: SMP; cluster of workstations; communication optimization; task scheduling; graph partition; parallelizing compiler
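The abstract above casts task scheduling as a graph partition problem without giving the formulation, so the sketch below shows only the generic objective such formulations usually minimize: the total weight of task-graph edges that cross machine boundaries. It is not MMP-Solver; the task names, edge weights, and the `cut_cost` helper are hypothetical.

```python
def cut_cost(edges, assignment):
    """Sum the weights of edges whose two tasks are placed on different nodes."""
    return sum(w for u, v, w in edges if assignment[u] != assignment[v])


# Hypothetical task graph: (task, task, communication volume).
edges = [("a", "b", 5), ("b", "c", 1), ("c", "d", 4), ("a", "d", 2)]

# Two candidate placements of the four tasks onto two SMP nodes (0 and 1).
grouped = {"a": 0, "b": 0, "c": 1, "d": 1}    # heavy edges stay inside a node
scattered = {"a": 0, "b": 1, "c": 0, "d": 1}  # every edge crosses nodes

grouped_cost = cut_cost(edges, grouped)
scattered_cost = cut_cost(edges, scattered)
```

Here the grouped placement keeps the two heaviest edges inside a node, so only the lighter edges contribute to inter-node traffic. A heuristic partitioner would search over placements for a low cut cost, typically under a balance constraint that this sketch omits.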
Crystal-KMC: Parallel Software for Lattice Dynamics Monte Carlo Simulation of Metal Materials (cited by: 2)
8
Authors: Jianjiang Li, Peng Wei, Shaofeng Yang, Jie Wu, Peng Liu, Xinfu He. Tsinghua Science and Technology (indexed in SCIE, EI, CAS, CSCD), 2018, No. 4, pp. 501-510, 10 pages.
Kinetic Monte Carlo (KMC) is a widely used method for studying the evolution of materials at the microscopic level. Although many simulation programs are based on this algorithm, most focus on verifying a particular phenomenon and have no simulation-scale requirement, so many are serial in nature. The dynamic Monte Carlo algorithm is implemented in a parallel framework called SPPARKS, but it does not support the Embedded Atom Method (EAM) potential, which is commonly used in the dynamic simulation of metal materials. Metal materials, the preferred material for most containers and components, play an important role in many fields, including construction engineering and transportation. In this paper, we propose and describe the development of a parallel software program called Crystal-KMC, which is designed specifically to simulate the lattice dynamics of metallic materials. The software uses MPI to achieve a parallel multiprocessing mode, which avoids the simulation-scale limitations of serial software. Finally, we describe the use of the parallel KMC simulation software Crystal-KMC in simulating the diffusion of vacancies in iron, and analyze the experimental results. In addition, we tested the performance of Crystal-KMC on the "meta-Era" supercomputing clusters, and the results show that the Crystal-KMC parallel software has good parallel speedup and scalability.
Keywords: Kinetic Monte Carlo (KMC); communication optimization; parallel computation; Message Passing Interface (MPI)