Abstract: Multicomputer systems (distributed-memory computer systems) are becoming increasingly popular and will be widely used in scientific research. In this paper, we present a parallel algorithm for the Fourier transform of a vector of complex numbers on a multicomputer system, and we report its computing times and speedup in a parallel environment supported by the EXPRESS system on a multicomputer consisting of four SGI workstations. Our analysis shows that the results are ideal and that the scheme is well suited to multicomputer systems.
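The abstract does not give the algorithm itself, so the following is only a minimal sketch of one common way to split a Fourier transform across four workers: the input is divided into interleaved sub-vectors, each sub-vector is transformed independently, and the partial spectra are combined with twiddle factors. The function names, the use of Python threads in place of the four workstations, and the radix-2 kernel are all assumptions made for illustration; nothing here reproduces the EXPRESS API or the authors' scheme.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def parallel_fft(x, workers=4):
    """Hypothetical decomposition: one interleaved sub-vector per worker."""
    n = len(x)
    m = n // workers  # length of each sub-vector
    if n % workers or m < 1 or m & (m - 1):
        raise ValueError("need len(x) = workers * (power of two)")
    parts = [x[r::workers] for r in range(workers)]
    # Threads stand in for the separate machines of a multicomputer.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        spectra = list(pool.map(fft, parts))
    # Combine step: X[k] = sum over r of W_n^(r*k) * F_r[k mod m].
    return [
        sum(cmath.exp(-2j * cmath.pi * r * k / n) * spectra[r][k % m]
            for r in range(workers))
        for k in range(n)
    ]
```

In CPython the threads will not yield a real speedup for this pure-Python kernel; the sketch only shows which parts of the work are independent and where the communication (the combine step) occurs.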
Funding: Supported by the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20211284; the Financial and Science Technology Plan Project of Xinjiang Production and Construction Corps under Grant No. 2020DB005; and the National Natural Science Foundation of China under Grant Nos. 61872219, 62002276 and 62177014.
Abstract: With the rapid development of deep learning, the size of data sets and deep neural network (DNN) models is also growing rapidly. As a result, the intolerably long time needed for model training or inference with conventional strategies can no longer satisfy modern tasks. Moreover, in edge computing (EC) scenarios some devices stay idle, which wastes resources, since they could share the load of busy devices but do not. To address the problem, distributed processing has been applied to offload computation tasks from a single processor to a group of devices, which accelerates the training or inference of DNN models and promotes high utilization of devices in edge computing. Compared with existing papers, this paper presents an enlightening and novel review of applying distributed processing with data and model parallelism to improve deep learning tasks in edge computing. Considering practicalities, lightweight models commonly used in distributed systems are introduced as well. As the key technique, the parallel strategy is described in detail. Then some typical applications of distributed processing are analyzed. Finally, the challenges of distributed processing with edge computing are described.
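As a rough illustration of the data-parallel half of that strategy, the sketch below splits a batch into equal shards, computes one gradient per shard (the step that separate devices would run concurrently), and averages the results before updating the weight. The one-parameter model, the function names, and the learning rate are hypothetical choices for this example and are not taken from the review.

```python
def gradient(w, shard):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, data, devices=4, lr=0.01):
    """One gradient-descent step with the batch sharded across `devices`."""
    size = len(data) // devices
    shards = [data[i * size:(i + 1) * size] for i in range(devices)]
    # Each entry would be computed on a different device in a real system.
    grads = [gradient(w, shard) for shard in shards]
    # Averaging equal-size shard gradients reproduces the full-batch gradient.
    avg = sum(grads) / devices
    return w - lr * avg
```

Because the shards have equal size, the averaged gradient equals the gradient over the whole batch, so the sharded update matches the single-device update. Model parallelism, which splits the model rather than the data, is not shown.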
Funding: Supported by the Natural Science Foundation of China (No. 60373061).
Abstract: The dynamic distribution model is one of the best schemes for parallel volume rendering. However, in a homogeneous cluster system, since the granularity is traditionally identical, all processors communicate almost simultaneously and the computation load may become unbalanced. To address these problems, a dynamic distribution model with prime granularity for parallel computing is presented. The granularities of the processors are relatively prime, and related theories are introduced. High parallel performance can be achieved by minimizing network competition and using a load-balancing strategy that ensures all processors finish almost simultaneously. Based on the Master-Slave-Gleaner (MSG) scheme, the parallel Splatting algorithm for volume rendering is used to test the model on an IBM Cluster 1350 system. The experimental results show that the model brings a considerable improvement in performance, including computation efficiency, total execution time, speed, and load balancing.
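The intuition behind relatively prime granularities can be checked with a toy model. The sketch below assumes identical processor speeds and one time unit per work item, so a worker with granularity g contacts the master at times g, 2g, 3g, and so on; it then counts how often two or more workers contact the master at the same instant. The granularity values and function names are made up for illustration, and the model ignores the MSG scheme and the theory developed in the paper.

```python
from math import gcd


def pairwise_coprime(granularities):
    """True if every pair of granularities has greatest common divisor 1."""
    return all(gcd(a, b) == 1
               for i, a in enumerate(granularities)
               for b in granularities[i + 1:])


def contention(granularities, horizon):
    """Return (instants with >= 2 simultaneous requests, peak simultaneous requests)."""
    counts = {}
    for g in granularities:
        # A worker asks for new work each time it finishes a chunk of g items.
        for t in range(g, horizon + 1, g):
            counts[t] = counts.get(t, 0) + 1
    clashes = sum(1 for c in counts.values() if c >= 2)
    return clashes, max(counts.values())
```

With identical granularities every request instant is shared by all workers, whereas two workers with coprime granularities a and b can only coincide at multiples of a * b. For example, over 100 time units `contention([8, 8, 8, 8], 100)` gives `(12, 4)`, while `contention([7, 8, 9, 11], 100)` gives `(6, 2)`.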
Funding: Supported in part by the State Grid Science and Technology Program of China (No. 5100-202121561A-0-5-SF).
Abstract: The main goal of distribution network (DN) expansion planning is essentially to achieve minimal investment constrained by specified reliability requirements. The reliability-constrained distribution network planning (RcDNP) problem can be cast as an instance of mixed-integer linear programming (MILP), which involves an ultra-heavy computation burden, especially for large-scale DNs. In this paper, we propose a parallel-computing-based solution method for the RcDNP problem. The RcDNP is decomposed into a backbone grid problem and several lateral grid problems with coordination. Then, a parallelizable augmented Lagrangian algorithm with an acceleration method is developed to solve the coordination planning problems. The lateral grid problems are solved in parallel through coordination with the backbone grid planning problem. Gauss-Seidel iteration is adopted on the subset of the convex hull of the feasible region constructed by decomposition. Under mild conditions, the optimality and convergence of the proposed method are verified. Numerical tests show that the proposed method can significantly reduce the solution time and make the RcDNP applicable to real-world problems.
Abstract: The paper describes the use of the high-level spatial grasp model and technology, which was invented, developed, and tested in different countries and is capable of solving important problems in large social systems that may be represented as dynamic, self-evolving, and distributed social networks. The approach allows us to find important solutions on a holistic level by spatial navigation and parallel pattern matching of social networks with active self-propagating scenarios represented in a special recursive language. This approach effectively hides traditional system management routines inside the distributed and networked language implementation, often providing high-level solution code that is hundreds of times shorter and simpler. The paper highlights the demands of efficient simulation of social systems, briefly describes the technology used, and provides some programming examples for solutions of practical problems.
Abstract: This report presents the design and implementation of a Distributed Data Acquisition, Monitoring and Processing System (DDAMAP). It is assumed that the operations of a factory are organized into two levels: client machines at the plant level collect real-time raw data from sensors and measurement instruments and transfer them to a central processor over Ethernet, and the central processor handles real-time data processing and monitoring. The system utilizes the computation power of an Intel T2300 dual-core processor and parallel computation supported by multi-threading techniques. Our experiments show that these techniques can significantly improve system performance and are viable solutions for real-time high-speed data processing.
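Abstract: To address the long computation time and slow simulation speed of physically based distributed hydrological models when simulating large basins over long time series, GPU-based parallel computing is introduced to parallelize the runoff generation process of the distributed hydrological model WEP-L (water and energy transfer processes in large river basins). The Poyang Lake basin is selected as the study area, and the algorithm's performance is tested on an NVIDIA RTX A4000 with compute capability 8.6. The results show that the proposed GPU-based parallel algorithm for the distributed hydrological model achieves good acceleration: the closer the total number of threads is to the number of delineated sub-basins (the computational workload), the better the parallel performance. With 8712 WEP-L sub-basin units in the study basin, the maximum speedup is about 2.5. The speedup increases gradually as the workload grows, reaching 3.5 when the number of WEP-L sub-basin units increases to 24897, which indicates that GPU parallel algorithms have good potential for distributed hydrological modeling of large-scale basins.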
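The remark about matching the thread count to the number of sub-basins can be made concrete with the usual launch-configuration arithmetic, assuming one GPU thread per sub-basin unit. The helper below is a hypothetical illustration written in Python rather than CUDA, and the block size of 128 threads is an assumed value that does not come from the study.

```python
def launch_config(n_units, threads_per_block=128):
    """Return (blocks, total threads, idle threads) for one thread per unit."""
    # Round up so that every sub-basin unit is covered by a thread.
    blocks = (n_units + threads_per_block - 1) // threads_per_block
    total = blocks * threads_per_block
    return blocks, total, total - n_units
```

Under these assumptions, 8712 units need 69 blocks (8832 threads, 120 idle) and 24897 units need 195 blocks (24960 threads, 63 idle); threads beyond the unit count do no useful work, which is one reason to keep the total close to the number of sub-basins.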
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 61976242; in part by the Natural Science Fund of Hebei Province for Distinguished Young Scholars under Grant No. F2021202010; in part by the Fundamental Scientific Research Funds for Interdisciplinary Team of Hebei University of Technology under Grant No. JBKYTD2002; by the Science and Technology Project of Hebei Education Department under Grant No. JZX2023007; and by the 2022 Interdisciplinary Postgraduate Training Program of Hebei University of Technology under Grant No. HEBUT-YXKJC-2022122.
Abstract: Most neural network architectures are based on human experience, which requires a long and tedious trial-and-error process. Neural architecture search (NAS) attempts to find effective architectures without human intervention. Evolutionary algorithms (EAs) for NAS can find better solutions than human-designed architectures by exploring a large search space of possible architectures. Using multiobjective EAs for NAS, optimal neural architectures that meet various performance criteria can be explored and discovered efficiently. Furthermore, hardware-accelerated NAS methods can improve the efficiency of NAS. While existing reviews have mainly focused on different strategies for completing NAS, few studies have explored the use of EAs for NAS. In this paper, we summarize and explore the use of EAs for NAS, as well as large-scale multiobjective optimization strategies and hardware-accelerated NAS methods. NAS performs well in healthcare applications, such as medical image analysis, classification of disease diagnosis, and health monitoring. EAs for NAS can automate the search process and optimize multiple objectives simultaneously in a given healthcare task. Deep neural networks have been successfully used in healthcare, but they lack interpretability. Medical data is highly sensitive, and privacy leaks are frequently reported in the healthcare industry. To solve these problems in healthcare, we propose an interpretable neuroevolution framework based on federated learning to address search efficiency and privacy protection. Moreover, we point out future research directions for evolutionary NAS. Overall, for researchers who want to use EAs to optimize NNs in healthcare, we analyze the advantages and disadvantages of doing so to provide detailed guidance, and we propose an interpretable privacy-preserving framework for healthcare applications.
Abstract: In this paper, an attempt to employ network resources to solve a complex and time-consuming problem is presented. The global illumination problem is selected as the object of study. An improved density estimation algorithm is first developed, in which more of the inherent concurrency is exploited. Then its parallel implementation using the PVM mechanism and an analysis of its running performance are provided. The analysis results show that the expected speedup is obtained and demonstrate that PVM has good application prospects for parallel computation in a distributed network.