This study explores the application of parallel algorithms to large-scale sorting, focusing on QuickSort. The algorithm is implemented in both sequential and parallel forms, and the two are compared in detail. Array generation and pivot selection are examined as factors in handling datasets of varying sizes. Timings taken with the C++ chrono library show 16,499.2 milliseconds for the serial implementation and 16,339 milliseconds for the parallel implementation when sorting an array. These results suggest that the parallel approach offers little gain over its serial counterpart on smaller datasets, and that the benefit is expected to grow as the dataset size increases.
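The serial-versus-parallel comparison described in the QuickSort abstract above can be sketched as follows. This is only an illustration, not the paper's implementation: it is written in Python rather than C++, it uses `time.perf_counter` in place of the chrono library, and the median-of-three pivot, the single top-level split into two threads, and the names `quicksort`, `parallel_quicksort`, and `timed_ms` are all assumptions made for the example.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def partition(a):
    """Split a non-empty list around a median-of-three pivot (an assumed pivot rule)."""
    pivot = sorted((a[0], a[len(a) // 2], a[-1]))[1]
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return less, equal, greater


def quicksort(a):
    """Sequential QuickSort; returns a new sorted list."""
    if len(a) <= 1:
        return list(a)
    less, equal, greater = partition(a)
    return quicksort(less) + equal + quicksort(greater)


def parallel_quicksort(a, workers=2):
    """Partition once, then sort the two sides in separate threads."""
    if len(a) <= 1:
        return list(a)
    less, equal, greater = partition(a)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        left = pool.submit(quicksort, less)
        right = pool.submit(quicksort, greater)
        return left.result() + equal + right.result()


def timed_ms(fn, data):
    """Run fn(data) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(data)
    return result, (time.perf_counter() - start) * 1000.0


random.seed(0)
data = [random.randint(0, 10**6) for _ in range(50_000)]
serial_sorted, serial_ms = timed_ms(quicksort, data)
parallel_sorted, parallel_ms = timed_ms(parallel_quicksort, data)
print(f"serial: {serial_ms:.1f} ms, parallel: {parallel_ms:.1f} ms")
```

In CPython the two threads share the interpreter lock, so this sketch should not be expected to run faster than the serial version; it only shows the structure of the comparison. Measured times depend on the machine and are not comparable to the figures quoted in the abstract.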
Meta-heuristic algorithms are global probabilistic search algorithms that solve problems iteratively. They perform well in global optimization tasks such as maximization. In this paper, a new adaptive parameter strategy and a parallel communication strategy are proposed to further improve the Cuckoo Search (CS) algorithm. These strategies greatly improve the convergence speed and accuracy of the algorithm and strengthen its ability to escape local optima. The paper compares the optimization performance of Parallel Adaptive Cuckoo Search (PACS) with CS, Parallel Cuckoo Search (PCS), Particle Swarm Optimization (PSO), Sine Cosine Algorithm (SCA), Grey Wolf Optimizer (GWO), Whale Optimization Algorithm (WOA), Differential Evolution (DE), and Artificial Bee Colony (ABC) on the CEC-2013 test functions. The results show that PACS outperforms the other algorithms on 20 of the 28 test functions. Given this performance, PACS is applied to the rectangular layout problem. Experimental results show a significant effect: the material utilization rate improves from 89.5% to 97.8% after optimization.
A distribution network plays an extremely important role in the safe and efficient operation of a power grid. As the core part of a power grid's operation, the distribution network has a significant impact on the safety and reliability of residential electricity consumption. It is therefore necessary to actively plan and modify the structure of the distribution network, improve its quality, and optimize its planning so that the network can be fully utilized to meet electricity demand. In this paper, a distribution network grid planning algorithm based on the reliability of electricity consumption was developed using the ant colony algorithm. For the structure planning of a distribution network with dual power sources, the parallel ant colony algorithm was used to show that interaction between ant colonies is the premise of parallelism, and a dual-power distribution network structure model was established on the principle of lowest cost. The artificial ants in the algorithm were compared with real ants in nature, and the basic steps and working principle of the ant colony optimization algorithm were studied with the help of the travelling salesman problem (TSP). The limitations of the ant colony algorithm were then analyzed, and an improvement strategy was proposed and simulated numerically in Python. The results demonstrated the reliability of the model and of the improved algorithm.
The method of establishing data structures plays an important role in the efficiency of the parallel multilevel fast multipole algorithm (PMLFMA). Considering the main components of the memory used by the multilevel fast multipole algorithm (MLFMA), a new parallelization strategy and a modified octree construction scheme are proposed to further reduce communication and thereby improve parallel efficiency. For far interactions, a new scheme called dynamic memory allocation is developed. To analyze the workload balancing of a parallel implementation, the concept of a workload balancing factor is introduced and verified by numerical examples. Numerical results show that these measures improve parallel efficiency and are suitable for the analysis of electrically large scattering objects.
This paper considers adaptive control of parallel manipulators combined with fuzzy-neural network algorithms (FNNA). With this algorithm, robustness is guaranteed by the adaptive control law and the parametric uncertainties are eliminated. FNNA is used to handle model uncertainties and external disturbances. In the proposed control scheme, we modify the weights of the fuzzy rules and apply these rules to a MIMO system of parallel manipulators with more than three degrees of freedom (DoF). The algorithm has the advantage of not requiring the inverse of the Jacobian matrix, especially for low-DoF parallel manipulators. The validity of the control scheme is shown through numerical simulations of a 6-RPS parallel manipulator with three DoF.
Dimensional synthesis is one of the most difficult issues in the field of parallel robots with actuation redundancy. To address the optimal design of a redundantly actuated parallel robot used for ankle rehabilitation, a methodology of dimensional synthesis based on multi-objective optimization is presented. First, the dimensional synthesis of the redundant parallel robot is formulated as a nonlinear constrained multi-objective optimization problem. Then four objective functions, separately reflecting occupied space, input/output transmission, and torque performance, are defined together with multi-criteria constraints such as dimension, interference, and kinematics. Because the passive exercise of plantar/dorsiflexion requires a large output moment, a torque index is proposed. To cope with the actuation redundancy of the parallel robot, a new output transmission index is defined as well. The multi-objective optimization problem is solved with a modified Differential Evolution (DE) algorithm characterized by new selection and mutation strategies, and a special penalty method is presented to handle the multi-criteria constraints. Finally, numerical experiments with different optimization algorithms are carried out. The computational results show that the proposed output transmission and torque indices and the constraint handling are effective for the redundant parallel robot, and that the modified DE algorithm is superior to the other tested algorithms in global search ability and in the number of non-dominated solutions. The proposed methodology can also be applied to the dimensional synthesis of other redundantly actuated parallel robots with only rotational movements.
In this paper, it is supposed that the branch-and-bound (B&B) algorithm finds the first optimal solution after h nodes have been expanded and m active nodes have been created in the state-space tree. Under this assumption, the lower bound Ω(m + h log h) on the running time of the general sequential B&B algorithm and the lower bound Ω(m/p + h log p) for the general parallel best-first B&B algorithm on the PRAM-CREW model are proposed, where p is the number of processors available. Moreover, the lower bound Ω(M/p + H + (H/p) log(H/p)) is presented for parallel algorithms on distributed-memory systems, where M and H are the total numbers of active nodes and expanded nodes processed by the p processors, respectively. In addition, a nearly fastest general parallel best-first B&B algorithm is put forward. The parallel algorithm is the fastest one when p = max{h^ε, r}, where ε = 1/√(log h) and r is the largest branching number of the nodes in the state-space tree.
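To make the quantities h and m in the B&B abstract above concrete, here is a minimal sequential best-first B&B sketch. The 0/1 knapsack problem, the fractional-relaxation bound, and the function name `knapsack_best_first` are assumptions chosen for illustration; the abstract concerns general B&B and does not describe this instance. Active nodes are kept in a binary heap, so each insertion or removal costs time logarithmic in the number of active nodes.

```python
import heapq


def knapsack_best_first(values, weights, capacity):
    """Best-first B&B for 0/1 knapsack (weights must be positive).

    Returns (best_value, expanded, created), where expanded counts nodes
    whose children were generated and created counts nodes put on the heap.
    """
    # Sort by value density so the greedy fractional fill is a valid upper bound.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    n = len(items)

    def bound(level, value, weight):
        # Fill the remaining capacity greedily; take the first misfit fractionally.
        b, w = value, weight
        for v_i, w_i in items[level:]:
            if w + w_i <= capacity:
                b, w = b + v_i, w + w_i
            else:
                return b + v_i * (capacity - w) / w_i
        return b

    best, expanded, created = 0, 0, 1
    heap = [(-bound(0, 0, 0), 0, 0, 0)]  # (-upper bound, level, value, weight)
    while heap:
        neg_b, level, value, weight = heapq.heappop(heap)
        if -neg_b <= best or level == n:
            continue  # cannot improve on the incumbent
        expanded += 1
        v_i, w_i = items[level]
        # Two children: take the item at this level, or skip it.
        for nv, nw in ((value + v_i, weight + w_i), (value, weight)):
            if nw > capacity:
                continue
            best = max(best, nv)
            b = bound(level + 1, nv, nw)
            if b > best:
                heapq.heappush(heap, (-b, level + 1, nv, nw))
                created += 1
    return best, expanded, created


best_value, h, m = knapsack_best_first([60, 100, 120], [10, 20, 30], 50)
print(best_value, h, m)
```

The counters follow this sketch's own convention for "expanded" and "created"; the paper may count nodes differently.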
A general and efficient parallel approach is proposed for the first time to parallelize the hybrid finite-element-boundary-integral-multilevel fast multipole algorithm (FE-BI-MLFMA). Among the many algorithms of FE-BI-MLFMA, the decomposition algorithm (DA) is chosen as the basis for parallelization because its numerical characteristics are well suited to it. On the basis of the DA, FE-BI-MLFMA is parallelized by employing the parallelized multi-frontal method for the matrix from the finite-element method and the parallelized MLFMA for the matrix from the boundary integral method. The programming and numerical experiments are carried out on the high-performance computing platform CEMS-Liuhui. Numerical experiments demonstrate that FE-BI-MLFMA is efficiently parallelized and that its computational capacity is greatly improved without losing accuracy, efficiency, or generality.
In this paper, a class of real-time parallel modified Rosenbrock methods for the numerical simulation of stiff dynamic systems on a multiprocessor system is constructed, and the convergence and numerical stability of these methods are discussed. In particular, an A-stable two-stage third-order real-time parallel formula and an A(α)-stable three-stage fourth-order real-time parallel formula with α ≈ 89.96° are given. Numerical simulation experiments in a parallel environment show that this class of algorithms is efficient and applicable, with greater speedup.
Spectrum sensing is the key to and premise of cognitive radio (CR). Current parallel cooperative spectrum sensing strategies have several problems, such as the large number of cooperative secondary users required and the lack of consideration for sensing overhead and transmission gain. To solve these problems, an optimized parallel cooperative spectrum sensing strategy based on the iterative Kuhn-Munkres (KM) algorithm was proposed. To maximize the total system profit, it considers the tradeoff between sensing overhead and transmission gain. The iterative KM algorithm was applied to obtain the optimal assignment, which indicates when and which channels the secondary users should sense. Furthermore, a required detection probability was introduced to avoid unnecessary sensing once the accuracy meets the system requirement. Monte Carlo simulations show that the proposed strategy obtains a higher total system profit with fewer cooperative secondary users.
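The assignment step mentioned in the spectrum sensing abstract above can be illustrated with a plain Kuhn-Munkres (Hungarian) solver. This is a sketch under stated assumptions: it is the standard, non-iterative O(n²m) form of the algorithm, the profit matrix is invented, and the function name `kuhn_munkres_max` is hypothetical. The iterative variant, the profit model, and the detection-probability rule of the paper are not reproduced here.

```python
def kuhn_munkres_max(profit):
    """Maximum-profit assignment of rows to distinct columns (rows <= columns).

    Returns (total_profit, assignment) with assignment[i] = column given to row i.
    """
    n, m = len(profit), len(profit[0])
    inf = float("inf")
    # Maximizing profit is the same as minimizing negated profit.
    cost = [[-profit[i][j] for j in range(m)] for i in range(n)]
    u = [0.0] * (n + 1)        # row potentials (1-based)
    v = [0.0] * (m + 1)        # column potentials (1-based, index 0 is a dummy)
    match = [0] * (m + 1)      # match[j] = row matched to column j, 0 if free
    way = [0] * (m + 1)        # back-pointers along the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta, j1 = inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path back to the dummy column.
        while j0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, m + 1):
        if match[j]:
            assignment[match[j] - 1] = j - 1
    total = sum(profit[i][assignment[i]] for i in range(n))
    return total, assignment


# Hypothetical profits: 3 secondary users (rows) by 4 channels (columns).
profit = [[4, 1, 3, 2], [2, 0, 5, 3], [3, 2, 2, 6]]
total, assignment = kuhn_munkres_max(profit)
print(total, assignment)
```

Rows here stand for secondary users and columns for channels, which is only one possible reading of the abstract's "optimal assignment".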
In this paper, a parallel Surface Extraction from Binary Volumes with Higher-Order Smoothness (SEBVHOS) algorithm is proposed to accelerate SEBVHOS execution. The original SEBVHOS algorithm is parallelized first, and then several performance optimization techniques (loop optimization, cache optimization, false sharing optimization, synchronization overhead optimization, and thread affinity optimization) are used to improve the implementation's performance on multi-core systems. The performance of the parallel SEBVHOS algorithm is analyzed on a dual-core system. The experimental results show that the parallel SEBVHOS algorithm achieves an average speedup of 1.86x. More importantly, the method introduces no additional aliasing artifacts compared with the original SEBVHOS algorithm.
In the field of additive manufacturing, current research on data processing focuses mainly on slicing large STL files or complicated CAD models. A parallel algorithm has great advantages for improving efficiency and reducing slicing time, but traditional algorithms cannot make full use of multi-core CPU hardware resources. In this paper, a fast parallel algorithm designed in a pipeline mode is presented to speed up data processing, and the complexity of the pipeline algorithm is analyzed theoretically. To evaluate the performance of the new algorithm, the effects of the number of threads and the number of layers are investigated in a series of experiments. The experimental results show that both are significant factors in the speedup ratio. Speedup increases with the number of threads in a way that agrees closely with Amdahl's law, and it also increases with the number of layers in agreement with Gustafson's law. The new algorithm computes contours in parallel using topological information. A second parallel algorithm based on data parallelism is used in the experiments to show that the pipeline mode is more efficient. Finally, a case study demonstrates the performance of the new parallel algorithm. Compared with the serial slicing algorithm, the new pipeline parallel algorithm makes full use of the multi-core CPU hardware and accelerates the slicing process; compared with the data-parallel slicing algorithm, it achieves a much higher speedup ratio and efficiency.
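The two scaling laws named in the slicing abstract above have simple closed forms, shown here as a small sketch. The parallel fraction of 0.9 and the thread counts are illustrative values, not figures from the paper, and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Amdahl's law: fixed problem size, speedup = 1 / ((1 - f) + f / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)


def gustafson_speedup(parallel_fraction, threads):
    """Gustafson's law: problem size grows with n, speedup = (1 - f) + f * n."""
    return (1.0 - parallel_fraction) + parallel_fraction * threads


for n in (1, 2, 4, 8):
    print(n, round(amdahl_speedup(0.9, n), 3), round(gustafson_speedup(0.9, n), 3))
```

Under Amdahl's law the speedup is capped at 1 / (1 - f) however many threads are added, whereas under Gustafson's law it grows linearly with the thread count because the parallel part of the workload is assumed to scale with it.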
A class of nonidentical parallel machine scheduling problems is considered in which the goal is to minimize the total weighted completion time. Models and relaxations are collected. Most of these problems are either NP-hard in the strong sense or open, so approximation algorithms are studied. The review reveals several areas that merit further research.
To address the premature and slow convergence of simple genetic algorithms (SGA), three ideas are introduced: uniform partitioning of the whole search space, multiple genetic operators, and multiple populations evolving independently. On this basis, a grid-based pseudo-parallel genetic algorithm (GPPGA) is put forward, and its premature convergence and convergence behavior are analyzed. Finally, GPPGA is tested on the six-peak camel back function, the Rosenbrock function, and a BP network. The results show that GPPGA is feasible and effective in overcoming premature convergence and in improving convergence speed and accuracy.
Genetic algorithms have been proposed to solve the task assignment problem. However, they have some drawbacks: for example, they often take a long time to find an optimal solution, and the success rate is low. To overcome these problems, a new coarse-grained parallel genetic algorithm with a central migration scheme is presented, which exploits isolated sub-populations. The new approach has been implemented in the PVM environment and evaluated on a network of workstations for solving the task assignment problem. The results show that it not only significantly improves the quality of the result but also finds the best solution faster.
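The idea of isolated sub-populations with central migration in the abstract above can be sketched as an island-model genetic algorithm. Everything specific here is an assumption for illustration: the islands are simulated one after another in a single process rather than on PVM, the objective is the toy OneMax function instead of task assignment, and the operators, parameters, and names (`evolve`, `central_migration_ga`) are not taken from the paper.

```python
import random


def fitness(bits):
    """Toy OneMax objective: the number of 1 bits."""
    return sum(bits)


def evolve(pop, rng, mutation_rate=0.02):
    """One generation: elitism, binary tournaments, one-point crossover, bit flips."""
    children = [max(pop, key=fitness)[:]]  # keep the island's best individual
    while len(children) < len(pop):
        p1 = max(rng.sample(pop, 2), key=fitness)
        p2 = max(rng.sample(pop, 2), key=fitness)
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
        children.append(child)
    return children


def central_migration_ga(islands=4, size=20, length=40, generations=60,
                         interval=10, seed=1):
    rng = random.Random(seed)
    pops = [[[rng.randint(0, 1) for _ in range(length)] for _ in range(size)]
            for _ in range(islands)]
    for gen in range(1, generations + 1):
        pops = [evolve(pop, rng) for pop in pops]  # islands evolve in isolation
        if gen % interval == 0:
            # Central migration: gather each island's best at a hub and send
            # the overall champion back to replace every island's worst member.
            champion = max((max(pop, key=fitness) for pop in pops), key=fitness)
            for pop in pops:
                worst = min(range(size), key=lambda i: fitness(pop[i]))
                pop[worst] = champion[:]
    return max((ind for pop in pops for ind in pop), key=fitness)


best = central_migration_ga()
print(fitness(best), "of", len(best))
```

In a real coarse-grained parallel run each island would live on its own processor and only the migration step would require communication, which is what keeps the communication cost low.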
In this paper, a two-dimensional (2D) direction-of-arrival (DOA) estimation algorithm with increased degrees of freedom for two parallel linear arrays is presented. Unlike the conventional two-parallel linear array, the proposed array consists of two uniform linear arrays with unequal inter-element spacing. The propagator method (PM) is used to obtain a special matrix that increases the number of virtual elements of one of the uniform linear arrays. The PM algorithm is then used again to obtain automatically paired elevation and azimuth angles. The simulation results and complexity analysis show that the proposed method increases the number of distinguishable signals and improves the estimation precision without increasing the computational complexity.
A scheduling model for arrival aircraft on closely spaced parallel runways was proposed with multiple objectives: minimum flight delay cost, maximum airport capacity, minimum air traffic controller workload, and maximum fairness of airline scheduling. The time interval between the two runways and changes in the aircraft landing order were taken as constraints. A genetic algorithm was used to solve the model, and the model constrained the unit delay cost of aircraft with multiple flight tasks to limit how far their delays propagate. Any objective function value or particle fitness that failed to satisfy the constraints was penalized. Finally, a domestic hub airport was used to verify the algorithm and the model. The results showed that the genetic algorithm converged strongly and quickly when solving the constrained multi-objective aircraft landing problem on closely spaced parallel runways, and the optimized results were better than the actual schedule.
To address premature convergence in the search process of genetic algorithms, a chaotic migration-based pseudo-parallel genetic algorithm (CMPPGA) is proposed. It applies the ideas of isolated evolution and information exchange from distributed parallel genetic algorithms within a serial program structure to solve optimization problems with low real-time demands. In this algorithm, the asynchronous migration of individuals during parallel evolution is guided by a chaotic migration sequence. Because the sequence is ergodic and stochastic, information exchange among sub-populations is efficient and sufficient. A simulation study of CMPPGA shows its strong global search ability, its superiority to the standard genetic algorithm, and its high immunity to premature convergence. Based on raw material supply practice, an inventory programming model is set up and solved by CMPPGA with satisfactory results.
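A chaotic migration sequence of the kind described in the CMPPGA abstract above could be generated as in the sketch below. The abstract does not say which chaotic map is used, so the logistic map with r = 4, the starting value 0.3, and the rule that turns each value into a destination sub-population index are all assumptions, and the function names are hypothetical.

```python
def logistic_sequence(x0, count, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), returning `count` values."""
    xs = []
    x = x0
    for _ in range(count):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def migration_targets(x0, count, n_subpops):
    """Map each chaotic value in [0, 1] to a destination sub-population index."""
    return [min(int(x * n_subpops), n_subpops - 1)
            for x in logistic_sequence(x0, count)]


targets = migration_targets(0.3, 12, 4)
print(targets)
```

For r = 4 the values stay in [0, 1] and do not settle into a short cycle for typical starting points, which is the property the abstract relies on when it calls the sequence ergodic; the `min` guard only covers the edge case of a value equal to exactly 1.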
Feature selection is an important topic in text classification. However, most existing feature selection methods are serial and too inefficient to be applied to massive text data sets. To address this, a feature selection method based on a parallel collaborative evolutionary genetic algorithm is presented. The method uses a genetic algorithm to select feature subsets and takes advantage of parallel collaborative evolution to improve time efficiency, so it can quickly acquire more representative feature subsets. The experimental results show that, in terms of accuracy and recall, the presented method is better than the information gain, χ² statistics, and mutual information methods. With only one CPU the presented method takes longer than these three methods, but it is faster once the parallel strategy is used.
Local mesh refinement is one of the key steps in implementing adaptive finite element methods. This paper presents a parallel algorithm for distributed-memory parallel computers for adaptive local refinement of tetrahedral meshes using bisection. The algorithm is used in PHG, Parallel Hierarchical Grid (http://lsec.cc.ac.cn/phg/), a toolbox under active development for parallel adaptive finite element solutions of partial differential equations. The proposed algorithm allows simultaneous refinement of submeshes to arbitrary levels before synchronization between submeshes, without the need for a central coordinator process to manage new vertices. Using the concept of canonical refinement, a simple proof is given that the resulting mesh is independent of the mesh partitioning, which helps in better understanding the behaviour of the bisection refinement procedure.
Funding: Funded by the National Key Research and Development Program of China under Grant No. 11974373.
Funding: Supported by the National Basic Research Program of China (973 Program) (61320).
Funding: This work was supported by the National Natural Science Foundation of China (No. 50375001).
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51175029) and the Beijing Municipal Natural Science Foundation of China (Grant No. 3132019).
Funding: This paper was supported by the Ph.D. Foundation of the State Education Commission of China.
Funding: This project was supported by the National Natural Science Foundation of China (No. 19871080).
Abstract: In this paper, a class of real-time parallel modified Rosenbrock methods for the numerical simulation of stiff dynamic systems on a multiprocessor system is constructed, and the convergence and numerical stability of these methods are discussed. In particular, an A-stable real-time parallel formula of two-stage third order and an A(α)-stable real-time parallel formula of three-stage fourth order with α ≈ 89.96° are given. Numerical simulation experiments in a parallel environment show that this class of algorithms is efficient and applicable, with greater speedup.
Funding: Young Scientists Fund of the National Natural Science Foundation of China (No. 61101141); Fundamental Research Funds for the Central Universities of China (No. HEUCF130807); Heilongjiang Province Natural Science Foundation for the Youth, China (No. QC2012C070/F010106).
Abstract: Spectrum sensing is the key to and premise of cognitive radio (CR). Current parallel cooperative spectrum sensing strategies have some problems, such as requiring a large number of cooperative secondary users and not considering the sensing overhead and the transmission gain. To solve those problems, an optimized parallel cooperative spectrum sensing strategy based on the iterative Kuhn-Munkres (KM) algorithm is proposed. To maximize the total system profit, it considers the tradeoff between the sensing overhead and the transmission gain. The iterative KM algorithm is applied to obtain the optimal assignment, which indicates when and which channels secondary users should sense. Furthermore, a required detection probability is introduced to avoid unnecessary waste once the accuracy meets the system requirement. Monte Carlo simulations show that the proposed strategy can obtain a higher total system profit with fewer cooperative secondary users.
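The assignment step described above can be illustrated with a small Python sketch. It does not implement the Kuhn-Munkres algorithm or the paper's iterative variant: it searches all permutations, which reaches the same optimum on tiny inputs but in factorial time. The function name `best_assignment` and the profit matrix are hypothetical, with each entry standing for an assumed net profit (transmission gain minus sensing overhead) of a secondary user sensing a channel.

```python
from itertools import permutations


def best_assignment(profit):
    """Return (total, assignment) maximizing total profit, one channel per user.

    Exhaustive stand-in for Kuhn-Munkres; assumes a square matrix where
    profit[user][channel] is the net profit of that pairing.
    """
    n = len(profit)
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(n)):
        total = sum(profit[user][channel] for user, channel in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical 3-user, 3-channel profit matrix (values invented for illustration).
profit = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = best_assignment(profit)
print(total, assignment)  # 11 [0, 2, 1]
```

Here `assignment[u]` is the channel given to user `u`. A polynomial-time KM implementation would be needed for realistic numbers of users and channels.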
Funding: Supported by the National Natural Science Foundation of China (No. 61071173).
Abstract: In this paper, a parallel Surface Extraction from Binary Volumes with Higher-Order Smoothness (SEBVHOS) algorithm is proposed to accelerate SEBVHOS execution. The original SEBVHOS algorithm is parallelized first, and then several performance optimization techniques (loop optimization, cache optimization, false-sharing optimization, synchronization-overhead optimization, and thread-affinity optimization) are used to improve the implementation's performance on multi-core systems. The performance of the parallel SEBVHOS algorithm is analyzed on a dual-core system. The experimental results show that the parallel SEBVHOS algorithm achieves an average speedup of 1.86x. More importantly, the method does not introduce additional aliasing artifacts compared with the original SEBVHOS algorithm.
Abstract: In the field of additive manufacturing, current research on data processing mainly focuses on the slicing of large STL files or complicated CAD models. A parallel algorithm has great advantages for improving efficiency and reducing slicing time. However, traditional algorithms cannot make full use of multi-core CPU hardware resources. In this paper, a fast parallel algorithm is presented to speed up data processing. A pipeline mode is adopted to design the parallel algorithm, and the complexity of the pipeline algorithm is analyzed theoretically. To evaluate the performance of the new algorithm, the effects of the number of threads and the number of layers are investigated in a series of experiments. The experimental results show that the number of threads and the number of layers are two significant factors in the speedup ratio. Speedup increases with the number of threads, in good agreement with Amdahl's law, and speedup also increases with the number of layers, in agreement with Gustafson's law. The new algorithm uses topological information to compute contours in parallel. Another parallel algorithm, based on data parallelism, is used in the experiments to show that the pipeline parallel mode is more efficient. Finally, a case study demonstrates the performance of the new parallel algorithm. Compared with the serial slicing algorithm, the new pipeline parallel algorithm makes full use of the multi-core CPU hardware and accelerates the slicing process; compared with the data-parallel slicing algorithm, it achieves a much higher speedup ratio and efficiency.
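For reference, the two laws named in this abstract can be written as short Python functions. This is a sketch of the textbook formulas only, not of the paper's measurements; `parallel_fraction` (the share of the work that can run in parallel) and `workers` (the thread count) are parameter names chosen here.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: fixed problem size, so the serial part caps the speedup."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def gustafson_speedup(parallel_fraction: float, workers: int) -> float:
    """Gustafson's law: the parallel part of the work grows with the workers."""
    return (1.0 - parallel_fraction) + parallel_fraction * workers


# Speedup for an assumed 90% parallel fraction at several thread counts.
table = [(n, amdahl_speedup(0.9, n), gustafson_speedup(0.9, n)) for n in (1, 2, 4, 8)]
```

Under Amdahl's law the speedup approaches 1 / (1 - parallel_fraction) as threads are added, while under Gustafson's law it keeps growing linearly. That matches the abstract's pairing of the first law with thread count and the second with problem size (number of layers).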
Funding: The National Natural Science Foundation of China (70631003) and the Hefei University of Technology Foundation (071102F).
Abstract: A class of nonidentical parallel machine scheduling problems is considered in which the goal is to minimize the total weighted completion time. Models and relaxations are collected. Most of these problems are either NP-hard in the strong sense or open, so approximation algorithms are studied. The review reveals several areas worthy of further research.
Abstract: To address the premature convergence and slow convergence of simple genetic algorithms (SGA), three ideas are introduced: partitioning the whole search space uniformly, using multiple genetic operators, and evolving multiple populations independently. On this basis, a grid-based pseudo-parallel genetic algorithm (GPPGA) is put forward. The premature convergence and convergence behaviour of GPPGA are then analyzed. Finally, GPPGA is tested on the six-hump camel back function, the Rosenbrock function, and a BP network. The results show the feasibility and effectiveness of GPPGA in overcoming premature convergence and improving convergence speed and accuracy.
Funding: Supported by the National "863" Hi-Tech Development Program of China (863-306-ZD11-01-8).
Abstract: Genetic algorithms have been proposed to solve the task assignment problem. However, they have some drawbacks: for example, they often take a long time to find an optimal solution, and the success rate is low. To overcome these problems, a new coarse-grained parallel genetic algorithm with a central migration scheme is presented, which exploits isolated subpopulations. The new approach has been implemented in the PVM environment and evaluated on a workstation network for solving the task assignment problem. The results show that it not only significantly improves the quality of the result but also finds the best solution faster.
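The idea of isolated subpopulations with central migration can be sketched in Python as follows. This is an illustrative interpretation, not the paper's algorithm: the islands run one after another in a single process rather than under PVM, and the makespan objective, the operators, and all names and parameter values (`island_ga`, `interval`, and so on) are assumptions. Central migration is taken here to mean that each island periodically reports its best individual and receives the overall best back.

```python
import random


def makespan(assign, costs, n_proc):
    """Largest processor load when task i runs on processor assign[i]."""
    loads = [0] * n_proc
    for task, proc in enumerate(assign):
        loads[proc] += costs[task]
    return max(loads)


def evolve(pop, costs, n_proc, rng):
    """One generation: keep the better half, refill with mutated crossovers."""
    pop.sort(key=lambda a: makespan(a, costs, n_proc))
    survivors = pop[: len(pop) // 2]
    children = []
    while len(survivors) + len(children) < len(pop):
        a, b = rng.sample(survivors, 2)
        cut = rng.randrange(1, len(costs))  # one-point crossover
        child = a[:cut] + b[cut:]
        child[rng.randrange(len(costs))] = rng.randrange(n_proc)  # mutation
        children.append(child)
    return survivors + children


def island_ga(costs, n_proc, islands=4, size=10, generations=40, interval=5, seed=0):
    """Evolve isolated islands; every `interval` generations migrate centrally."""
    rng = random.Random(seed)
    pops = [[[rng.randrange(n_proc) for _ in costs] for _ in range(size)]
            for _ in range(islands)]
    for gen in range(1, generations + 1):
        pops = [evolve(p, costs, n_proc, rng) for p in pops]
        if gen % interval == 0:
            # Collect each island's best, then broadcast the overall best.
            bests = [min(p, key=lambda a: makespan(a, costs, n_proc)) for p in pops]
            champion = min(bests, key=lambda a: makespan(a, costs, n_proc))
            for p in pops:
                p[-1] = list(champion)  # overwrite one freshly created child
    best = min((a for p in pops for a in p), key=lambda a: makespan(a, costs, n_proc))
    return best, makespan(best, costs, n_proc)
```

For example, `island_ga([5, 3, 8, 2, 7, 4], 3)` returns an assignment of six tasks to three processors together with its makespan. Between migrations the islands share no state, so in a real coarse-grained setting each one could run on its own workstation.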
Funding: Supported by the National Natural Science Foundation of China (51877015, U1831117); the Cooperation Agreement Foundation by the Department of Science and Technology of Guizhou Province of China (LH[2017]7320, LH[2017]7321, [2015]7249); the Innovation Group Major Research Program Funded by Guizhou Provincial Education Department (KY[2016]051); the Foundation of Top-notch Talents by Education Department of Guizhou Province of China (KY[2018]075); PhD Research Startup Foundation of Tongren University (trxy DH1710).
Abstract: In this paper, a two-dimensional (2D) direction-of-arrival (DOA) estimation algorithm with increased degrees of freedom for two parallel linear arrays is presented. Unlike the conventional two-parallel linear array, the proposed two-parallel linear array consists of two uniform linear arrays with unequal inter-element spacing. The propagator method (PM) is used to obtain a special matrix that can be utilized to increase the number of virtual elements of one of the uniform linear arrays. Then the PM algorithm is used again to obtain automatically paired elevation and azimuth angles. The simulation results and complexity analysis show that the proposed method can increase the number of distinguishable signals and improve the estimation precision without increasing the computational complexity.
Abstract: A scheduling model for arrival aircraft on closely spaced parallel runways is proposed, with multiple objectives: minimum flight delay cost, maximum airport capacity, minimum air traffic controller workload, and maximum fairness of scheduling among airlines. The time interval between the two runways and changes in the aircraft landing order are taken as constraints. A genetic algorithm is used to solve the model, and the model constrains the unit delay cost of aircraft with multiple flight tasks to reduce the range over which their delays propagate. Any objective function value or individual fitness that violates the constraints is penalized. Finally, a domestic hub airport is used to verify the algorithm and the model. The results show that the genetic algorithm exhibits strong convergence and timeliness in solving the constrained multi-objective aircraft landing problem on closely spaced parallel runways, and that the optimization results are better than those of the actual scheduling.
Abstract: To address premature convergence in the search process of genetic algorithms, a chaotic migration-based pseudo-parallel genetic algorithm (CMPPGA) is proposed. It applies the ideas of isolated evolution and information exchange from distributed parallel genetic algorithms within a serial program structure to solve optimization problems with low real-time demands. In this algorithm, the asynchronous migration of individuals during parallel evolution is guided by a chaotic migration sequence. Information exchange among subpopulations is efficient and sufficient because the sequence is ergodic and stochastic. A simulation study of CMPPGA shows its strong global search ability, its superiority to the standard genetic algorithm, and its high immunity to premature convergence. Based on raw material supply practice, an inventory programming model is set up and solved by CMPPGA with satisfactory results.
Funding: Supported by the Science and Technology Plan Projects of Sichuan Province of China under Grant No. 2008GZ0003 and the Key Technologies R&D Program of Sichuan Province of China under Grant No. 2008SZ0100.
Abstract: Feature selection is one of the important topics in text classification. However, most existing feature selection methods are serial and too inefficient to be applied to massive text data sets. To address this, a feature selection method based on a parallel collaborative evolutionary genetic algorithm is presented. The method uses a genetic algorithm to select feature subsets and takes advantage of parallel collaborative evolution to improve time efficiency, so it can quickly acquire more representative feature subsets. The experimental results show that, in terms of accuracy ratio and recall ratio, the presented method is better than the information gain, χ² statistics, and mutual information methods. With only one CPU, the presented method takes longer than these three methods, but it becomes faster once the parallel strategy is used.
Funding: Supported by the 973 Program of China (2005CB321702) and China NSF (10531080).
Abstract: Local mesh refinement is one of the key steps in the implementation of adaptive finite element methods. This paper presents a parallel algorithm for distributed-memory parallel computers for adaptive local refinement of tetrahedral meshes using bisection. This algorithm is used in PHG, Parallel Hierarchical Grid (http://lsec.cc.ac.cn/phg/), a toolbox under active development for parallel adaptive finite element solutions of partial differential equations. The proposed algorithm is characterized by allowing simultaneous refinement of submeshes to arbitrary levels before synchronization between submeshes, without the need for a central coordinator process to manage new vertices. Using the concept of canonical refinement, a simple proof is given that the resulting mesh is independent of the mesh partitioning, which is useful for better understanding the behaviour of the bisection refinement procedure.