Real-time capabilities and computational efficiency are provided by parallel image processing utilizing OpenMP. However, race conditions can affect the accuracy and reliability of the outcomes. This paper highlights t...Real-time capabilities and computational efficiency are provided by parallel image processing utilizing OpenMP. However, race conditions can affect the accuracy and reliability of the outcomes. This paper highlights the importance of addressing race conditions in parallel image processing, specifically focusing on color inverse filtering using OpenMP. We considered three solutions to solve race conditions, each with distinct characteristics: #pragma omp atomic: Protects individual memory operations for fine-grained control. #pragma omp critical: Protects entire code blocks for exclusive access. #pragma omp parallel sections reduction: Employs a reduction clause for safe aggregation of values across threads. Our findings show that the produced images were unaffected by race condition. However, it becomes evident that solving the race conditions in the code makes it significantly faster, especially when it is executed on multiple cores.展开更多
In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularl...In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of colorful images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of colorful images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP’s effectiveness in accelerating image manipulation tasks.展开更多
The strict and high-standard requirements for the safety and stability ofmajor engineering systems make it a tough challenge for large-scale finite element modal analysis.At the same time,realizing the systematic anal...The strict and high-standard requirements for the safety and stability ofmajor engineering systems make it a tough challenge for large-scale finite element modal analysis.At the same time,realizing the systematic analysis of the entire large structure of these engineering systems is extremely meaningful in practice.This article proposes a multilevel hierarchical parallel algorithm for large-scale finite element modal analysis to reduce the parallel computational efficiency loss when using heterogeneous multicore distributed storage computers in solving large-scale finite element modal analysis.Based on two-level partitioning and four-transformation strategies,the proposed algorithm not only improves the memory access rate through the sparsely distributed storage of a large amount of data but also reduces the solution time by reducing the scale of the generalized characteristic equation(GCEs).Moreover,a multilevel hierarchical parallelization approach is introduced during the computational procedure to enable the separation of the communication of inter-nodes,intra-nodes,heterogeneous core groups(HCGs),and inside HCGs through mapping computing tasks to various hardware layers.This method can efficiently achieve load balancing at different layers and significantly improve the communication rate through hierarchical communication.Therefore,it can enhance the efficiency of parallel computing of large-scale finite element modal analysis by fully exploiting the architecture characteristics of heterogeneous multicore clusters.Finally,typical numerical experiments were used to validate the correctness and efficiency of the proposedmethod.Then a parallel modal analysis example of the cross-river tunnel with over ten million degrees of freedom(DOFs)was performed,and ten-thousand core processors were applied to verify the feasibility of the algorithm.展开更多
Using the method of mathematical morphology,this paper fulfills filtration,segmentation and extraction of morphological features of the satellite cloud image.It also gives out the relative algorithms,which is realized...Using the method of mathematical morphology,this paper fulfills filtration,segmentation and extraction of morphological features of the satellite cloud image.It also gives out the relative algorithms,which is realized by parallel C programming based on Transputer networks.It has been successfully used to process the typhoon and the low tornado cloud image.And it will be used in weather forecast.展开更多
The flexibility of traditional image processing system is limited because those system are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is present...The flexibility of traditional image processing system is limited because those system are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is presented. It has many promising characteristics such as powerful computing capability, broad I/O bandwidth, topology flexibility, and expansibility. The parallel system performance is evaluated by practical experiment.展开更多
In the procedure of the steady-state hierarchical optimization with feedback for large-scale industrial processes, a sequence of set-point changes with different magnitudes is carried out on the optimization layer. To...In the procedure of the steady-state hierarchical optimization with feedback for large-scale industrial processes, a sequence of set-point changes with different magnitudes is carried out on the optimization layer. To improve the dynamic performance of transient response driven by the set-point changes, a filter-based iterative learning control strategy is proposed. In the proposed updating law, a local-symmetric-integral operator is adopted for eliminating the measurement noise of output information,a set of desired trajectories are specified according to the set-point changes sequence, the current control input is iteratively achieved by utilizing smoothed output error to modify its control input at previous iteration, to which the amplified coefficients related to the different magnitudes of set-point changes are introduced. The convergence of the algorithm is conducted by incorporating frequency-domain technique into time-domain analysis. Numerical simulation demonstrates the effectiveness of the proposed strategy,展开更多
Withthe rapiddevelopment of deep learning,the size of data sets anddeepneuralnetworks(DNNs)models are also booming.As a result,the intolerable long time for models’training or inference with conventional strategies c...Withthe rapiddevelopment of deep learning,the size of data sets anddeepneuralnetworks(DNNs)models are also booming.As a result,the intolerable long time for models’training or inference with conventional strategies can not meet the satisfaction of modern tasks gradually.Moreover,devices stay idle in the scenario of edge computing(EC),which presents a waste of resources since they can share the pressure of the busy devices but they do not.To address the problem,the strategy leveraging distributed processing has been applied to load computation tasks from a single processor to a group of devices,which results in the acceleration of training or inference of DNN models and promotes the high utilization of devices in edge computing.Compared with existing papers,this paper presents an enlightening and novel review of applying distributed processing with data and model parallelism to improve deep learning tasks in edge computing.Considering the practicalities,commonly used lightweight models in a distributed system are introduced as well.As the key technique,the parallel strategy will be described in detail.Then some typical applications of distributed processing will be analyzed.Finally,the challenges of distributed processing with edge computing will be described.展开更多
Suspicious mass traffic constantly evolves,making network behaviour tracing and structure more complex.Neural networks yield promising results by considering a sufficient number of processing elements with strong inte...Suspicious mass traffic constantly evolves,making network behaviour tracing and structure more complex.Neural networks yield promising results by considering a sufficient number of processing elements with strong interconnections between them.They offer efficient computational Hopfield neural networks models and optimization constraints used by undergoing a good amount of parallelism to yield optimal results.Artificial neural network(ANN)offers optimal solutions in classifying and clustering the various reels of data,and the results obtained purely depend on identifying a problem.In this research work,the design of optimized applications is presented in an organized manner.In addition,this research work examines theoretical approaches to achieving optimized results using ANN.It mainly focuses on designing rules.The optimizing design approach of neural networks analyzes the internal process of the neural networks.Practices in developing the network are based on the interconnections among the hidden nodes and their learning parameters.The methodology is proven best for nonlinear resource allocation problems with a suitable design and complex issues.The ANN proposed here considers more or less 46k nodes hidden inside 49 million connections employed on full-fledged parallel processors.The proposed ANN offered optimal results in real-world application problems,and the results were obtained using MATLAB.展开更多
To improve image processing speed and detection precision of a surface detection system on a strip surface,based on the analysis of the characteristics of image data and image processing in detection system on the str...To improve image processing speed and detection precision of a surface detection system on a strip surface,based on the analysis of the characteristics of image data and image processing in detection system on the strip surface,the design of parallel image processing system and the methods of algorithm implementation have been studied. By using field programmable gate array(FPGA) as hardware platform of implementation and considering the characteristic of detection system on the strip surface,a parallel image processing system implemented by using multi IP kernel is designed. According to different computing tasks and the load balancing capability of parallel processing system,the system could set different calculating numbers of nodes to meet the system's demand and save the hardware cost.展开更多
Porous materials present significant advantages for absorbing radioactive isotopes in nuclear waste streams.To improve absorption efficiency in nuclear waste treatment,a thorough understanding of the diffusion-advecti...Porous materials present significant advantages for absorbing radioactive isotopes in nuclear waste streams.To improve absorption efficiency in nuclear waste treatment,a thorough understanding of the diffusion-advection process within porous structures is essential for material design.In this study,we present advancements in the volumetric lattice Boltzmann method(VLBM)for modeling and simulating pore-scale diffusion-advection of radioactive isotopes within geopolymer porous structures.These structures are created using the phase field method(PFM)to precisely control pore architectures.In our VLBM approach,we introduce a concentration field of an isotope seamlessly coupled with the velocity field and solve it by the time evolution of its particle population function.To address the computational intensity inherent in the coupled lattice Boltzmann equations for velocity and concentration fields,we implement graphics processing unit(GPU)parallelization.Validation of the developed model involves examining the flow and diffusion fields in porous structures.Remarkably,good agreement is observed for both the velocity field from VLBM and multiphysics object-oriented simulation environment(MOOSE),and the concentration field from VLBM and the finite difference method(FDM).Furthermore,we investigate the effects of background flow,species diffusivity,and porosity on the diffusion-advection behavior by varying the background flow velocity,diffusion coefficient,and pore volume fraction,respectively.Notably,all three parameters exert an influence on the diffusion-advection process.Increased background flow and diffusivity markedly accelerate the process due to increased advection intensity and enhanced diffusion capability,respectively.Conversely,increasing the porosity has a less significant effect,causing a slight slowdown of the diffusion-advection process due to the expanded pore volume.This comprehensive parametric study provides valuable insights into the kinetics of isotope uptake in porous structures,facilitating the development of porous materials for nuclear waste treatment applications.展开更多
An improvement detecting method was proposed according to the disadvantages of testing method of optical axes parallelism of shipboard photoelectrical theodolite (short for theodolite) based on image processing. Point...An improvement detecting method was proposed according to the disadvantages of testing method of optical axes parallelism of shipboard photoelectrical theodolite (short for theodolite) based on image processing. Pointolite replaced 0.2'' collimator to reduce the errors of crosshair images processing and improve the quality of image. What’s more, the high quality images could help to optimize the image processing method and the testing accuracy. The errors between the trial results interpreted by software and the results tested in dock were less than 10'', which indicated the improve method had some actual application values.展开更多
The parallel processing based on the free running model test was adopted to predict the interaction force coefficients (flow straightening coefficient and wake fraction) of ship maneuvering. And the multipopulation ...The parallel processing based on the free running model test was adopted to predict the interaction force coefficients (flow straightening coefficient and wake fraction) of ship maneuvering. And the multipopulation genetic algorithm (MPGA) based on real coding that can contemporarily process the data of free running model and simulation of ship maneuvering was applied to solve the problem. Accordingly the optimal individual was obtained using the method of genetic algorithm. The parallel processing of multiopulation solved the prematurity in the identification for single population, meanwhile, the parallel processing of the data of ship maneuvering (turning motion and zigzag motion) is an attempt to solve the coefficient drift problem. In order to validate the method, the interaction force coefficients were verified by the procedure and these coefficients measured were compared with those ones identified. The maximum error is less than 5%, and the identification is an effective method.展开更多
Along with the increasing Big Data challenges, the MapReduce based systems are extensively welcomed, because of their remarkable simplicity and scalability. However, from the first day MapReduce is proposed, its a...Along with the increasing Big Data challenges, the MapReduce based systems are extensively welcomed, because of their remarkable simplicity and scalability. However, from the first day MapReduce is proposed, its argument with parallel Dt3MSs never stops, as it over-focuses on the scalability but overlooks the efficiency. Accordingly, extended systems are proposed in order to improve the peDbrmance on the limited scale clusters. In the meantime, traditional RDBMS technologies like structured data model, transaction, SQL, etc. are also getting more attention. This paper reviews such systems, from Google and also the third parties, trying to indicate the directions for the future research.展开更多
Density-based algorithm for discovering clusters in large spatial databases with noise(DBSCAN) is a classic kind of density-based spatial clustering algorithm and is widely applied in several aspects due to good perfo...Density-based algorithm for discovering clusters in large spatial databases with noise(DBSCAN) is a classic kind of density-based spatial clustering algorithm and is widely applied in several aspects due to good performance in capturing arbitrary shapes and detecting outliers. However, in practice, datasets are always too massive to fit the serial DBSCAN. And a new parallel algorithm-Parallel DBSCAN(PDBSCAN) was proposed to solve the problem which DBSCAN faced. The proposed parallel algorithm bases on MapReduce mechanism. The usage of parallel mechanism in the algorithm focuses on region query and candidate queue processing which needed substantive computation resources. As a result, PDBSCAN is scalable for large-scale dataset clustering and is extremely suitable for applications in E-Commence, especially for recommendation.展开更多
Large range cell migration is a severe challenge to imaging algorithm for spaceborne SAR. Based on design of Finite Impulse Response (FIR) filter and Range Doppler (RD) algorithm, a realization of quick-look imaging f...Large range cell migration is a severe challenge to imaging algorithm for spaceborne SAR. Based on design of Finite Impulse Response (FIR) filter and Range Doppler (RD) algorithm, a realization of quick-look imaging for large range cell migration is proposed. It realized quick-look imaging of 8 times reduced resolution with parallel processing on memory shared 8 CPU SGI server. According to simulation experiment, this quick-look imaging algorithm with parallel processing can image 16384x16384 SAR raw data within 6 seconds. It reaches the requirement of real-time imaging.展开更多
In this paper, according to the parallel environment of ELXSI computer, a parallel solving process of substructure method in static and dynamic analyses of large-scale and complex structure has been put forward, and t...In this paper, according to the parallel environment of ELXSI computer, a parallel solving process of substructure method in static and dynamic analyses of large-scale and complex structure has been put forward, and the corresponding parallel computational program has been developed.展开更多
On-line transient stability analysis of a power grid is crucial in determining whether the power grid will traverse to a steady state stable operating point after a disturbance. The transient stability analysis involv...On-line transient stability analysis of a power grid is crucial in determining whether the power grid will traverse to a steady state stable operating point after a disturbance. The transient stability analysis involves computing the solutions of the algebraic equations modeling the grid network and the ordinary differential equations modeling the dynamics of the electrical components like synchronous generators, exciters, governors, etc., of the grid in near real-time. In this research, we investigate the use of time-parallel approach in particular the Parareal algorithm implementation on Graphical Processing Unit using Compute Unified Device Architecture to compute solutions of ordinary differential equations. The numerical solution accuracy and computation time of the Parareal algorithm executing on the GPU are demonstrated on the single machine infinite bus test system. Two types of dynamic model of the single synchronous generator namely the classical and detailed models are studied. The numerical solutions of the ordinary differential equations computed by the Parareal algorithm are compared to that computed using the modified Euler’s method demonstrating the accuracy of the Parareal algorithm executing on GPU. Simulations are performed with varying numerical integration time steps, and the suitability of Parareal algorithm in computing near real-time solutions of ordinary different equations is presented. A speedup of 25× and 31× is achieved with the Parareal algorithm for classical and detailed dynamic models of the synchronous generator respectively compared to the sequential modified Euler’s method. The weak scaling efficiency of the Parareal algorithm when required to solve a large number of ordinary differential equations at each time step due to the increase in sequential computations and associated memory transfer latency between the CPU and GPU is discussed.展开更多
In order to improve femtosecond laser throughput,a parallel processing system consisting of liquid crystal on silicon(LCOS)device as spatial light modulator is put forward.A method is described for displaying Fourier ...In order to improve femtosecond laser throughput,a parallel processing system consisting of liquid crystal on silicon(LCOS)device as spatial light modulator is put forward.A method is described for displaying Fourier hologram on LCOS,and a high uniformity of several diffraction peaks in the computer reconstruction is achieved.Application of this method to the parallel femtosecond laser processing is also demonstrated,and two intersecting rings and three tangent rings are fabricated respectively by one time in the photoresist.展开更多
Processing large-scale 3-D gravity data is an important topic in geophysics field. Many existing inversion methods lack the competence of processing massive data and practical application capacity. This study proposes...Processing large-scale 3-D gravity data is an important topic in geophysics field. Many existing inversion methods lack the competence of processing massive data and practical application capacity. This study proposes the application of GPU parallel processing technology to the focusing inversion method, aiming at improving the inversion accuracy while speeding up calculation and reducing the memory consumption, thus obtaining the fast and reliable inversion results for large complex model. In this paper, equivalent storage of geometric trellis is used to calculate the sensitivity matrix, and the inversion is based on GPU parallel computing technology. The parallel computing program that is optimized by reducing data transfer, access restrictions and instruction restrictions as well as latency hiding greatly reduces the memory usage, speeds up the calculation, and makes the fast inversion of large models possible. By comparing and analyzing the computing speed of traditional single thread CPU method and CUDA-based GPU parallel technology, the excellent acceleration performance of GPU parallel computing is verified, which provides ideas for practical application of some theoretical inversion methods restricted by computing speed and computer memory. The model test verifies that the focusing inversion method can overcome the problem of severe skin effect and ambiguity of geological body boundary. Moreover, the increase of the model cells and inversion data can more clearly depict the boundary position of the abnormal body and delineate its specific shape.展开更多
This paper deals with a parallel processing uninterruptible power supply (UPS) for sudden voltage fluctuation in power management to integrate power quality improvement, load voltage stabilization and UPS. To reduce t...This paper deals with a parallel processing uninterruptible power supply (UPS) for sudden voltage fluctuation in power management to integrate power quality improvement, load voltage stabilization and UPS. To reduce the complexity, cost and number of power conversions, which results in higher efficiency, only one voltage-controlled voltage source inverter (VCVSI) is used. The VCVSI is connected in series on the DC battery side and in parallel on the AC grid side with a decoupling inductor. The system provides sinusoidal voltage at the fundamental value of 220V/60Hz for the load during abnormal utility power conditions or grid failure. Also, the system can be operated to mitigate the harmonic current and voltage demand from nonlinear loads and provide voltage stabilization for loads when sudden voltage fluctuation occur, such as sag and swell. The experimental results confirm the system protects against outages caused by abnormal utility power conditions and sudden voltage fluctuations and change.展开更多
文摘Real-time capabilities and computational efficiency are provided by parallel image processing utilizing OpenMP. However, race conditions can affect the accuracy and reliability of the outcomes. This paper highlights the importance of addressing race conditions in parallel image processing, specifically focusing on color inverse filtering using OpenMP. We considered three solutions to solve race conditions, each with distinct characteristics: #pragma omp atomic: Protects individual memory operations for fine-grained control. #pragma omp critical: Protects entire code blocks for exclusive access. #pragma omp parallel sections reduction: Employs a reduction clause for safe aggregation of values across threads. Our findings show that the produced images were unaffected by race condition. However, it becomes evident that solving the race conditions in the code makes it significantly faster, especially when it is executed on multiple cores.
文摘In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of colorful images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of colorful images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP’s effectiveness in accelerating image manipulation tasks.
基金supported by the National Natural Science Foundation of China(Grant No.11772192).
文摘The strict and high-standard requirements for the safety and stability ofmajor engineering systems make it a tough challenge for large-scale finite element modal analysis.At the same time,realizing the systematic analysis of the entire large structure of these engineering systems is extremely meaningful in practice.This article proposes a multilevel hierarchical parallel algorithm for large-scale finite element modal analysis to reduce the parallel computational efficiency loss when using heterogeneous multicore distributed storage computers in solving large-scale finite element modal analysis.Based on two-level partitioning and four-transformation strategies,the proposed algorithm not only improves the memory access rate through the sparsely distributed storage of a large amount of data but also reduces the solution time by reducing the scale of the generalized characteristic equation(GCEs).Moreover,a multilevel hierarchical parallelization approach is introduced during the computational procedure to enable the separation of the communication of inter-nodes,intra-nodes,heterogeneous core groups(HCGs),and inside HCGs through mapping computing tasks to various hardware layers.This method can efficiently achieve load balancing at different layers and significantly improve the communication rate through hierarchical communication.Therefore,it can enhance the efficiency of parallel computing of large-scale finite element modal analysis by fully exploiting the architecture characteristics of heterogeneous multicore clusters.Finally,typical numerical experiments were used to validate the correctness and efficiency of the proposedmethod.Then a parallel modal analysis example of the cross-river tunnel with over ten million degrees of freedom(DOFs)was performed,and ten-thousand core processors were applied to verify the feasibility of the algorithm.
文摘Using the method of mathematical morphology,this paper fulfills filtration,segmentation and extraction of morphological features of the satellite cloud image.It also gives out the relative algorithms,which is realized by parallel C programming based on Transputer networks.It has been successfully used to process the typhoon and the low tornado cloud image.And it will be used in weather forecast.
基金This project was supported by the National Natural Science Foundation of China (60135020).
文摘The flexibility of traditional image processing system is limited because those system are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is presented. It has many promising characteristics such as powerful computing capability, broad I/O bandwidth, topology flexibility, and expansibility. The parallel system performance is evaluated by practical experiment.
基金This work was supported by the National Natural Science Foundation of China (No. 60274055)
文摘In the procedure of the steady-state hierarchical optimization with feedback for large-scale industrial processes, a sequence of set-point changes with different magnitudes is carried out on the optimization layer. To improve the dynamic performance of transient response driven by the set-point changes, a filter-based iterative learning control strategy is proposed. In the proposed updating law, a local-symmetric-integral operator is adopted for eliminating the measurement noise of output information,a set of desired trajectories are specified according to the set-point changes sequence, the current control input is iteratively achieved by utilizing smoothed output error to modify its control input at previous iteration, to which the amplified coefficients related to the different magnitudes of set-point changes are introduced. The convergence of the algorithm is conducted by incorporating frequency-domain technique into time-domain analysis. Numerical simulation demonstrates the effectiveness of the proposed strategy,
基金supported by the Natural Science Foundation of Jiangsu Province of China under Grant No.BK20211284the Financial and Science Technology Plan Project of Xinjiang Production,Construction Corps under Grant No.2020DB005the National Natural Science Foundation of China under Grant Nos.61872219,62002276 and 62177014。
文摘Withthe rapiddevelopment of deep learning,the size of data sets anddeepneuralnetworks(DNNs)models are also booming.As a result,the intolerable long time for models’training or inference with conventional strategies can not meet the satisfaction of modern tasks gradually.Moreover,devices stay idle in the scenario of edge computing(EC),which presents a waste of resources since they can share the pressure of the busy devices but they do not.To address the problem,the strategy leveraging distributed processing has been applied to load computation tasks from a single processor to a group of devices,which results in the acceleration of training or inference of DNN models and promotes the high utilization of devices in edge computing.Compared with existing papers,this paper presents an enlightening and novel review of applying distributed processing with data and model parallelism to improve deep learning tasks in edge computing.Considering the practicalities,commonly used lightweight models in a distributed system are introduced as well.As the key technique,the parallel strategy will be described in detail.Then some typical applications of distributed processing will be analyzed.Finally,the challenges of distributed processing with edge computing will be described.
基金This research is funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2022R 151)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Suspicious mass traffic constantly evolves,making network behaviour tracing and structure more complex.Neural networks yield promising results by considering a sufficient number of processing elements with strong interconnections between them.They offer efficient computational Hopfield neural networks models and optimization constraints used by undergoing a good amount of parallelism to yield optimal results.Artificial neural network(ANN)offers optimal solutions in classifying and clustering the various reels of data,and the results obtained purely depend on identifying a problem.In this research work,the design of optimized applications is presented in an organized manner.In addition,this research work examines theoretical approaches to achieving optimized results using ANN.It mainly focuses on designing rules.The optimizing design approach of neural networks analyzes the internal process of the neural networks.Practices in developing the network are based on the interconnections among the hidden nodes and their learning parameters.The methodology is proven best for nonlinear resource allocation problems with a suitable design and complex issues.The ANN proposed here considers more or less 46k nodes hidden inside 49 million connections employed on full-fledged parallel processors.The proposed ANN offered optimal results in real-world application problems,and the results were obtained using MATLAB.
基金The 111 project(B07018) Supported by Program for Changjiang Scholars and Innovative Research Teamin University(IRT0423)
文摘To improve image processing speed and detection precision of a surface detection system on a strip surface,based on the analysis of the characteristics of image data and image processing in detection system on the strip surface,the design of parallel image processing system and the methods of algorithm implementation have been studied. By using field programmable gate array(FPGA) as hardware platform of implementation and considering the characteristic of detection system on the strip surface,a parallel image processing system implemented by using multi IP kernel is designed. According to different computing tasks and the load balancing capability of parallel processing system,the system could set different calculating numbers of nodes to meet the system's demand and save the hardware cost.
基金supported as part of the Center for Hierarchical Waste Form Materials,an Energy Frontier Research Center funded by the U.S.Department of Energy,Office of Science,Basic Energy Sciences under Award No.DE-SC0016574.
文摘Porous materials present significant advantages for absorbing radioactive isotopes in nuclear waste streams.To improve absorption efficiency in nuclear waste treatment,a thorough understanding of the diffusion-advection process within porous structures is essential for material design.In this study,we present advancements in the volumetric lattice Boltzmann method(VLBM)for modeling and simulating pore-scale diffusion-advection of radioactive isotopes within geopolymer porous structures.These structures are created using the phase field method(PFM)to precisely control pore architectures.In our VLBM approach,we introduce a concentration field of an isotope seamlessly coupled with the velocity field and solve it by the time evolution of its particle population function.To address the computational intensity inherent in the coupled lattice Boltzmann equations for velocity and concentration fields,we implement graphics processing unit(GPU)parallelization.Validation of the developed model involves examining the flow and diffusion fields in porous structures.Remarkably,good agreement is observed for both the velocity field from VLBM and multiphysics object-oriented simulation environment(MOOSE),and the concentration field from VLBM and the finite difference method(FDM).Furthermore,we investigate the effects of background flow,species diffusivity,and porosity on the diffusion-advection behavior by varying the background flow velocity,diffusion coefficient,and pore volume fraction,respectively.Notably,all three parameters exert an influence on the diffusion-advection process.Increased background flow and diffusivity markedly accelerate the process due to increased advection intensity and enhanced diffusion capability,respectively.Conversely,increasing the porosity has a less significant effect,causing a slight slowdown of the diffusion-advection process due to the expanded pore volume.This comprehensive parametric study provides valuable insights into the kinetics of isotope uptake in porous structures,facilitating the development of porous materials for nuclear waste treatment applications.
文摘An improvement detecting method was proposed according to the disadvantages of testing method of optical axes parallelism of shipboard photoelectrical theodolite (short for theodolite) based on image processing. Pointolite replaced 0.2'' collimator to reduce the errors of crosshair images processing and improve the quality of image. What’s more, the high quality images could help to optimize the image processing method and the testing accuracy. The errors between the trial results interpreted by software and the results tested in dock were less than 10'', which indicated the improve method had some actual application values.
基金the Knowledge-based Ship-designHyper-integrated Platform (KSHIP) of Ministry ofEducation, China
文摘The parallel processing based on the free running model test was adopted to predict the interaction force coefficients (flow straightening coefficient and wake fraction) of ship maneuvering. And the multipopulation genetic algorithm (MPGA) based on real coding that can contemporarily process the data of free running model and simulation of ship maneuvering was applied to solve the problem. Accordingly the optimal individual was obtained using the method of genetic algorithm. The parallel processing of multiopulation solved the prematurity in the identification for single population, meanwhile, the parallel processing of the data of ship maneuvering (turning motion and zigzag motion) is an attempt to solve the coefficient drift problem. In order to validate the method, the interaction force coefficients were verified by the procedure and these coefficients measured were compared with those ones identified. The maximum error is less than 5%, and the identification is an effective method.
基金the National Natural Science Foundation of China under Grant No.61370091 and No.61170200,Jiangsu Province Science and Technology Support Program (industry) Project under Grant No.BE2012179
文摘Along with the increasing Big Data challenges, the MapReduce based systems are extensively welcomed, because of their remarkable simplicity and scalability. However, from the first day MapReduce is proposed, its argument with parallel Dt3MSs never stops, as it over-focuses on the scalability but overlooks the efficiency. Accordingly, extended systems are proposed in order to improve the peDbrmance on the limited scale clusters. In the meantime, traditional RDBMS technologies like structured data model, transaction, SQL, etc. are also getting more attention. This paper reviews such systems, from Google and also the third parties, trying to indicate the directions for the future research.
基金National Natural Science Foundations of China( No. 61070101,No. 60875029,No. 61175048)
文摘Density-based algorithm for discovering clusters in large spatial databases with noise(DBSCAN) is a classic kind of density-based spatial clustering algorithm and is widely applied in several aspects due to good performance in capturing arbitrary shapes and detecting outliers. However, in practice, datasets are always too massive to fit the serial DBSCAN. And a new parallel algorithm-Parallel DBSCAN(PDBSCAN) was proposed to solve the problem which DBSCAN faced. The proposed parallel algorithm bases on MapReduce mechanism. The usage of parallel mechanism in the algorithm focuses on region query and candidate queue processing which needed substantive computation resources. As a result, PDBSCAN is scalable for large-scale dataset clustering and is extremely suitable for applications in E-Commence, especially for recommendation.
文摘Large range cell migration is a severe challenge to imaging algorithm for spaceborne SAR. Based on design of Finite Impulse Response (FIR) filter and Range Doppler (RD) algorithm, a realization of quick-look imaging for large range cell migration is proposed. It realized quick-look imaging of 8 times reduced resolution with parallel processing on memory shared 8 CPU SGI server. According to simulation experiment, this quick-look imaging algorithm with parallel processing can image 16384x16384 SAR raw data within 6 seconds. It reaches the requirement of real-time imaging.
文摘In this paper, according to the parallel environment of ELXSI computer, a parallel solving process of substructure method in static and dynamic analyses of large-scale and complex structure has been put forward, and the corresponding parallel computational program has been developed.
文摘On-line transient stability analysis of a power grid is crucial in determining whether the power grid will traverse to a steady state stable operating point after a disturbance. The transient stability analysis involves computing the solutions of the algebraic equations modeling the grid network and the ordinary differential equations modeling the dynamics of the electrical components like synchronous generators, exciters, governors, etc., of the grid in near real-time. In this research, we investigate the use of time-parallel approach in particular the Parareal algorithm implementation on Graphical Processing Unit using Compute Unified Device Architecture to compute solutions of ordinary differential equations. The numerical solution accuracy and computation time of the Parareal algorithm executing on the GPU are demonstrated on the single machine infinite bus test system. Two types of dynamic model of the single synchronous generator namely the classical and detailed models are studied. The numerical solutions of the ordinary differential equations computed by the Parareal algorithm are compared to that computed using the modified Euler’s method demonstrating the accuracy of the Parareal algorithm executing on GPU. Simulations are performed with varying numerical integration time steps, and the suitability of Parareal algorithm in computing near real-time solutions of ordinary different equations is presented. A speedup of 25× and 31× is achieved with the Parareal algorithm for classical and detailed dynamic models of the synchronous generator respectively compared to the sequential modified Euler’s method. The weak scaling efficiency of the Parareal algorithm when required to solve a large number of ordinary differential equations at each time step due to the increase in sequential computations and associated memory transfer latency between the CPU and GPU is discussed.
基金National Natural Science Foundation of China(No.51275502)Natural Science Key Project of Anhui Province(No.KJ2011A014)+1 种基金China Postdoctoral Science Foundation funded project(NO.2012M511416)The Innovation Foundationof Anhui University and the Personnel Construction Project of Anhui University
文摘In order to improve femtosecond laser throughput,a parallel processing system consisting of liquid crystal on silicon(LCOS)device as spatial light modulator is put forward.A method is described for displaying Fourier hologram on LCOS,and a high uniformity of several diffraction peaks in the computer reconstruction is achieved.Application of this method to the parallel femtosecond laser processing is also demonstrated,and two intersecting rings and three tangent rings are fabricated respectively by one time in the photoresist.
基金Supported by Project of National Natural Science Foundation(No.41874134)
文摘Processing large-scale 3-D gravity data is an important topic in geophysics field. Many existing inversion methods lack the competence of processing massive data and practical application capacity. This study proposes the application of GPU parallel processing technology to the focusing inversion method, aiming at improving the inversion accuracy while speeding up calculation and reducing the memory consumption, thus obtaining the fast and reliable inversion results for large complex model. In this paper, equivalent storage of geometric trellis is used to calculate the sensitivity matrix, and the inversion is based on GPU parallel computing technology. The parallel computing program that is optimized by reducing data transfer, access restrictions and instruction restrictions as well as latency hiding greatly reduces the memory usage, speeds up the calculation, and makes the fast inversion of large models possible. By comparing and analyzing the computing speed of traditional single thread CPU method and CUDA-based GPU parallel technology, the excellent acceleration performance of GPU parallel computing is verified, which provides ideas for practical application of some theoretical inversion methods restricted by computing speed and computer memory. The model test verifies that the focusing inversion method can overcome the problem of severe skin effect and ambiguity of geological body boundary. Moreover, the increase of the model cells and inversion data can more clearly depict the boundary position of the abnormal body and delineate its specific shape.
文摘This paper deals with a parallel processing uninterruptible power supply (UPS) for sudden voltage fluctuation in power management to integrate power quality improvement, load voltage stabilization and UPS. To reduce the complexity, cost and number of power conversions, which results in higher efficiency, only one voltage-controlled voltage source inverter (VCVSI) is used. The VCVSI is connected in series on the DC battery side and in parallel on the AC grid side with a decoupling inductor. The system provides sinusoidal voltage at the fundamental value of 220V/60Hz for the load during abnormal utility power conditions or grid failure. Also, the system can be operated to mitigate the harmonic current and voltage demand from nonlinear loads and provide voltage stabilization for loads when sudden voltage fluctuation occur, such as sag and swell. The experimental results confirm the system protects against outages caused by abnormal utility power conditions and sudden voltage fluctuations and change.