Suspicious mass traffic constantly evolves, making network behaviour tracing and structure more complex. Neural networks yield promising results when they use a sufficient number of processing elements with strong interconnections between them. They offer computationally efficient Hopfield neural network models and optimization constraints that exploit a high degree of parallelism to yield optimal results. Artificial neural networks (ANNs) offer optimal solutions for classifying and clustering various kinds of data, and the results obtained depend on how the problem is identified. In this research work, the design of optimized applications is presented in an organized manner. In addition, this work examines theoretical approaches to achieving optimized results using ANNs, focusing mainly on design rules. The optimizing design approach analyzes the internal process of the neural networks. Practices in developing the network are based on the interconnections among the hidden nodes and their learning parameters. With a suitable design, the methodology is shown to work best for nonlinear resource allocation problems and complex issues. The ANN proposed here has roughly 46k hidden nodes and 49 million connections and runs on fully parallel processors. The proposed ANN gave optimal results on real-world application problems; the results were obtained using MATLAB.
Parallel processing based on the free-running model test was adopted to predict the interaction force coefficients (flow straightening coefficient and wake fraction) of ship maneuvering. A real-coded multipopulation genetic algorithm (MPGA), which can simultaneously process free-running model data and ship maneuvering simulation data, was applied to solve the problem, and the optimal individual was obtained by the genetic algorithm. Parallel processing of multiple populations overcame the premature convergence seen in identification with a single population, while parallel processing of the ship maneuvering data (turning motion and zigzag motion) is an attempt to solve the coefficient drift problem. To validate the method, the interaction force coefficients were verified by the procedure, and the measured coefficients were compared with the identified ones. The maximum error is less than 5%, which shows that the identification method is effective.
As Big Data challenges grow, MapReduce-based systems have been widely welcomed because of their remarkable simplicity and scalability. However, ever since MapReduce was proposed, its debate with parallel DBMSs has never stopped, because MapReduce over-focuses on scalability and overlooks efficiency. Accordingly, extended systems have been proposed to improve performance on clusters of limited scale. In the meantime, traditional RDBMS technologies such as the structured data model, transactions and SQL are also getting more attention. This paper reviews such systems, from Google and from third parties, and tries to indicate directions for future research.
Large range cell migration is a severe challenge for spaceborne SAR imaging algorithms. Based on the design of a Finite Impulse Response (FIR) filter and the Range Doppler (RD) algorithm, a realization of quick-look imaging for large range cell migration is proposed. It achieves quick-look imaging at 8 times reduced resolution with parallel processing on a shared-memory 8-CPU SGI server. In simulation experiments, this quick-look imaging algorithm with parallel processing can image 16384×16384 SAR raw data within 6 seconds, which meets the requirement of real-time imaging.
The Long Term Evolution (LTE) system imposes strict requirements on dispatching delay. Moreover, the very high air interface rate of LTE requires strong processing capability from the devices that process the baseband signals. Consequently, a single-core processor cannot meet the requirements of an LTE system. This paper analyzes how to use multi-core processors to achieve parallel processing of uplink demodulation and decoding in LTE systems and designs an approach to parallel processing. The test results show that this approach works well.
The k-Nearest Neighbor method is one of the most popular techniques for both classification and regression. Because of the way it operates, its application may be limited to problems with a moderate number of instances, particularly when run time is a consideration. However, the classification of large amounts of data has become a fundamental task in many real-world applications, so it is logical to scale the k-Nearest Neighbor method to large-scale datasets. This paper proposes a new k-Nearest Neighbor classification method (KNN-CCL) which uses a parallel centroid-based and hierarchical clustering algorithm to separate the training dataset into multiple parts. The introduced clustering algorithm uses four stages of successive refinement and generates high-quality clusters. The k-Nearest Neighbor approach subsequently uses these clusters to predict the test datasets. Finally, sets of experiments are conducted on the UCI datasets. The experimental results confirm that the proposed k-Nearest Neighbor classification method performs well with regard to classification accuracy and performance.
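The idea of splitting the training set into parts and searching only the relevant part can be illustrated with a minimal sketch. This is not the KNN-CCL algorithm itself: the four-stage parallel clustering is not reproduced, the centroids are simply assumed to be given, and the names `partition_by_centroid` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def partition_by_centroid(train, centroids):
    """Assign each (features, label) pair to the part of its nearest centroid."""
    parts = [[] for _ in centroids]
    for features, label in train:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(features, centroids[i]))
        parts[nearest].append((features, label))
    return parts

def knn_predict(query, train, centroids, parts, k=3):
    """Vote among the k nearest neighbours inside the query's own part."""
    nearest = min(range(len(centroids)),
                  key=lambda i: math.dist(query, centroids[i]))
    # Fall back to a full scan if the chosen part happens to be empty.
    candidates = parts[nearest] or train
    neighbours = sorted(candidates, key=lambda fl: math.dist(query, fl[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy data and centroids, assumed for illustration only.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b")]
centroids = [(0.0, 0.0), (10.0, 10.0)]
parts = partition_by_centroid(train, centroids)
print(knn_predict((0.5, 0.5), train, centroids, parts))
```

Restricting the search to one part trades a small risk of missing a true neighbour near a cluster boundary for a much shorter distance scan, which is the usual motivation for this kind of partitioning.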
The ADSP-TS101 is a high-performance DSP with good parallel processing properties and high speed. To meet the real-time processing requirements of underwater acoustic communication algorithms, a real-time parallel processing system with multi-channel synchronous sampling, composed of multiple ADSP-TS101s, is designed and implemented. In the hardware design, field programmable gate array (FPGA) logic control is adopted for the multi-channel synchronous sampling module, and a combined cluster/data-flow pin connection mode is adopted for the multiprocessor parallel processing configuration. The software is optimized with two kinds of communication: broadcast writes over the shared bus and point-to-point transfers over the link ports. After full system installation, integration debugging and experiments in a lake, the results show that the real-time parallel processing system has good stability and real-time processing capability and meets the technical design requirements for real-time processing.
This paper deals with a parallel processing uninterruptible power supply (UPS) for sudden voltage fluctuations in power management, integrating power quality improvement, load voltage stabilization and UPS functions. To reduce the complexity, cost and number of power conversions, and thereby raise efficiency, only one voltage-controlled voltage source inverter (VCVSI) is used. The VCVSI is connected in series on the DC battery side and in parallel on the AC grid side through a decoupling inductor. The system provides sinusoidal voltage at the fundamental value of 220 V/60 Hz for the load during abnormal utility power conditions or grid failure. The system can also be operated to mitigate the harmonic current and voltage demand from nonlinear loads and to stabilize the load voltage when sudden voltage fluctuations such as sags and swells occur. The experimental results confirm that the system protects against outages caused by abnormal utility power conditions and sudden voltage fluctuations and changes.
In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of color images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project were optimizing computation time and improving overall efficiency, particularly in the grayscale conversion of color images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy that considers factors like load balancing and minimizing communication overhead. Despite these challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP's effectiveness in accelerating image manipulation tasks.
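The row-wise work distribution described above can be sketched as follows. This is an illustrative stand-in, not the authors' code: it uses a Python standard-library thread pool in place of OpenMP, the function names are hypothetical, and the ITU-R BT.601 luma weights are an assumed choice of grayscale formula. In CPython the global interpreter lock prevents a real speedup for this pure-Python loop, so the sketch shows only how the image is decomposed and reassembled.

```python
from concurrent.futures import ThreadPoolExecutor

def to_gray(pixel):
    """Weighted sum of R, G, B using the BT.601 luma weights (assumed here)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def gray_rows(rows):
    """Sequential conversion of a block of rows."""
    return [[to_gray(p) for p in row] for row in rows]

def gray_parallel(image, workers=4):
    """Split the image into contiguous row blocks, one task per block."""
    chunk = max(1, -(-len(image) // workers))  # ceiling division
    blocks = [image[i:i + chunk] for i in range(0, len(image), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(gray_rows, blocks))  # map preserves block order
    return [row for block in results for row in block]

image = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (10, 20, 30)]]
print(gray_parallel(image, workers=2))
```

Because every output pixel depends only on its own input pixel, the blocks share no state, which is why this task distributes cleanly across workers.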
Parallel image processing with OpenMP provides real-time capabilities and computational efficiency. However, race conditions can affect the accuracy and reliability of the outcomes. This paper highlights the importance of addressing race conditions in parallel image processing, specifically focusing on color inverse filtering using OpenMP. We considered three solutions to race conditions, each with distinct characteristics: (1) #pragma omp atomic, which protects individual memory operations for fine-grained control; (2) #pragma omp critical, which protects entire code blocks for exclusive access; and (3) #pragma omp parallel sections reduction, which employs a reduction clause for safe aggregation of values across threads. Our findings show that the produced images were unaffected by the race conditions. However, it becomes evident that resolving the race conditions in the code makes it significantly faster, especially when it is executed on multiple cores.
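The difference between guarding a shared update and aggregating private partial results can be sketched in Python. This is an analogy under stated assumptions, not OpenMP code: a `threading.Lock` plays the role of a guarded (critical-style) update, and per-task partial sums merged afterwards play the role of a reduction. The shared quantity (the sum of inverted pixel values) and the function names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def invert_with_lock(image, workers=4):
    """Invert 8-bit values; a lock guards the shared running total."""
    total = 0
    lock = threading.Lock()
    result = [None] * len(image)

    def work(i):
        nonlocal total
        row = [255 - v for v in image[i]]
        result[i] = row          # each index is written by exactly one task
        with lock:               # only one thread updates the total at a time
            total += sum(row)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(work, range(len(image))))
    return result, total

def invert_with_reduction(image, workers=4):
    """Each task returns a private partial sum; partials are merged at the end."""
    def work(row):
        out = [255 - v for v in row]
        return out, sum(out)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pairs = list(pool.map(work, image))
    rows = [out for out, _ in pairs]
    total = sum(partial for _, partial in pairs)
    return rows, total

image = [[0, 255], [10, 20]]
print(invert_with_lock(image))
print(invert_with_reduction(image))
```

The reduction-style version needs no synchronization inside the tasks, which is one common reason such a formulation runs faster than a lock around every update.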
Using the methods of mathematical morphology, this paper performs filtering, segmentation and extraction of morphological features of satellite cloud images. It also presents the corresponding algorithms, which are implemented by parallel C programming on Transputer networks. The approach has been successfully used to process typhoon and low tornado cloud images, and it will be used in weather forecasting.
In H.264, the computational complexity and memory access of deblocking filters are variable and depend on the video content. This paper proposes a VLSI architecture for deblocking filters with adaptive dynamic power, which avoids redundant computations and memory accesses by excluding the blocks that can be skipped. The vertical and horizontal edges are processed simultaneously in an advanced scan order to speed up the decoder. As a result, the dynamic power of the proposed architecture can be reduced adaptively (by up to about 89%) for different videos, and the off-chip memory access is improved compared to previous designs. Moreover, the processing capability of the proposed architecture is well suited to real-time deblocking of high-definition television video (HDTV, 1920×1080 pixels/frame, 60 frames/s) when operating at 62 MHz. Compared with previous designs, the proposed architecture reduces power by up to about 89% and processing time by 25% to 81%.
Purpose - The purpose of this paper is to introduce new implementations for parallel processing applications using bijective systolic networks and the corresponding carbon-based field-emission controlled switching. The implementations are carried out in the reversible domain to perform the required bijective parallel computing, and they cover parallel computations that use the presented field-emission controlled switching together with their m-ary (many-valued) extensions for use in nano systolic networks. The first part of the paper presents important fundamentals of systolic computing and carbon-based field emission that are used in the implementations in the second part of the paper. Design/methodology/approach - The introduced systolic systems use recent findings in field emission and nano applications to implement the functionality of the basic bijective systolic network. This includes many-valued systolic computing via field-emission techniques using carbon-based nanotubes and nanotips. The realization of bijective logic circuits in current and emerging technologies can be very important for various reasons. Reducing power consumption is a major requirement for circuit design in future technologies, and thus the new nano systolic circuits can play an important role in the design of circuits that consume minimal power for future applications such as low-power signal processing. In addition, the implemented bijective systems can be used for massively parallel processing and can thus obtain very high processing performance, while also exploiting the significant size reduction within the nano domain. The extensions of the implementations to field-emission-based many-valued systolic networks using the introduced bijective nano systolic architectures are also presented. Findings - Novel bijective systolic architectures using nano-based field-emission implementations are introduced in this paper, and the implementation using the general scheme of many-valued computing is presented. The carbon-based field-emission implementation of nano systolic networks is also introduced. This is accomplished using the introduced carbon-based field-emission devices, in which field emission from carbon nanotubes and nano-apex carbon fibers is utilized. The implementations of the many-valued bijective systolic networks using the introduced nano-based architectures are also presented. Originality/value - The introduced bijective systolic implementations form important new directions for systolic realizations using the newly emerging nano-based technologies. The 2-to-1 multiplexer is a basic building block in "switch logic," in which a logic circuit is realized as a combination of switches rather than a combination of logic gates as in gate logic. This proves less costly when synthesizing a wide variety of modern multiplexer-based circuits and systems, since nano implementations occupy very compact space and carbon-based devices switch reliably using much less power than silicon-based devices. The introduced implementations for nano systolic computation are new and interesting for design in future nanotechnologies that require optimal design specifications of minimum power consumption and minimum layout size, such as low-power control of autonomous robots and adiabatic low-power very-large-scale-integration circuit design for signal processing applications.
Purpose – The purpose of this paper is to introduce new implementations for parallel processing applications using bijective systolic networks and their corresponding carbon-based field-emission controlled switching. The implementations are carried out in the reversible domain to perform the required bijective parallel computing, and they cover parallel computations that use the presented field-emission controlled switching together with their many-valued (m-ary) extensions for use in nano systolic networks. The second part of the paper introduces the implementation of systolic computing using the two-to-one controlled switching via carbon-based field emission that was presented in the first part of the paper, and also introduces the computational extension to the general case of many-valued (m-ary) systolic networks using many-to-one carbon-based field emission. Design/methodology/approach – The introduced systolic systems use recent findings in field emission and nano applications to implement the functionality of the basic bijective systolic network. This includes many-valued systolic computing via field-emission techniques using carbon-based nanotubes and nanotips. The realization of bijective logic circuits in current and emerging technologies can be very important for various reasons. Reducing power consumption is a major requirement for circuit design in future technologies, and thus the new nano systolic circuits can play an important role in the design of circuits that consume minimal power for future applications such as low-power signal processing. In addition, the implemented bijective systems can be used for massively parallel processing and can thus obtain very high processing performance, while also exploiting the significant size reduction within the nano domain. The extensions of the implementations to field-emission-based many-valued systolic networks using the introduced bijective nano systolic architectures are also presented. Findings – Novel bijective systolic architectures using nano-based field-emission implementations are introduced in this paper, and the implementation using the general scheme of many-valued computing is presented. The carbon-based field-emission implementation of nano systolic networks is also introduced. This is accomplished using the introduced carbon-based field-emission devices, in which field emission from carbon nanotubes and nano-apex carbon fibers is utilized. The implementations of the many-valued bijective systolic networks using the introduced nano-based architectures are also presented. Practical implications – The introduced bijective systolic implementations form important new directions for systolic realizations using the newly emerging nano-based technologies. The 2-to-1 multiplexer is a basic building block in "switch logic," in which a logic circuit is realized as a combination of switches rather than a combination of logic gates as in gate logic. This proves less costly when synthesizing a wide variety of modern multiplexer-based circuits and systems, since nano implementations occupy very compact space and carbon-based devices switch reliably using much less power than silicon-based devices. The introduced implementations for nano systolic computation are new and interesting for design in future nanotechnologies that require optimal design specifications of minimum power consumption and minimum layout size, such as low-power control of autonomous robots and adiabatic low-power VLSI circuit design for signal processing applications. Originality/value – The introduced bijective systolic implementations form important new directions for systolic realizations using the newly emerging nanotechnologies. The introduced implementations for nano systolic computation are new and interesting for design in future nanotechnologies that require optimal design specifications of high performance, minimum power and minimum size.
To improve femtosecond laser throughput, a parallel processing system that uses a liquid crystal on silicon (LCOS) device as a spatial light modulator is put forward. A method is described for displaying a Fourier hologram on the LCOS, and high uniformity among several diffraction peaks is achieved in the computer reconstruction. Application of this method to parallel femtosecond laser processing is also demonstrated: two intersecting rings and three tangent rings are each fabricated in the photoresist in a single exposure.
The paper presents the implementation of a parallel version of the FDK (Feldkamp, Davis and Kress) algorithm on graphics processing units (GPUs). It briefly discusses some elements of computed tomography and the FDK algorithm, and presents some ideas about GPUs and their use in general-purpose computing. The paper then shows a computational implementation of the FDK algorithm and how this implementation was parallelized. The parallel version of the algorithm is compared with the sequential version, using speedup as the performance metric. Performance was evaluated on two GPUs, a GeForce 9400GT (16 cores), a low-capacity GPU, and a Quadro 2000 (192 cores), a medium-capacity GPU; a speedup of 3.37 was reached.
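As a reminder of how the speedup metric is computed, a minimal sketch follows. The function names and the timings in the usage line are hypothetical; they are chosen only so that the ratio matches the reported figure and are not measurements from the paper.

```python
def speedup(t_sequential, t_parallel):
    """Speedup = sequential run time divided by parallel run time."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, workers):
    """Parallel efficiency = speedup divided by the number of workers."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup(t_sequential, t_parallel) / workers

# Hypothetical timings in seconds, for illustration only.
print(speedup(33.7, 10.0))
```

Efficiency is most meaningful when sequential and parallel runs use comparable cores; comparing a CPU baseline against GPU core counts, as in this setting, is better summarized by speedup alone.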
To meet the demand for highly efficient, real-time computer-assisted diagnosis and screening in medicine, improving the efficiency of parallel medical image processing is of great importance. This article proposes improved strategies for parallel medical image processing applications, which are grouped into two categories. For each category an individual strategy is devised, including a theoretical algorithm for minimizing the execution time. An experiment using mammograms supports the validity of the theoretical analysis, with a reasonable difference between the theoretical and measured values, and also shows that the improved strategies greatly improve the efficiency of parallel medical image processing.
The finite element method (FEM) is a key tool in computational electromagnetics for designing RF (Radio Frequency) components such as waveguides. Frequency-domain analysis is fundamental to identifying the characteristics of the components. In conventional frequency-domain electromagnetic analysis using FEM, the system matrix is complex-valued as well as indefinite. Iterative solvers can be faster than a direct solver when convergence is guaranteed and reached in a few steps. However, it is hard to exploit the merits of iterative solvers on such complex-valued, indefinite systems. It is also hard to benefit from matrix factorization techniques because parts of the system matrix vary with frequency. Overall, conventional iterative solvers are hard to adopt even though the system matrix is sparse. In this paper, a new parallel iterative FEM solver for frequency-domain analysis is implemented for inhomogeneous waveguide structures. In this implementation, the previous solution from Matlab's (Matrix Laboratory) preconditioned iterative solver is used as the initial guess for the next step's solution process. An overlapped parallel stage using Matlab's Parallel Computing Toolbox is also proposed to alleviate cold starting, which ruins the convergence of the early steps in each parallel stage. Numerical experiments based on waveguide structures have demonstrated the accuracy and efficiency of the proposed scheme.
Three parallel anaerobic-anoxic/anaerobic-aerobic (AN/AO) processes were developed to enrich denitrifying phosphorus removal bacteria (DPB) for low-strength wastewater treatment. The main body of the parallel AN/AO process consists of an AN (anaerobic-anoxic) process and an AO (anaerobic-aerobic) process. In the AO process, the common phosphorus accumulating organisms (PAOs) were dominant, while in the AN process, DPB were dominant. The volume ratio of anaerobic zone (Vana) : anoxic zone (Vano) : aerobic zone (Vaer) for the parallel AN/AO process is 1:1:1, in contrast with a Vana:Vaer and Vano:Vaer of 1:2 and 1:4 for a traditional biological nutrient removal (BNR) process. Process 3 performed best of the three processes in terms of COD, TN and TP removal. Over 4 months of operation, the effluent COD concentration of process 3 did not exceed 60 mg/L, the effluent TN concentration was lower than 15 mg/L, and the effluent TP concentration was lower than 1 mg/L.
To improve the image processing speed and detection precision of a strip surface detection system, the design of a parallel image processing system and the methods of algorithm implementation have been studied, based on an analysis of the characteristics of the image data and image processing in such a system. Using a field programmable gate array (FPGA) as the hardware platform and taking the characteristics of the strip surface detection system into account, a parallel image processing system implemented with multiple IP cores is designed. Depending on the computing tasks and the load balancing capability of the parallel processing system, the number of computing nodes can be configured to meet the system's demand and save hardware cost.
Funding (ANN-based optimization paper): This research is funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R151), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Funding (ship maneuvering MPGA paper): the Knowledge-based Ship-design Hyper-integrated Platform (KSHIP) of the Ministry of Education, China.
Funding (MapReduce review paper): the National Natural Science Foundation of China under Grants No. 61370091 and No. 61170200; Jiangsu Province Science and Technology Support Program (Industry) Project under Grant No. BE2012179.
Abstract: Large range cell migration is a severe challenge for spaceborne SAR imaging algorithms. Based on Finite Impulse Response (FIR) filter design and the Range Doppler (RD) algorithm, a realization of quick-look imaging for large range cell migration is proposed. It achieves quick-look imaging at 8-times-reduced resolution with parallel processing on a shared-memory 8-CPU SGI server. According to simulation experiments, this quick-look imaging algorithm with parallel processing can image 16384×16384 SAR raw data within 6 seconds, which meets the requirement of real-time imaging.
Abstract: The Long Term Evolution (LTE) system imposes strict requirements on dispatching delay. Moreover, the very high air interface rate of LTE requires strong processing capability from the devices that process the baseband signals. Consequently, a single-core processor cannot meet the requirements of an LTE system. This paper analyzes how to use multi-core processors to achieve parallel processing of uplink demodulation and decoding in LTE systems and designs an approach to parallel processing. The test results show that this approach works well.
Funding: The authors received no specific funding for this work.
Abstract: The k-Nearest Neighbor method is one of the most popular techniques for both classification and regression. Because of the way it operates, its application may be limited to problems with a limited number of instances, particularly when run time is a consideration. However, the classification of large amounts of data has become a fundamental task in many real-world applications, so it is logical to scale the k-Nearest Neighbor method to large datasets. This paper proposes a new k-Nearest Neighbor classification method (KNN-CCL) which uses a parallel centroid-based and hierarchical clustering algorithm to split the training dataset into multiple parts. The clustering algorithm uses four stages of successive refinement and generates high-quality clusters. The k-Nearest Neighbor approach subsequently uses them to predict the test datasets. Finally, sets of experiments are conducted on UCI datasets. The experimental results confirm that the proposed k-Nearest Neighbor classification method performs well with regard to classification accuracy and performance.
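The general idea of partitioning the training set by centroids and then searching only one part per query can be sketched as follows. This is not the KNN-CCL algorithm: the four-stage parallel clustering of the paper is replaced here by plain Lloyd's k-means, and the function names (`kmeans`, `fit`, `predict`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means; the first k points are the initial centroids."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[j].append(p)
        # An empty group keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def fit(train, n_clusters):
    """Split labelled training samples into parts, one per centroid."""
    centroids = kmeans([x for x, _ in train], n_clusters)
    parts = [[] for _ in centroids]
    for x, y in train:
        j = min(range(len(centroids)), key=lambda c: math.dist(x, centroids[c]))
        parts[j].append((x, y))
    return centroids, parts


def predict(model, query, k=3):
    """Run k-NN only inside the part whose centroid is closest to the query.
    Assumes that part is non-empty, which holds for the toy data below."""
    centroids, parts = model
    j = min(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    neighbours = sorted(parts[j], key=lambda s: math.dist(query, s[0]))[:k]
    return Counter(y for _, y in neighbours).most_common(1)[0][0]


# Toy data: two well-separated groups (hypothetical).
train = [((0, 0), "a"), ((5, 5), "b"), ((0, 1), "a"),
         ((1, 0), "a"), ((5, 6), "b"), ((6, 5), "b")]
model = fit(train, n_clusters=2)
label = predict(model, (0.2, 0.3))
```

Each query is compared with one centroid per part plus the samples of a single part, instead of every training sample, which is where the run-time saving comes from; the price is that true neighbours lying just across a part boundary can be missed.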
Funding: Sponsored by the National Natural Science Foundation of China (60572098)
Abstract: ADSP-TS101 is a high-performance DSP with good parallel processing properties and high speed. To meet the real-time processing requirements of underwater acoustic communication algorithms, a real-time parallel processing system with multi-channel synchronous sampling, composed of multiple ADSP-TS101s, is designed and implemented. In the hardware design, field programmable gate array (FPGA) logic control is adopted for the multi-channel synchronous sampling module, and a cluster/data-flow associated pin connection mode is adopted for the multiprocessor parallel processing configuration. The software is optimized with two communication modes: broadcast writes over the shared bus and point-to-point transfers over link ports. Through whole-system installation, integrated debugging, and lake experiments, the results show that the real-time parallel processing system has good stability and real-time processing capability and meets the technical design requirements of real-time processing.
Abstract: This paper deals with a parallel processing uninterruptible power supply (UPS) for sudden voltage fluctuations in power management, integrating power quality improvement, load voltage stabilization and UPS functions. To reduce the complexity, cost and number of power conversions, and thereby raise efficiency, only one voltage-controlled voltage source inverter (VCVSI) is used. The VCVSI is connected in series on the DC battery side and in parallel on the AC grid side with a decoupling inductor. The system provides sinusoidal voltage at the fundamental value of 220 V/60 Hz for the load during abnormal utility power conditions or grid failure. The system can also be operated to mitigate the harmonic current and voltage demand from nonlinear loads and to stabilize the load voltage when sudden voltage fluctuations, such as sags and swells, occur. The experimental results confirm that the system protects against outages caused by abnormal utility power conditions and sudden voltage fluctuations.
Abstract: In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of color images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of color images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP's effectiveness in accelerating image manipulation tasks.
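The work distribution described in this abstract can be sketched in Python, with a thread pool standing in for the OpenMP threads. This is an analogy under stated assumptions, not the authors' code: the BT.601 luma weights, the row-block split and the function names are hypothetical, and because of CPython's global interpreter lock this pure-Python version illustrates the decomposition rather than delivering a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def to_gray(pixel):
    r, g, b = pixel
    # BT.601 luma weights (an assumption; the abstract does not name a formula).
    return int(0.299 * r + 0.587 * g + 0.114 * b + 0.5)


def gray_rows(rows):
    """Convert a block of rows; each pixel is independent of all others."""
    return [[to_gray(p) for p in row] for row in rows]


def parallel_grayscale(image, workers=4):
    """Split the image into contiguous row blocks, convert each block in a
    worker, and stitch the blocks back together in order."""
    chunk = max(1, (len(image) + workers - 1) // workers)
    blocks = [image[i:i + chunk] for i in range(0, len(image), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(gray_rows, blocks)  # map preserves block order
        return [row for block in results for row in block]


def efficiency(t_serial, t_parallel, cores):
    """Parallel efficiency = speedup / number of cores."""
    return (t_serial / t_parallel) / cores


image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)],
         [(0, 0, 0), (10, 20, 30)]]
gray = parallel_grayscale(image, workers=2)
```

The `efficiency` helper makes the declining-efficiency observation concrete: a run that is 4 times faster on 8 cores has an efficiency of 0.5, so half of the added hardware is lost to overhead or imbalance.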
Abstract: Parallel image processing with OpenMP provides real-time capabilities and computational efficiency. However, race conditions can affect the accuracy and reliability of the outcomes. This paper highlights the importance of addressing race conditions in parallel image processing, focusing specifically on color inverse filtering using OpenMP. We considered three solutions to race conditions, each with distinct characteristics: (1) #pragma omp atomic, which protects individual memory operations for fine-grained control; (2) #pragma omp critical, which protects entire code blocks for exclusive access; (3) #pragma omp parallel sections reduction, which employs a reduction clause for safe aggregation of values across threads. Our findings show that the produced images were unaffected by the race conditions. However, it becomes evident that resolving the race conditions makes the code significantly faster, especially when it is executed on multiple cores.
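The difference between guarding a shared variable and using a reduction can be illustrated with Python threads. This is only an analogy to the OpenMP constructs named in the abstract: a `threading.Lock` around each update plays the role of `atomic`/`critical`, and per-worker partial sums combined at the end play the role of a reduction clause. The function names and the task (summing inverted pixel values) are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def invert(value):
    """Color inversion of one 8-bit channel value."""
    return 255 - value


def sum_with_lock(chunks):
    """Lock-guarded shared accumulator: every update is serialized."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for v in chunk:
            with lock:  # without the lock, the read-modify-write could race
                total += invert(v)

    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        list(pool.map(work, chunks))  # drain the iterator to wait for all workers
    return total


def sum_with_reduction(chunks):
    """Reduction style: private partial sums, combined once at the end."""
    def work(chunk):
        return sum(invert(v) for v in chunk)

    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return sum(pool.map(work, chunks))


chunks = [[0, 255, 10], [100, 200], [5]]
locked = sum_with_lock(chunks)
reduced = sum_with_reduction(chunks)
```

Both versions return the same total; the reduction version touches shared state once per worker instead of once per pixel, which is why reductions tend to scale better than fine-grained locking.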
Abstract: Using the methods of mathematical morphology, this paper performs filtering, segmentation and morphological feature extraction on satellite cloud images. It also gives the corresponding algorithms, which are implemented by parallel C programming on Transputer networks. The approach has been successfully used to process typhoon and low tornado cloud images, and it will be used in weather forecasting.
Funding: Project (No. NSS’USA5978) supported by the National Science Foundation of the United States under the East Asia Pacific Program
Abstract: In H.264, the computational complexity and memory access of deblocking filters are variable and depend on the video content. This paper proposes a VLSI architecture for deblocking filters with adaptive dynamic power, which avoids redundant computations and memory accesses by excluding the blocks that can be skipped. The vertical and horizontal edges are processed simultaneously in an advanced scan order to speed up the decoder. As a result, the dynamic power of the proposed architecture can be reduced adaptively (by up to about 89%) for different videos, and off-chip memory access is improved compared with previous designs. Moreover, the processing capability of the proposed architecture is particularly suited to real-time deblocking of high-definition television (HDTV, 1920×1080 pixels/frame, 60 frames/s) video at 62 MHz. Using the proposed architecture, power can be reduced by up to about 89% and processing time by 25% to 81% compared with previous designs.
Funding: This research was performed during sabbatical leave in 2015-2016 granted to the author from The University of Jordan and spent at Philadelphia University.
Abstract: Purpose: The purpose of this paper is to introduce new implementations for parallel processing applications using bijective systolic networks and the corresponding carbon-based field-emission controlled switching. The implementations are developed in the reversible domain to perform the required bijective parallel computing; implementations for parallel computations that utilize the presented field-emission controlled switching, and their corresponding m-ary (many-valued) extensions for use in nano systolic networks, are introduced. The first part of the paper presents important fundamentals of systolic computing and carbon-based field emission that are utilized in the implementations in the second part of the paper. Design/methodology/approach: The introduced systolic systems utilize recent findings in field emission and nano applications to implement the functionality of the basic bijective systolic network. This includes many-valued systolic computing via field-emission techniques using carbon-based nanotubes and nanotips. The realization of bijective logic circuits in current and emerging technologies can be very important for various reasons. Reducing power consumption is a major requirement for circuit design in future technologies, and thus the new nano systolic circuits can play an important role in the design of circuits that consume minimal power for future applications such as low-power signal processing. In addition, the implemented bijective systems can be used to implement massively parallel processing and thus obtain very high processing performance, while also exploiting the significant size reduction within the nano domain. The extensions of the implementations to field-emission-based many-valued systolic networks using the introduced bijective nano systolic architectures are also presented. Findings: Novel bijective systolic architectures using nano-based field-emission implementations are introduced in this paper, and the implementation using the general scheme of many-valued computing is presented. The carbon-based field-emission implementation of nano systolic networks is also introduced. This is accomplished using the introduced field-emission carbon-based devices, where field emission from carbon nanotubes and nano-apex carbon fibers is utilized. The implementations of the many-valued bijective systolic networks utilizing the introduced nano-based architectures are also presented. Originality/value: The introduced bijective systolic implementations form important new directions in systolic realizations using the newly emerging nano-based technologies. The 2-to-1 multiplexer is a basic building block in “switch logic,” where a logic circuit is realized as a combination of switches rather than as a combination of logic gates as in gate logic. Switch logic proves less costly in synthesizing a wide variety of modern multiplexer-based circuits and systems, since nano implementations occupy very compact space and carbon-based devices switch reliably using much less power than silicon-based devices. The introduced implementations for nano systolic computation are new and of interest for design in future nanotechnologies that require optimal design specifications of minimum power consumption and minimum layout size, such as low-power control of autonomous robots and adiabatic low-power very-large-scale-integration (VLSI) circuit design for signal processing applications.
Abstract: Purpose: The purpose of this paper is to introduce new implementations for parallel processing applications using bijective systolic networks and their corresponding carbon-based field-emission controlled switching. The implementations are developed in the reversible domain to perform the required bijective parallel computing; implementations for parallel computations that utilize the presented field-emission controlled switching, and their corresponding many-valued (m-ary) extensions for use in nano systolic networks, are introduced. The second part of the paper introduces the implementation of systolic computing using the two-to-one controlled switching via carbon-based field emission that was presented in the first part of the paper, and the computational extension to the general case of many-valued (m-ary) systolic networks utilizing many-to-one carbon-based field emission is also introduced. Design/methodology/approach: The introduced systolic systems utilize recent findings in field emission and nano applications to implement the functionality of the basic bijective systolic network. This includes many-valued systolic computing via field-emission techniques using carbon-based nanotubes and nanotips. The realization of bijective logic circuits in current and emerging technologies can be very important for various reasons. Reducing power consumption is a major requirement for circuit design in future technologies, and thus the new nano systolic circuits can play an important role in the design of circuits that consume minimal power for future applications such as low-power signal processing. In addition, the implemented bijective systems can be used to implement massively parallel processing and thus obtain very high processing performance, while also exploiting the significant size reduction within the nano domain. The extensions of the implementations to field-emission-based many-valued systolic networks using the introduced bijective nano systolic architectures are also presented. Findings: Novel bijective systolic architectures using nano-based field-emission implementations are introduced in this paper, and the implementation using the general scheme of many-valued computing is presented. The carbon-based field-emission implementation of nano systolic networks is also introduced. This is accomplished using the introduced field-emission carbon-based devices, where field emission from carbon nanotubes and nano-apex carbon fibers is utilized. The implementations of the many-valued bijective systolic networks utilizing the introduced nano-based architectures are also presented. Practical implications: The introduced bijective systolic implementations form important new directions in systolic realizations using the newly emerging nano-based technologies. The 2-to-1 multiplexer is a basic building block in “switch logic,” where a logic circuit is realized as a combination of switches rather than as a combination of logic gates as in gate logic. Switch logic proves less costly in synthesizing a wide variety of modern multiplexer-based circuits and systems, since nano implementations occupy very compact space and carbon-based devices switch reliably using much less power than silicon-based devices. The introduced implementations for nano systolic computation are new and of interest for design in future nanotechnologies that require optimal design specifications of minimum power consumption and minimum layout size, such as low-power control of autonomous robots and adiabatic low-power VLSI circuit design for signal processing applications. Originality/value: The introduced bijective systolic implementations form important new directions in systolic realizations utilizing the newly emerging nanotechnologies. The introduced implementations for nano systolic computation are new and of interest for design in future nanotechnologies that require optimal design specifications of high performance, minimum power and minimum size.
Funding: National Natural Science Foundation of China (No. 51275502); Natural Science Key Project of Anhui Province (No. KJ2011A014); China Postdoctoral Science Foundation funded project (No. 2012M511416); the Innovation Foundation of Anhui University and the Personnel Construction Project of Anhui University
Abstract: To improve femtosecond laser throughput, a parallel processing system that uses a liquid crystal on silicon (LCOS) device as the spatial light modulator is put forward. A method for displaying a Fourier hologram on the LCOS is described, and high uniformity among several diffraction peaks is achieved in the computer reconstruction. Application of this method to parallel femtosecond laser processing is also demonstrated: two intersecting rings and three tangent rings are each fabricated in the photoresist in a single pass.
Abstract: The paper presents the implementation of a parallel version of the FDK (Feldkamp, Davis and Kress) algorithm using graphics processing units. Some elements of computed tomography and the FDK algorithm are discussed briefly, and some ideas about GPUs (Graphics Processing Units) and their use in general-purpose computing are presented. The paper shows a computational implementation of the FDK algorithm and the process of parallelizing this implementation. The parallel version of the algorithm is compared with the sequential version, using speedup as the performance metric. To evaluate the performance of the parallel version, two GPUs were used, a GeForce 9400GT (16 cores), a low-capacity GPU, and a Quadro 2000 (192 cores), a medium-capacity GPU, and a speedup of 3.37 was reached.
Funding: SEC E-Institute: Shanghai High Institutions Grid Project; National Natural Science Foundation of China (Grant Nos. 60503039, 10778604 and 60773148); China's National Fundamental Research 973 Program (Grant No. 2004CB217903)
Abstract: To meet the demands of highly efficient, real-time computer-assisted diagnosis and screening in medicine, improving the efficiency of parallel medical image processing is of great importance. This article proposes improved strategies for parallel medical image processing applications, which are categorized into two classes. For each class an individual strategy is devised, including a theoretical algorithm for minimizing the execution time. An experiment using mammograms not only confirms the validity of the theoretical analysis, with a reasonable difference between the theoretical and measured values, but also shows that adopting the improved strategies greatly improves the efficiency of parallel medical image processing.
Funding: Supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00098, Advanced and Integrated Software Development for Electromagnetic Analysis), and by the Research Assistance Program (2021) of Incheon National University.
Abstract: The finite element method is a key player in computational electromagnetics for designing RF (Radio Frequency) components such as waveguides. Frequency-domain analysis is fundamental to identifying the characteristics of the components. In conventional frequency-domain electromagnetic analysis using FEM (Finite Element Method), the system matrix is complex-valued as well as indefinite. Iterative solvers can be faster than a direct solver when convergence is guaranteed and reached in a few steps. However, for such complex-valued, indefinite systems it is hard to exploit the merits of an iterative solver. It is also hard to benefit from matrix factorization techniques, because parts of the system matrix vary with frequency. Overall, it is hard to adopt conventional iterative solvers even though the system matrix is sparse. In this paper, a new parallel iterative FEM solver for frequency-domain analysis is implemented for inhomogeneous waveguide structures. In this implementation, the previous solution from the preconditioned iterative solver of Matlab (Matrix Laboratory) is used as the initial guess for the next step's solution process. An overlapped parallel stage using Matlab's Parallel Computing Toolbox is also proposed to alleviate the cold start, which ruins the convergence of the early steps in each parallel stage. Numerical experiments on waveguide structures have demonstrated the accuracy and efficiency of the proposed scheme.
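The warm-start idea, reusing the previous step's solution as the initial guess for the next step of a sweep, can be shown on a toy problem. The sketch below is an assumption-laden stand-in: it uses Jacobi iteration on a small real, diagonally dominant matrix with a scalar shift playing the role of the frequency parameter, whereas the paper works with Matlab's preconditioned iterative solver on complex-valued, indefinite FEM systems. The names `jacobi` and `sweep` and all numbers are hypothetical.

```python
def jacobi(A, b, x0, tol=1e-10, max_iter=10_000):
    """Jacobi iteration; returns (solution, number of iterations used)."""
    n = len(b)
    x = list(x0)
    for it in range(1, max_iter + 1):
        new = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
               for i in range(n)]
        if max(abs(u - v) for u, v in zip(new, x)) < tol:
            return new, it
        x = new
    raise RuntimeError("Jacobi did not converge")


def sweep(A, b, shifts, warm_start):
    """Solve (A + s*I) x = b for each shift s; optionally reuse the last solution."""
    n = len(b)
    x_prev = [0.0] * n
    total = 0
    solutions = []
    for s in shifts:
        As = [[A[i][j] + (s if i == j else 0.0) for j in range(n)]
              for i in range(n)]
        x0 = x_prev if warm_start else [0.0] * n
        x_prev, its = jacobi(As, b, x0)
        solutions.append(x_prev)
        total += its
    return solutions, total


# Toy system (hypothetical): diagonally dominant, so Jacobi converges.
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]
shifts = [0.0, 0.1, 0.2, 0.3]
cold_solutions, cold_iters = sweep(A, b, shifts, warm_start=False)
warm_solutions, warm_iters = sweep(A, b, shifts, warm_start=True)
```

Because neighbouring shifts give nearby solutions, the warm-started sweep begins each solve close to the answer and needs fewer iterations in total. The first solve of any sweep still starts cold, which is the effect the overlapped parallel stages in the abstract are meant to soften when the sweep is split across workers.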
Funding: The Shuguang Program of Shanghai Education Committee (No. 03SG20)
Abstract: Three parallel anaerobic-anoxic/anaerobic-aerobic (AN/AO) processes were developed to enrich denitrifying phosphorus removal bacteria (DPB) for low-strength wastewater treatment. The main body of the parallel AN/AO process consists of an AN (anaerobic-anoxic) process and an AO (anaerobic-aerobic) process. In the AO process, the common phosphorus accumulating organisms (PAOs) were dominant, while in the AN process DPB were dominant. The volume ratio of anaerobic zone (Vana) : anoxic zone (Vano) : aerobic zone (Vaer) for the parallel AN/AO process is 1:1:1, in contrast with Vana:Vaer and Vano:Vaer of 1:2 and 1:4 for a traditional biological nutrient removal (BNR) process. Process 3 performed best of the three processes in terms of COD, TN and TP removal. Over 4 months of operation, the effluent COD concentration of process 3 did not exceed 60 mg/L, the effluent TN concentration was lower than 15 mg/L, and the effluent TP concentration was lower than 1 mg/L.
Funding: The 111 Project (B07018); supported by the Program for Changjiang Scholars and Innovative Research Team in University (IRT0423)
Abstract: To improve the image processing speed and detection precision of a strip surface detection system, the design of a parallel image processing system and methods of algorithm implementation were studied, based on an analysis of the characteristics of the image data and image processing in such a system. Using a field programmable gate array (FPGA) as the hardware platform and considering the characteristics of the strip surface detection system, a parallel image processing system implemented with multiple IP cores is designed. According to the different computing tasks and the load-balancing capability of the parallel processing system, the number of computing nodes can be set to meet the system's demands and save hardware cost.