Funding: the Knowledge-based Ship-design Hyper-integrated Platform (KSHIP) of the Ministry of Education, China
Abstract: Parallel processing based on free-running model tests was adopted to predict the interaction force coefficients (flow straightening coefficient and wake fraction) of ship maneuvering. A real-coded multipopulation genetic algorithm (MPGA) that can simultaneously process free-running model data and ship maneuvering simulation data was applied to the problem, and the optimal individual was obtained with it. Parallel processing of multiple populations resolved the premature convergence seen in single-population identification, while parallel processing of the ship maneuvering data (turning and zigzag motions) is an attempt to solve the coefficient drift problem. To validate the method, the interaction force coefficients were verified by the procedure, and the measured coefficients were compared with the identified ones. The maximum error is less than 5%, which indicates that the identification method is effective.
Funding: National Natural Science Foundation of China (No. 51275502); Natural Science Key Project of Anhui Province (No. KJ2011A014); China Postdoctoral Science Foundation funded project (No. 2012M511416); the Innovation Foundation of Anhui University and the Personnel Construction Project of Anhui University
Abstract: To improve femtosecond laser throughput, a parallel processing system that uses a liquid crystal on silicon (LCOS) device as a spatial light modulator is put forward. A method is described for displaying a Fourier hologram on the LCOS, and high uniformity among several diffraction peaks is achieved in the computer reconstruction. Application of this method to parallel femtosecond laser processing is also demonstrated: two intersecting rings and three tangent rings are each fabricated in photoresist in a single step.
Abstract: The paper presents the implementation of a parallel version of the FDK (Feldkamp, Davis and Kress) algorithm using graphics processing units. Some elements of computed tomography scanning and of the FDK algorithm are briefly discussed, and some ideas about GPUs (graphics processing units) and their use in general-purpose computing are presented. The paper shows a computational implementation of the FDK algorithm and the process of parallelizing this implementation. The parallel version of the algorithm is compared with the sequential version, using speedup as the performance metric. The parallel version was evaluated on two GPUs, a GeForce 9400GT (16 cores), a low-capacity GPU, and a Quadro 2000 (192 cores), a medium-capacity GPU, and a speedup of 3.37 was reached.
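The speedup metric named in this abstract is conventionally the ratio of sequential to parallel run time. The following is a minimal Python sketch of that ratio and of the related parallel efficiency; the function names and the sample timings are illustrative assumptions, not values from the paper.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Speedup S = T_seq / T_par for the same problem instance."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_sequential / t_parallel


def efficiency(t_sequential: float, t_parallel: float, n_units: int) -> float:
    """Parallel efficiency E = S / p, where p is the number of processing units."""
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    return speedup(t_sequential, t_parallel) / n_units


# Hypothetical timings in seconds (assumed for illustration only).
s = speedup(10.0, 4.0)        # sequential run 10 s, parallel run 4 s
e = efficiency(10.0, 4.0, 5)  # the same runs on 5 processing units
```

Efficiency is included because a raw speedup figure says little without the number of processing units that produced it.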
Funding: The 111 Project (B07018); supported by the Program for Changjiang Scholars and Innovative Research Team in University (IRT0423)
Abstract: To improve the image processing speed and detection precision of a strip surface detection system, the design of a parallel image processing system and the methods of algorithm implementation were studied, based on an analysis of the characteristics of the image data and image processing in such a system. Using a field programmable gate array (FPGA) as the hardware platform and taking the characteristics of the strip surface detection system into account, a parallel image processing system built from multiple IP cores is designed. Depending on the computing tasks and the load-balancing capability of the parallel processing system, the number of computing nodes can be configured to meet the system's demands and reduce hardware cost.
Abstract: A parallel arithmetic program for molecular dynamics (MD) simulation of a large system of 50 000 to 100 000 liquid metal atoms is developed by reworking the cascade arithmetic program used for MD simulation of a small system of 500 to 1 000 atoms. The program is used to simulate the rapid solidification of liquid Al. Some new results are obtained: for example, larger clusters composed of more than 36 smaller clusters (icosahedra or defect icosahedra) appear in the system of 50 000 atoms but cannot be seen in the small system of 500 to 1 000 atoms. Moreover, the results of this simulation should be closer to the real behavior of the system under consideration, because the influence of boundary conditions is markedly reduced. With the parallel algorithm combined with higher-performance supercomputers, the total number of atoms in the simulated system can be expected to grow by a further factor of tens or even hundreds in the near future.
Abstract: Based on the parallel environment of the ELXSI computer, this paper puts forward a parallel solution process for the substructure method in static and dynamic analyses of large-scale, complex structures, and develops the corresponding parallel computational program.
Funding: Supported by the Knowledge Innovation Program of the Chinese Academy of Sciences (No. KZCX1-YW-12-02) and the National Natural Science Foundation of China (Nos. 10974218, 10734100)
Abstract: The sound speed profile plays an important role in shallow water sound propagation. Alongside in-situ measurements, many inversion methods, such as matched-field inversion, have been put forward to invert the sound speed profile from acoustic signals. However, the time cost of matched-field inversion may be very high because of the replica field calculations. We studied the feasibility and robustness of an acoustic tomography scheme with matched-field processing in shallow water, and described the sound speed profile by empirical orthogonal functions. We analyzed the acoustic signals from a vertical line array in ASIAEX2001 in the East China Sea to invert sound speed profiles with estimated empirical orthogonal functions, using a parallel genetic algorithm to speed up the inversion. The results show that the inverted sound speed profiles are in good agreement with conductivity-temperature-depth measurements. Moreover, a posteriori probability analysis is carried out to verify the inversion results.
Funding: Supported by the National "863" Hi-Tech Development Program of China (863-306-ZD11-01-8)
Abstract: Genetic algorithms have been proposed to solve the task assignment problem. However, they have some drawbacks: for example, finding an optimal solution often takes a long time, and the success rate is low. To overcome these problems, a new coarse-grained parallel genetic algorithm with a central migration scheme is presented, which exploits isolated subpopulations. The new approach has been implemented in the PVM environment and evaluated on a workstation network for solving the task assignment problem. The results show that it not only significantly improves solution quality but also finds the best solution faster.
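The idea of isolated subpopulations with a central migration step can be sketched as follows. This is only an illustration under stated assumptions: the islands are simulated sequentially in one process rather than under PVM, the toy OneMax fitness stands in for the task assignment objective, and the migration rule (broadcast the global best and overwrite each island's worst individual) is one plausible reading of "central migration", not necessarily the scheme used in the paper. All names are hypothetical.

```python
import random


def fitness(bits):
    """Toy OneMax objective (assumption): number of ones in the bit string."""
    return sum(bits)


def evolve(pop, rng, mutation_rate=0.02):
    """One generation on one island: elitism, binary tournament selection,
    one-point crossover and bit-flip mutation."""
    best = max(pop, key=fitness)
    nxt = [best[:]]  # elitism: carry the island's best over unchanged
    while len(nxt) < len(pop):
        p1 = max(rng.sample(pop, 2), key=fitness)
        p2 = max(rng.sample(pop, 2), key=fitness)
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
        nxt.append(child)
    return nxt


def central_migration(islands):
    """Pick the best individual over all islands and copy it over the
    worst individual of every island (modifies the islands in place)."""
    champion = max((max(pop, key=fitness) for pop in islands), key=fitness)
    for pop in islands:
        worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
        pop[worst] = champion[:]


def run(n_islands=4, pop_size=20, n_bits=32, generations=60, interval=5, seed=0):
    """Evolve the islands independently and migrate every `interval` generations."""
    rng = random.Random(seed)
    islands = [
        [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
        for _ in range(n_islands)
    ]
    for gen in range(1, generations + 1):
        islands = [evolve(pop, rng) for pop in islands]  # isolated evolution
        if gen % interval == 0:
            central_migration(islands)
    return max((ind for pop in islands for ind in pop), key=fitness)
```

In a real coarse-grained setup each call to `evolve` would run on its own processor, and only the migration step would need communication, which is why the migration interval controls the communication cost.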
Abstract: In this paper, a hybrid parallel multi-objective genetic algorithm is proposed for solving the 0/1 knapsack problem. Multi-objective problems with non-convex and discrete Pareto fronts can take enormous computation time to converge to the true Pareto front. Hence, classical (i.e., non-parallel) multi-objective genetic algorithms (MOGAs) may fail to solve such intractable problems in a reasonable amount of time. The proposed hybrid model combines the best attributes of the island model and the Jakobovic master-slave model. We conduct an extensive experimental study on a multi-core system with varying numbers of processors, and compare the results with the basic parallel model, i.e., the master-slave model used to parallelize NSGA-II. The experimental results confirm that the hybrid model has a clear edge over the master-slave model in terms of processing time and approximation to the true Pareto front.
Funding: This work was supported by the National Key Research and Development Program of China [Grant No. 2016YFC0800200], the National Natural Science Foundation of China [Grant Nos. 51778568, 51908492, and 52008366], and the Zhejiang Provincial Natural Science Foundation of China [Grant Nos. LQ21E080019 and LY21E080022]. This work was also supported by the Key Laboratory of Space Structures of Zhejiang Province (Zhejiang University) and the Center for Balance Architecture of Zhejiang University.
Abstract: Large deformation contact problems generally involve highly nonlinear behaviors, which are very time-consuming to compute and may lead to convergence issues. The finite particle method (FPM) effectively separates pure deformation from total motion in large deformation problems. In addition, the decoupled procedures of the FPM make it suitable for parallel computing, which may provide a way to reduce the computation time. In this study, a graphics processing unit (GPU)-based parallel algorithm is proposed for two-dimensional large deformation contact problems. The fundamentals of the FPM for planar solids are first briefly introduced, including the equations of motion of particles and the internal forces of quadrilateral elements. Subsequently, a linked-list data structure suitable for parallel processing is built, and parallel global and local search algorithms are presented for contact detection. The contact forces are then derived and directly exerted on particles. The proposed method is implemented with the main solution procedures executed in parallel on a GPU. Two verification problems comprising large deformation frictional contacts are presented, and the accuracy of the proposed algorithm is validated. Furthermore, the algorithm's performance is investigated via a large-scale contact problem, and the maximum speedups of total computational time and contact calculation reach 28.5 and 77.4, respectively, relative to the commercial finite element software Abaqus/Explicit running on a single-core central processing unit (CPU). The contact calculation accounts for only 18% of the total calculation time with the FPM, much less than the 50% with Abaqus/Explicit, demonstrating the efficiency of the proposed method.
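A linked-list structure for contact search is commonly realized as a cell-linked list: particles are binned into a uniform grid, and each cell stores a chain of particle indices in two flat arrays. The sketch below shows that generic technique in sequential Python as a global search for nearby particle pairs. It is not the paper's GPU algorithm, and the function names, the grid layout, and the choice of the cell size as the search radius are all assumptions made for illustration.

```python
import math


def build_cell_list(points, cell):
    """Bin 2D points into square cells of side `cell`.
    head[c] is the most recently inserted particle of cell c and nxt[i] is
    the particle inserted before i in the same cell (-1 ends the chain)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    nx = int((max(xs) - x0) // cell) + 1
    ny = int((max(ys) - y0) // cell) + 1
    head = [-1] * (nx * ny)
    nxt = [-1] * len(points)
    for i, (x, y) in enumerate(points):
        c = int((y - y0) // cell) * nx + int((x - x0) // cell)
        nxt[i] = head[c]  # push particle i onto the front of the cell's chain
        head[c] = i
    return head, nxt, nx, ny, x0, y0


def candidate_pairs(points, cell):
    """Global search: all pairs (i, j) with i < j whose distance is below
    `cell`, found by scanning the 3x3 block of cells around each particle."""
    if not points:
        return []
    head, nxt, nx, ny, x0, y0 = build_cell_list(points, cell)
    pairs = []
    for i, (x, y) in enumerate(points):
        cx, cy = int((x - x0) // cell), int((y - y0) // cell)
        for gy in range(max(cy - 1, 0), min(cy + 2, ny)):
            for gx in range(max(cx - 1, 0), min(cx + 2, nx)):
                j = head[gy * nx + gx]
                while j != -1:  # walk the chain of this neighboring cell
                    if i < j and math.dist(points[i], points[j]) < cell:
                        pairs.append((i, j))
                    j = nxt[j]
    return sorted(pairs)
```

The flat `head` and `nxt` arrays are what make this layout attractive on a GPU: the outer loop over particles is independent per particle, so each iteration could be assigned to its own thread.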
Abstract: Synthetic aperture radar (SAR) can provide two-dimensional images by converting the acquired SAR echo signal into target coordinates and reflectivity. With the advancement of SAR signal processing, more and more imaging methods have been proposed for SAR that operates in the near field, where the Fresnel approximation is not appropriate. Time domain correlation is a digital reconstruction method based on processing the SAR data in the two-dimensional frequency domain via Fourier transform. It reconstructs the SAR image by simple correlation, without any approximation or interpolation, but the high computational cost of the correlation makes it unsuitable for real-time imaging. To reduce the computational burden, this paper gives a modified time domain correlation algorithm that can also take full advantage of the parallel computing capability of the imaging processor. Its practical implementation is proposed and preliminary simulation results are presented. The simulation results show that the proposed algorithm is a computationally efficient way of implementing reconstruction in real-time SAR image processing.
Abstract: This paper defines communication-efficiency, which is directly related to cost-efficiency, and studies the relationship between communication-efficiency and processor-efficiency when they are applied to scalability analysis. An example algorithm is used to analyze some typical architectures.
Abstract: Parallel versions of the prestack Kirchhoff 3D integral migration algorithm, which is suitable for seismic data processing, are described in this paper. First, the inherent parallel characteristics of seismic data processing are analyzed. Then some principles of algorithm partitioning are discussed. Based on these analyses, the system architecture, and the communication mechanism, the algorithm is divided into four subtasks allocated to four nodes of the 990 STAR-l. We then describe in detail a module-partitioning method in which I/O processing and communication are separated from computation: the process that handles I/O and communication is allocated to the T805 transputer, and the computation process is allocated to the i860 processor. These two processes are synchronized through shared memory and a memory-lock mechanism, while communication between different nodes is implemented through transputer links. Load balancing among the four processor modules is performed dynamically. Finally, we discuss the speedup of the parallel versions of the prestack Kirchhoff 3D integral migration algorithm running on four nodes. Some directions for further research are also mentioned.
Abstract: A variable-driven model of AND-parallelism in logic programs is presented. It statically analyzes the values of variables in clauses, picks out the variables that contribute to parallel execution, and then generates variable-driving graphs for the clauses. According to the variable-driving graph and an analysis of variable instantiations at run time, literals are driven to execute. With binding conflicts on shared variables prevented, the variable-driven model fully exploits AND-parallelism. Based on the variable-driving graph, some previously proposed models of AND-parallelism can also be realized if equipped with appropriate driving algorithms.
Funding: Supported in part by the Career Catalyst Research Grant from the Susan G. Komen Foundation and the Clinical and Translational Science Pilot Study Award from the National Institutes of Health.
Abstract: Three-dimensional (3D) image reconstruction involves computations on an extensive amount of data, which leads to very long processing times. Optimization is therefore crucial to improve performance and efficiency. With the widespread use of graphics processing units (GPUs), parallel computing is transforming this arduous reconstruction process for numerous imaging modalities, and photoacoustic computed tomography (PACT) is no exception. Existing works have investigated GPU-based optimization of photoacoustic microscopy (PAM) and PACT reconstruction using compute unified device architecture (CUDA) in either C++ or MATLAB only. Our study, however, is the first to use cross-platform GPU computation. It maintains the simplicity of MATLAB while improving the speed through CUDA/C++-based MATLAB converted functions called MEXCUDA. Compared to a purely MATLAB with GPU approach, our cross-platform method improves the speed five times. Because MATLAB is widely used in PAM and PACT, this study will open up new avenues for photoacoustic image reconstruction and relevant real-time imaging applications.
Abstract: The paper designs a peripheral maximum gray difference (PMGD) image segmentation method, a connected-component labeling (CCL) algorithm based on dynamic run length (DRL), and a real-time streaming processor implementation for DRL-CCL. It verifies their function and performance in a space target monitoring scene through an onboard experiment on the Tianzhou-3 cargo spacecraft (TZ-3). The PMGD image segmentation method can quickly segment the image into highly discrete and simple point targets, which greatly reduces the generation of equivalences and improves the real-time performance of DRL-CCL. Through a parallel pipeline design, the storage of the streaming processor is optimized by 55% with no need for external memory, the logic is optimized by 60%, and the energy efficiency ratio is 12 times that of the graphics processing unit, 62 times that of the digital signal processor, and 147 times that of personal computers. Analysis of the 8756 images processed on orbit shows that the speed is up to 5.88 FPS and the target detection rate is 100%. Our algorithm and implementation method meet the requirements of lightweight, highly real-time, strongly robust, full-time, and stable operation in the space irradiation environment.
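Run-based connected-component labeling in general works on horizontal runs of foreground pixels instead of single pixels, and records an equivalence whenever a run touches a run in the previous row. The Python sketch below illustrates that generic idea with a union-find structure. It is not the paper's DRL-CCL algorithm or its streaming hardware design, and all function names are hypothetical.

```python
def find_runs(row):
    """Return (start, end) column pairs, end exclusive, of foreground runs."""
    runs, start = [], None
    for x, v in enumerate(row):
        if v and start is None:
            start = x
        elif not v and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def label_runs(image, eight_connected=True):
    """Label a binary image row by row; return (label image, component count)."""
    parent = []  # union-find forest over provisional run ids

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    reach = 1 if eight_connected else 0  # diagonal contact counts for 8-connectivity
    labeled = []  # (row, start, end, provisional id) for every run
    prev = []     # runs of the previous row as (start, end, id)
    for y, row in enumerate(image):
        cur = []
        for s, e in find_runs(row):
            rid = len(parent)
            parent.append(rid)
            for ps, pe, pid in prev:
                if ps < e + reach and s < pe + reach:  # the two runs touch
                    union(rid, pid)  # record the equivalence
            cur.append((s, e, rid))
            labeled.append((y, s, e, rid))
        prev = cur

    final = {}  # root id -> compact label 1..n
    out = [[0] * len(row) for row in image]
    for y, s, e, rid in labeled:
        lab = final.setdefault(find(rid), len(final) + 1)
        for x in range(s, e):
            out[y][x] = lab
    return out, len(final)
```

Only the runs of the previous row are kept while scanning, which is the property that lets this family of algorithms process an image as a stream. Fewer and shorter runs mean fewer equivalences to resolve, consistent with the abstract's remark that a segmentation yielding discrete point targets helps the labeling stage.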