Parallel arrays with coprime subarrays have shown potential advantages for two-dimensional direction-of-arrival (DOA) estimation. In this paper, by introducing two flexible coprime factors to enlarge the inter-element spacing of parallel uniform subarrays, we propose a generalized parallel coprime array (GPCA) geometry. The proposed geometry offers flexible array layouts through the coprime factors and extends the array aperture, which greatly improves estimation performance. We also verify that, after optimization, GPCA always obtains M^2 degrees of freedom (DOFs) in the co-array domain with 2M sensors, which outperforms sparse parallel array geometries such as the parallel coprime array (PCA) and the parallel augmented coprime array (PACA), and matches the parallel nested array (PNA) while providing an extended aperture. The superiority of the GPCA geometry is demonstrated by numerical simulations with sparse representation methods.
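As a rough illustration of why coprime spacing factors matter for the number of co-array lags, the sketch below counts the distinct cross-differences between two uniform subarrays of M sensors each. The spacing factors `a` and `b`, the unit reference spacing, and the function name are assumptions made for this toy model; it is not the GPCA layout or the optimization described in the abstract.

```python
from math import gcd

def cross_difference_lags(m_sensors, a, b):
    """Distinct lags a*m - b*n between two uniform subarrays.

    Subarray 1 sits at positions a*m and subarray 2 at b*n (in units of a
    reference spacing), for m, n = 0..m_sensors-1. Toy model only.
    """
    return {a * m - b * n for m in range(m_sensors) for n in range(m_sensors)}

M = 4
coprime = cross_difference_lags(M, 4, 5)      # gcd(4, 5) == 1
non_coprime = cross_difference_lags(M, 2, 4)  # gcd(2, 4) == 2

print(gcd(4, 5), len(coprime))       # 1 16
print(gcd(2, 4), len(non_coprime))   # 2 10
```

With M = 4 sensors per subarray (2M = 8 sensors in total), the coprime pair in this toy model gives 16 = M^2 distinct lags, while the non-coprime pair gives only 10 because many lags coincide.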
We describe the structure and testing of one-dimensional parallel-optics photodetector arrays with 16 photodiodes, each of which operates at up to 8 Gb/s. The single element is a vertical, top-illuminated, 30-μm-diameter germanium-on-silicon-on-insulator (Ge-on-SOI) PIN photodetector. A high-quality Ge absorption layer is epitaxially grown on the SOI substrate by ultra-high vacuum chemical vapor deposition (UHV-CVD). The photodiode exhibits a responsivity of 0.20 A/W at a wavelength of 1550 nm. The dark current is as low as 0.36 μA at a reverse bias of 1 V, and the corresponding current density is about 51 mA/cm^2. The detector with a diameter of 30 μm is measured under incident light of 1.55 μm and 0.5 mW; the 3-dB bandwidth is 7.39 GHz without bias and 13.9 GHz at a reverse bias of 3 V. The 16 devices show good consistency.
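The quoted current density can be cross-checked from the device size. The short sketch below assumes a circular active area with the stated 30 μm diameter and a dark current of 0.36 μA; the function name is illustrative.

```python
import math

def current_density_mA_per_cm2(current_A, diameter_um):
    """Current density of a circular device, in mA/cm^2."""
    radius_cm = (diameter_um / 2) * 1e-4      # 1 um = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2
    return current_A / area_cm2 * 1e3         # A/cm^2 -> mA/cm^2

j_dark = current_density_mA_per_cm2(0.36e-6, 30)
print(round(j_dark))  # 51
```

Under these assumptions the result rounds to 51 mA/cm^2, which agrees with the figure given in the abstract.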
The inline and lift forces on bipiles in a parallel array induced by both irregular waves and currents were investigated experimentally in this paper. The characteristics of the inline, lift, and resultant forces were analyzed in both the time and frequency domains. The grouping effect coefficients of the inline and resultant forces on two piles, as functions of the KC number and relative spacing parameters, are given. The magnitude and direction of the resultant forces on two piles in a parallel array are also compared with the corresponding values for a single cylinder.
An efficient analysis approach is presented for large slotted-waveguide antenna arrays using a hybrid finite element-boundary integral-multilevel fast multipole algorithm (FE-BI-MLFMA). A simple computation model for the slotted-waveguide antenna is presented using thin current probe excitation and a perfectly matched layer (PML) absorber. Since each slotted-waveguide antenna can be considered a single sub-domain, the domain decomposition algorithm (DDA) can be applied to FE-BI-MLFMA to greatly reduce the computation resources and achieve high efficiency. This DDA-FE-BI-MLFMA is parallelized to further strengthen its capability. Comparisons of the computed radiation patterns with measured data and with results from commercial software show that the method has good accuracy for slotted-waveguide arrays. The influence of mutual coupling between adjacent slotted waveguides is then studied. To demonstrate the capability of the presented method, a carefully designed large X-band slotted-waveguide antenna array containing eighteen waveguides with Taylor amplitude and inverse phase excitation distribution is analyzed.
The received signal of the polarization sensitive array is proved to have trilinear model characteristics. A blind parallel factor (PARAFAC) signal detection algorithm for the polarization sensitive array is proposed. The trilinear alternating least squares (TALS) algorithm is used to obtain the source matrix, and decisions are then made on its entries. Simulation results show that the bit error rate (BER) of the detection algorithm is close to that of the non-blind decorrelating method and that the algorithm works well under array error conditions. The BER performance gap between the non-blind method and this algorithm is less than 2 dB at high SNR. The algorithm is blind and robust: it requires no knowledge of the channel fading, the direction-of-arrival (DOA) information, or the polarization information.
The design and fabrication of a parallel optical transmitter are reported. The optimized 12-channel parallel optical transmitter, with a data rate of up to 3 Gbit/s per channel, is designed, assembled, and measured. A top-emitting 850 nm vertical cavity surface emitting laser (VCSEL) array is adopted as the light source, and the VCSEL chip is directly wire bonded to a 12-channel driver IC. The outputs of the VCSEL array are directly butt coupled into a 12-channel fiber array. Small form factor pluggable (SFP) packaging technology is used in the module to support hot plugging in applications. The performance results of the module are demonstrated. At an operating current of 8 mA, an eye diagram at 3 Gbit/s is achieved with an optical output of more than 1 mW.
A 30 Gbit/s receiver module is developed with a CMOS integrated receiver chip (IC) and a GaAs-based 1 × 12 PIN-type photodetector (PD) array. Parallel technology is adopted in this module to realize a high-speed receiver module with medium-speed devices. A high-speed printed circuit board (PCB) is designed and produced. The IC chip and the PD array are packaged on the PCB by chip-on-board technology. Flip-chip alignment is used to assemble the PD array accurately on the module so that a plug-type optical port is built. Test results show that the module can receive parallel signals at 30 Gbit/s. The sensitivity of the module is -13.6 dBm at a BER of 10^-13.
In tensor theory, the parallel factorization (PARAFAC) decomposition expresses a tensor as the sum of a set of rank-1 tensors. By carrying out this numerical decomposition, mixed sources can be separated or unknown system parameters can be identified, which is the so-called blind source separation or blind identification. In this paper we propose a numerical PARAFAC decomposition algorithm. Compared to traditional algorithms, we speed up the decomposition in several aspects, i.e., search direction by extrapolation, suboptimal step size by Gauss-Newton approximation, and linear search by n steps. The algorithm is applied to polarization sensitive array parameter estimation to show its usefulness. Simulations verify the correctness and performance of the proposed numerical techniques.
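The "sum of rank-1 tensors" model can be written out directly. The sketch below only builds a three-way tensor from given factor matrices, X[i][j][k] = sum over r of A[i][r] * B[j][r] * C[k][r]; it is the forward model, not the accelerated decomposition algorithm proposed in the paper. The factor values and the function name are made up for illustration.

```python
def parafac_reconstruct(A, B, C):
    """Build X[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r]."""
    R = len(A[0])  # number of rank-1 components
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]

# Two rank-1 components (R = 2); the factor values are arbitrary.
A = [[1, 0], [2, 1]]            # 2 x R
B = [[1, 1], [0, 2], [3, 0]]    # 3 x R
C = [[1, 2], [1, 0]]            # 2 x R

X = parafac_reconstruct(A, B, C)
print(X[1][0][0])  # 2*1*1 + 1*1*2 = 4
```

A decomposition algorithm works in the opposite direction: given X, it searches for A, B, and C that reproduce it, typically by alternating least squares or a related iterative scheme.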
In this paper we discuss a novel storage scheme for simultaneous memory access in a parallel turbo decoder. The new scheme employs vertex coloring from graph theory. Compared to a similar method that also uses unnatural order in storage, our scheme requires 25 more memory blocks but allows a simpler configuration for variable code lengths that can be implemented on-chip. Experiments show that for moderate to high decoding throughput (40-100 Mbps), the hardware cost is still affordable for the 3GPP (3rd Generation Partnership Project) interleaver.
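The general idea of using vertex coloring for conflict-free memory access can be sketched as follows: addresses that are read in the same clock cycle are joined by an edge, and each color stands for one memory block. The greedy coloring, the access pattern, and all names below are illustrative assumptions, not the storage scheme from the paper.

```python
from itertools import combinations

def assign_banks(access_groups):
    """Greedily color addresses so that no group shares a memory block.

    access_groups: iterable of address tuples read in the same cycle.
    Returns {address: bank_index}.
    """
    neighbours = {}
    for group in access_groups:
        for u in group:
            neighbours.setdefault(u, set())
        for u, v in combinations(group, 2):
            neighbours[u].add(v)
            neighbours[v].add(u)

    bank = {}
    for addr in sorted(neighbours):
        used = {bank[n] for n in neighbours[addr] if n in bank}
        # Lowest bank index not used by an already-colored neighbour.
        bank[addr] = next(b for b in range(len(neighbours)) if b not in used)
    return bank

# Hypothetical pattern: natural-order reads, then "interleaved" reads.
groups = [(0, 1), (2, 3), (4, 5), (0, 3), (1, 4), (2, 5)]
banks = assign_banks(groups)
print(banks)
```

Every group of simultaneously accessed addresses then maps to distinct blocks. Greedy coloring does not guarantee the minimum number of blocks, which is one reason a practical scheme may trade extra memory blocks for simpler configuration.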
In comparison with traditional scanning confocal microscopy, the factors that influence the characteristics of a multi-beam parallel confocal system are discussed, and the error factors of the system are analyzed. The construction and working principle of the non-scanning 3D detecting system are introduced, and experimental results confirm the effect of these factors on the detecting system.
After the extension of depth modeling mode 4 (DMM-4) in 3D high efficiency video coding (3D-HEVC), the computational complexity increases sharply, which degrades the real-time performance of video coding. To reduce the computational complexity of DMM-4, a simplified hardware-friendly contour prediction algorithm is proposed in this paper. Based on the similarity between the texture and depth maps, the proposed algorithm directly codes depth blocks to calculate edge regions, which reduces the number of reference blocks. Verification with test sequences on HTM16.1 shows that the coding time of the proposed algorithm is 9.42% lower than that of the original algorithm. To avoid time-consuming serial coding on HTM, a parallelization design of the proposed algorithm based on a reconfigurable array processor (DPR-CODEC) is proposed. The parallelization design reduces the storage access time and configuration time and saves storage cost. Verified on a Xilinx Virtex 6 FPGA, experimental results show that the parallelization design is capable of processing HD 1080p video at more than 30 frames per second. Compared with related work, the scheme reduces the LUTs by 42.3%, the REG by 85.5%, and the hardware resources by 66.7%. The data loading speedup ratio of the parallel scheme can reach 3.4539. On average, the serial-to-parallel speedup ratio of encoding time for templates of different sizes reaches 2.446.
The problem of a periodic array of parallel cracks in a homogeneous piezoelectric strip bonded to a functionally graded piezoelectric material is investigated for an inhomogeneous continuum. It is assumed that the material inhomogeneity is represented as the spatial variation of the shear modulus in the form of an exponential function along the direction of the cracks. The mixed boundary value problem is reduced to a singular integral equation by applying the Fourier transform, and the singular integral equation is solved numerically using the Gauss-Chebyshev integration technique. Numerical results are obtained to illustrate the variation of the stress intensity factors as a function of the crack periodicity for different values of the material inhomogeneity.
High-speed real-time digital frequency analysis is one major field of Fast Fourier Transform (FFT) application, with uses such as Synthetic Aperture Radar (SAR) processing and medical imaging. In SAR processing, the image size is typically 4k × 4k and has grown over the years. With real-time operation, extensibility, and reusability in mind, a Field Programmable Gate Array (FPGA) based multi-channel variable-length FFT architecture that adopts the radix-2 butterfly algorithm is proposed in this paper. The hardware implementation of the FFT is a partially reconfigurable architecture. Firstly, the proposed architecture is flexible in terms of chip area, speed, resource utilization, and power consumption. Secondly, the proposed architecture combines serial and parallel methods in its butterfly computations. Furthermore, at the system level, the proposed architecture performs state processing in serial mode and data processing in parallel mode. When sufficient FPGA resources are available, the serial-mode state processing is converted to pipeline mode, which achieves high throughput.
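For readers unfamiliar with the radix-2 butterfly mentioned above, the following software sketch shows the standard recursive decimation-in-time form and checks it against a direct DFT. It is a reference model only and says nothing about the paper's FPGA architecture, channel count, or serial/parallel scheduling; the function names and the test signal are assumptions.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle           # butterfly, upper output
        out[k + n // 2] = even[k] - twiddle  # butterfly, lower output
    return out

def dft(x):
    """Direct O(n^2) DFT, used only to check the FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

signal = [1, 2, 3, 4, 0, 0, 0, 0]
spectrum = fft_radix2(signal)
max_err = max(abs(a - b) for a, b in zip(spectrum, dft(signal)))
print(max_err < 1e-9)
```

Each loop iteration is one butterfly: two inputs, one twiddle-factor multiplication, and two outputs. The butterflies within a stage are independent of one another, which is what allows a hardware design to compute them serially, in parallel, or in a mix of both.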
A low density parity check (LDPC) encoder for the (8176, 7154) code with a rate of 7/8 under the CCSDS standard for near space communication is designed. Based on LDPC encoding theory, an FPGA-based coding algorithm is designed. Based on the characteristics of the LDPC generator matrix, a cyclic shift register is introduced as the core of the encoding circuit, and the shift-register-adder-accumulator (SRAA) structure is adopted to realize fast matrix multiplication, so as to construct an encoding module with a partially parallel encoding circuit as its core. In addition, the serial port input and output module, RAM storage module, and control module are designed, which together constitute the encoder system. The design scheme is implemented on FPGA hardware and verified by simulation and experiment. The test results of the designed LDPC encoder are consistent with the theoretical results. Therefore, the coding system is practical, and the design method is simple and efficient.
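The shift-register-adder-accumulator idea can be illustrated in software for a single circulant block over GF(2). The sketch below is a minimal model under stated assumptions: the 5-bit circulant and the message are made up for illustration, and the function names are hypothetical; it is not the encoder circuit from the paper. It also checks that 7154/8176 reduces to the stated rate of 7/8.

```python
from fractions import Fraction

def sraa_multiply(message_bits, first_row):
    """Multiply a bit vector by a circulant over GF(2), SRAA style."""
    register = list(first_row)          # shift register holds the current row
    accumulator = [0] * len(first_row)
    for bit in message_bits:
        if bit:                         # add (XOR) the row when the bit is 1
            accumulator = [a ^ r for a, r in zip(accumulator, register)]
        register = register[-1:] + register[:-1]   # cyclic right shift
    return accumulator

def circulant_multiply(message_bits, first_row):
    """Reference: explicit vector-matrix product over GF(2)."""
    n = len(first_row)
    rows = [first_row[-i:] + first_row[:-i] for i in range(n)]
    return [sum(message_bits[i] * rows[i][j] for i in range(n)) % 2
            for j in range(n)]

first_row = [1, 1, 0, 1, 0]   # toy circulant, not a CCSDS block
message = [1, 0, 1, 1, 0]
parity = sraa_multiply(message, first_row)
rate = Fraction(7154, 8176)

print(parity)  # [0, 0, 1, 1, 1]
print(rate)    # 7/8
```

The register never stores the whole matrix: because every row of a circulant is a cyclic shift of the previous one, one shift per input bit regenerates the next row, and the accumulator collects the XOR of the selected rows.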
Solar photovoltaic (PV) systems have gained importance as a promising renewable energy source in recent years. PV arrays are prone to variable irradiance levels under partial shading conditions due to non-uniform shading. As a result, the amount of power produced decreases and hotspots occur. To overcome these issues, it is essential to select an appropriate PV material and a suitable array configuration. To obtain the maximum output power from a PV array under partial shading conditions, this paper suggests a novel triple-series–parallel ladder topology with monocrystalline PV material. A 6×6 array is considered under short and wide, long and wide, short and narrow, long and narrow, middle, and diagonal shading situations and is compared with existing configurations such as total cross-tied, bridge-link, honeycomb, series–parallel, and series–parallel cross-tied. The proposed configuration has an optimal number of cross ties to produce maximum power. It has 4 fewer cross ties than the honeycomb, 3 fewer than the bridge-link, 16 fewer than the total cross-tied, and 7 fewer than the series–parallel cross-tied configurations. The proposed configuration improves power by 0.1% to 20% compared with the other configurations under the considered shading scenarios.
A systolic array architecture computer (FXCQ) has been designed for signal processing. It can handle floating-point data at very high speed. It is composed of 16 processing cells and a cache that are connected linearly and form a ring structure. All processing cells are identical and programmable. Each processing cell has a peak performance of 20 million floating-point operations per second (20 MFLOPS), so the machine has a peak performance of 320 MFLOPS. It is integrated as an attached processor into a host system through a VME bus interface. Programs for FXCQ are written in a high-level language, the B language, which is supported by a parallel optimizing compiler. This paper describes the architecture of FXCQ, the B language, and its compiler.
We present a numerical and experimental study of the coherent beam combining of fibre amplifiers by means of the simulated annealing (SA) algorithm. The feasibility is validated by Monte Carlo simulation of correcting static phase distortion using the SA algorithm. The performance of the SA algorithm under time-varying phase noise is studied numerically by dynamic simulation. It is revealed that the influence of phase noise on the performance of the SA algorithm becomes stronger as the amplitude or frequency of the phase noise increases, and that a laser array containing more lasers is more affected by phase noise. The performance of the SA algorithm for coherent beam combining is also compared with that of a widely used stochastic optimization algorithm, the stochastic parallel gradient descent (SPGD) algorithm. In a proof-of-concept experiment we demonstrate the coherent beam combining of two 1083 nm fibre amplifiers with a total output power of 12 W and 93% combining efficiency. The contrast of the far-field coherently combined beam profiles is calculated to be as high as 95%.
A novel DSP to ASIC (Application Specific Integrated Circuit) architecture design methodology is presented in this paper for reducing power and area consumption. Traditional methods focus on optimizing either the hardware structure or the algorithm separately. The authors propose a new method, the PRF (Paralleling Reducing Folding) framework, which combines hardware optimization with algorithm simplification. In the first step, paralleling, unfolding technology is applied to divide one data path into several channels and expose the redundancy of the algorithm. In the second step, reducing, decoupling theory is used to reduce computational complexity. In the last step, folding, a time multiplexing method is used to merge similar components. Because it is an open methodology framework, many optimization methods can be integrated into the PRF framework. The PRF methodology framework is applied to optimize a 3N-tap FIR (Finite Impulse Response) filter and obtains a satisfactory result.
Space-time selective parallel interference cancellation (ST-SPIC) is a computationally effective approach that combines multiuser detection (MUD) with antenna array technology for CDMA systems. The exploitation of signal reliability is a key issue in ST-SPIC. To improve the reliability estimation, a pair of reliability thresholds is introduced. An improved selective interference cancellation algorithm is then proposed to exploit the reliability accordingly. More practical space-time processing algorithms are also incorporated in the proposed ST-SPIC scheme to overcome the limitations caused by some idealised assumptions made in the original ST-SPIC scheme. Numerical results show that the proposed ST-SPIC scheme outperforms its traditional counterpart in a CDMA microcell environment.
Funding (slotted-waveguide antenna array paper): Supported by the National Key Basic Research Program of China (973 Program) (2012CB720702, 61320602), the 111 Project of China (B14010), and the National Natural Science Foundation of China (61371002).
Funding (numerical PARAFAC decomposition paper): Supported by the National Natural Science Foundation of China (61571131) and the Technology Innovation Fund of the 10th Research Institute of China Electronics Technology Group Corporation (H17038.1).
Funding (parallel turbo decoder storage scheme paper): Supported by the National High-Technology Research and Development Program of China (Grant No. 2003AA123310) and the National Natural Science Foundation of China (Grant Nos. 60332030, 60572157).
Funding (multi-beam parallel confocal system paper): Supported by the National Natural Science Foundation of China (No. 50175024), the Provincial Program for Young Teachers of Colleges and Universities of Anhui (No. 2005jql019), and the Provincial Research Foundation of the Key Laboratory of Anhui.
Funding (3D-HEVC DMM-4 parallelization paper): Supported by the National Natural Science Foundation of China (Nos. 61834005, 61772417, 61802304, 61602377, 61874087, 61634004) and the Shaanxi Province Key R&D Plan (Nos. 2020JM-525, 2021GY-029, 2021KW-16).
Funding (periodic parallel cracks in piezoelectric strip paper): Supported by the National Natural Science Foundation of China (No. 10661009) and the Ningxia Natural Science Foundation (No. NZ0604).
Funding (FPGA-based variable-length FFT architecture paper): Supported by the National Natural Science Foundation of China (61271149) and the Beijing Natural Science Foundation (4144093).
Abstract: High-speed real-time digital frequency analysis is a major application field of the Fast Fourier Transform (FFT), with examples such as Synthetic Aperture Radar (SAR) processing and medical imaging. In SAR processing, the image size is typically 4 k × 4 k and has grown over the years. In view of real-time, extensibility and reusability requirements, a Field Programmable Gate Array (FPGA) based multi-channel variable-length FFT architecture that adopts the radix-2 butterfly algorithm is proposed in this paper. The hardware implementation of the FFT is a partially reconfigurable architecture. Firstly, the proposed architecture is flexible in terms of chip area, speed, resource utilization and power consumption. Secondly, the proposed architecture combines serial and parallel methods in its butterfly computations. Furthermore, at the system level, the proposed architecture takes advantage of state processing in serial mode and data processing in parallel mode. When sufficient FPGA resources are available, the serial-mode state processing is converted to pipeline mode, which achieves high throughput.
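For readers unfamiliar with the radix-2 butterfly mentioned in the abstract above, a minimal software sketch follows. It is a textbook recursive decimation-in-time FFT, not the paper's FPGA architecture; the function name `fft_radix2` is an assumption introduced here, and the multi-channel, variable-length and reconfigurable aspects are not modeled.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])   # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd branch.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t             # butterfly: upper output
        out[k + n // 2] = even[k] - t    # butterfly: lower output
    return out

# Usage: a constant signal concentrates all its energy in bin 0.
spectrum = fft_radix2([1, 1, 1, 1])
```

Each loop iteration is one butterfly; a hardware design can evaluate the butterflies of one stage serially, in parallel, or in a mix of both, which is the trade-off the abstract refers to.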
Abstract: A low density parity check (LDPC) encoder for the (8176, 7154) code with an encoding rate of 7/8 under the CCSDS standard for near space communication is designed. Based on LDPC encoding theory, an FPGA-based coding algorithm is designed. Based on the characteristics of the LDPC generator matrix, a cyclic shift register is introduced as the core of the encoding circuit, and the shift-register-adder-accumulator (SRAA) structure is adopted to realize fast calculation of the matrix multiplication, so as to construct an encoding module with a partially parallel encoding circuit as its core. In addition, the serial port input and output module, the RAM storage module and the control module are also designed; together they constitute the encoder system. The design scheme is implemented on FPGA hardware and verified by simulation and experiment. The results show that the test results of the designed LDPC encoder are consistent with the theoretical results. Therefore, the coding system is practical, and the design method is simple and efficient.
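The SRAA idea in the abstract above can be sketched in software under the assumption, usual for quasi-cyclic codes, that the parity part of the generator matrix is built from circulant blocks. The function name `sraa_parity`, the toy 3-bit block size and the generator rows below are illustrative assumptions, not values from the paper or from the CCSDS code.

```python
def sraa_parity(message_blocks, generators):
    """Compute one b-bit parity block p = sum_i m_i * B_i over GF(2).

    message_blocks[i] is a list of b message bits; generators[i] is the
    first row of the b x b circulant B_i (row r is that row rotated
    right by r positions).
    """
    b = len(generators[0])
    acc = [0] * b                      # accumulator register
    for bits, gen in zip(message_blocks, generators):
        reg = list(gen)                # load the shift register
        for bit in bits:
            if bit:                    # message bit gates the adder
                acc = [a ^ g for a, g in zip(acc, reg)]  # modulo-2 add
            reg = [reg[-1]] + reg[:-1]  # cyclic right shift by one
    return acc

# Usage with toy 3 x 3 circulants (hypothetical values).
parity = sraa_parity([[1, 0, 1], [0, 1, 0]], [[1, 1, 0], [1, 0, 0]])
```

Because each circulant is fully described by its first row, the register only needs to be reloaded once per block, which is what makes the structure cheap in hardware; one such unit per parity block gives the partially parallel arrangement.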
Abstract: Solar photovoltaic (PV) systems have gained importance as a promising renewable energy source in recent years. PV arrays are prone to variable irradiance levels under partial shading conditions due to non-uniform shading. As a result, the amount of power produced decreases and hotspots occur. To overcome these issues, it is essential to select an appropriate PV material and a suitable array configuration. To obtain the maximum output power from a PV array under partial shading conditions, this paper suggests a novel triple-series–parallel ladder topology with monocrystalline PV material. A 6×6 array is considered under short and wide, long and wide, short and narrow, long and narrow, middle and diagonal shading situations and compared with existing configurations such as total cross-tied, bridge-link, honeycomb, series–parallel and series–parallel cross-tied. The proposed configuration has an optimal number of cross ties to produce maximum power. It has 4 cross ties fewer than a honeycomb, 3 fewer than a bridge-link, 16 fewer than a total cross-tied and 7 fewer than a series–parallel cross-tied configuration. The proposed configuration improves power by 0.1% to 20% compared with the other configurations under the considered shading scenarios.
Abstract: A systolic array architecture computer (FXCQ) has been designed for signal processing. It can handle floating-point data at very high speed. It is composed of 16 processing cells and a cache that are connected linearly and form a ring structure. All processing cells are identical and programmable. Each processing cell has a peak performance of 20 million floating-point operations per second (20 MFLOPS). The machine therefore has a peak performance of 320 MFLOPS. It is integrated as an attached processor into a host system through a VME bus interface. Programs for FXCQ are written in a high-level language, the B language, which is supported by a parallel optimizing compiler. This paper describes the architecture of FXCQ, the B language and its compiler.
Abstract: We present a numerical and experimental study on the coherent beam combining of fibre amplifiers by means of the simulated annealing (SA) algorithm. The feasibility is validated by a Monte Carlo simulation of correcting static phase distortion using the SA algorithm. The performance of the SA algorithm under time-varying phase noise is numerically studied by dynamic simulation. It is revealed that the influence of phase noise on the performance of the SA algorithm gets stronger with an increase in the amplitude or frequency of the phase noise, and that a laser array containing more lasers is more affected by phase noise. The performance of the SA algorithm for coherent beam combining is also compared with a widely used stochastic optimization algorithm, i.e., the stochastic parallel gradient descent (SPGD) algorithm. In a proof-of-concept experiment we demonstrate the coherent beam combining of two 1083 nm fibre amplifiers with a total output power of 12 W and 93% combining efficiency. The contrast of the far-field coherently combined beam profiles is calculated to be as high as 95%.
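The static-phase correction described in the abstract above can be illustrated with a small simulated annealing loop. Everything specific in this sketch is an assumption made for illustration: the metric (normalized on-axis intensity of equal-amplitude beams), the linear cooling schedule, the step size and the names `combining_metric` and `anneal_phases`. It is not the authors' simulation and does not model time-varying phase noise.

```python
import cmath
import math
import random

def combining_metric(phases):
    """Normalized on-axis intensity of N equal-amplitude beams (1.0 when phase-locked)."""
    field = sum(cmath.exp(1j * p) for p in phases)
    return abs(field) ** 2 / len(phases) ** 2

def anneal_phases(phases, steps=2000, step_size=0.3, t0=0.05, seed=0):
    """Simulated annealing over static piston phases; returns (best_phases, best_metric)."""
    rng = random.Random(seed)
    current = list(phases)
    j_cur = combining_metric(current)
    best, j_best = list(current), j_cur
    for i in range(steps):
        temp = t0 * (1 - i / steps) + 1e-9   # linear cooling (assumed schedule)
        trial = [p + rng.uniform(-step_size, step_size) for p in current]
        j_trial = combining_metric(trial)
        delta = j_trial - j_cur
        # Metropolis rule: always accept improvements, sometimes accept setbacks.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, j_cur = trial, j_trial
            if j_cur > j_best:
                best, j_best = list(current), j_cur
    return best, j_best

# Usage: start from arbitrary static phase errors (radians).
start = [0.0, 2.0, 4.0, 1.0]
locked, metric = anneal_phases(start)
```

Accepting occasional setbacks at high temperature is what distinguishes this loop from a purely greedy search; an SPGD controller would instead estimate a gradient from paired perturbations.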
Abstract: A novel DSP-to-ASIC (Application Specific Integrated Circuit) architecture design methodology is presented in this paper for reducing power and area consumption. Traditional methods focus on optimizing either the hardware structure or the algorithm separately. The authors propose a new method called the PRF (Paralleling, Reducing, Folding) framework to combine hardware optimization with algorithm simplification. In the first step, paralleling, the unfolding technique is applied to divide one data path into several channels and expose the redundancy of the algorithm. In the second step, reducing, decoupling theory is used to reduce computational complexity. In the last step, folding, time multiplexing is used to merge similar components. As an open methodology framework, PRF allows many optimization methods to be integrated into it. The PRF framework is applied to optimize a 3N-tap FIR (Finite Impulse Response) filter, with satisfactory results.
Abstract: Space-time selective parallel interference cancellation (ST-SPIC) is a computationally effective approach combining multiuser detection (MUD) with antenna array technology for CDMA systems. The exploitation of signal reliability is a key issue in ST-SPIC. In order to improve the reliability estimation, a pair of reliability thresholds is introduced. An improved selective interference cancellation algorithm is then proposed to exploit the reliability accordingly. More practical space-time processing algorithms are also incorporated in the proposed ST-SPIC scheme to overcome the limitations caused by some idealised assumptions made in the original ST-SPIC scheme. Numerical results show that the proposed ST-SPIC scheme outperforms its traditional counterpart in a CDMA microcell environment.