A low density parity check(LDPC)encoder with the codes of(8176,7154)and encoding rate of 7/8 under CCSDS standard for near space communication is designed.Based on LDPC encoding theory,the FPGA-based coding algorithm ...A low density parity check(LDPC)encoder with the codes of(8176,7154)and encoding rate of 7/8 under CCSDS standard for near space communication is designed.Based on LDPC encoding theory,the FPGA-based coding algorithm is designed.Based on the characteristics of LDPC generating matrix,the cyclic shift register is introduced as the core of the encoding circuit,and the shift-register-Adder-Accumulator(SRAA)structure is adopted to realize the fast calculation of matrix multiplication,so as to construct the encoding module with partial parallel encoding circuit as the core.In addition,the serial port input and output module,RAM storage module and control module are also designed,which together constitute the encoder system.The design scheme is implemented by FPGA hardware and verified by simulation and experiment.The results show that the test results of the designed LDPC encoder are consistent with the theoretical results.Therefore,the coding system is practical,and the design method is simple and efficient.展开更多
This paper introduces a switched hyperchaotic system that changes its behavior randomly from one subsystem to another via two switch functions, and its characteristics of symmetry, dissipation, equilibrium, bifurcatio...This paper introduces a switched hyperchaotic system that changes its behavior randomly from one subsystem to another via two switch functions, and its characteristics of symmetry, dissipation, equilibrium, bifurcation diagram, basic dynamics have been analyzed. The hardware implementation of the system is based on Field Programmable Gate Array (FPGA). It is shown that the experimental results are identical with numerical simulations, and the chaotic trajectories are much more complex.展开更多
This paper proposes the fractal patterns classifier for multiple cardiac arrhythmias on field-programmable gate array (FPGA) device. Fractal dimension transformation (FDT) is employed to adjoin the fractal features of...This paper proposes the fractal patterns classifier for multiple cardiac arrhythmias on field-programmable gate array (FPGA) device. Fractal dimension transformation (FDT) is employed to adjoin the fractal features of QRS-complex, including the supraventricular ectopic beat, bundle branch ectopic beat, and ventricular ectopic beat. FDT with fractal dimension (FD) is addressed for constructing various symptomatic patterns, which can produce family functions and enhance features, making clear differences between normal and unhealthy subjects. The probabilistic neural network (PNN) is proposed for recognizing multiple cardiac arrhythmias. Numerical experiments verify the efficiency and higher accuracy with the software simulation in order to formulate the mathematical model logical circuits. FDT results in data self-similarity for the same arrhythmia category, the number of dataset requirement and PNN architecture can be reduced. Its simplified model can be easily embedded in the FPGA chip. The prototype classifier is tested using the MIT-BIH arrhythmia database, and the tests reveal its practicality for monitoring ECG signals.展开更多
Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for...Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.展开更多
After research on the motion estimation algorithm in video coding, a motion estimator algorithm with 1/4 pixel ac- curacy is implemented based on fie/d-programmable gate array (FPGA). The motion estimation algorithm...After research on the motion estimation algorithm in video coding, a motion estimator algorithm with 1/4 pixel ac- curacy is implemented based on fie/d-programmable gate array (FPGA). The motion estimation algorithm module is made up of the 1[4 pixel interpolation module with serial input and parallel output, the three step search module and the block match- ing module, which can use relatively less Wiener filters for interpolation operation. Experiment results show that the hard- ware design has less consumption of the logical resource, higher stability and lower power consumption.展开更多
As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories ...As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories and autonomous driving.Due to the requirement for a large amount of storage space and computing resources,DNNs are unfavorable for resource-constrained edge computing devices,especially for mobile terminals with scarce energy supply.Binarization of DNN has become a promising technology to achieve a high performance with low resource consumption in edge computing.Field-programmable gate array(FPGA)-based acceleration can further improve the computation efficiency to several times higher compared with the central processing unit(CPU)and graphics processing unit(GPU).This paper gives a brief overview of binary neural networks(BNNs)and the corresponding hardware accelerator designs on edge computing environments,and analyzes some significant studies in detail.The performances of some methods are evaluated through the experiment results,and the latest binarization technologies and hardware acceleration methods are tracked.We first give the background of designing BNNs and present the typical types of BNNs.The FPGA implementation technologies of BNNs are then reviewed.Detailed comparison with experimental evaluation on typical BNNs and their FPGA implementation is further conducted.Finally,certain interesting directions are also illustrated as future work.展开更多
Neuromorphic computing is considered to be the future of machine learning,and it provides a new way of cognitive computing.Inspired by the excellent performance of spiking neural networks(SNNs)on the fields of low-pow...Neuromorphic computing is considered to be the future of machine learning,and it provides a new way of cognitive computing.Inspired by the excellent performance of spiking neural networks(SNNs)on the fields of low-power consumption and parallel computing,many groups tried to simulate the SNN with the hardware platform.However,the efficiency of training SNNs with neuromorphic algorithms is not ideal enough.Facing this,Michael et al.proposed a method which can solve the problem with the help of DNN(deep neural network).With this method,we can easily convert a well-trained DNN into an SCNN(spiking convolutional neural network).So far,there is a little of work focusing on the hardware accelerating of SCNN.The motivation of this paper is to design an SNN processor to accelerate SNN inference for SNNs obtained by this DNN-to-SNN method.We propose SIES(Spiking Neural Network Inference Engine for SCNN Accelerating).It uses a systolic array to accomplish the task of membrane potential increments computation.It integrates an optional hardware module of max-pooling to reduce additional data moving between the host and the SIES.We also design a hardware data setup mechanism for the convolutional layer on the SIES with which we can minimize the time of input spikes preparing.We implement the SIES on FPGA XCVU440.The number of neurons it supports is up to 4000 while the synapses are 256000.The SIES can run with the working frequency of 200 MHz,and its peak performance is 1.5625 TOPS.展开更多
Based on the characteristics of field-programmable gate array(FPGA) such as multi-task,high-speed,parallelity,etc,a filtering algorithm is presented to filter the output signal of laser gyr o with high-speed and high ...Based on the characteristics of field-programmable gate array(FPGA) such as multi-task,high-speed,parallelity,etc,a filtering algorithm is presented to filter the output signal of laser gyr o with high-speed and high accuracy.The filter is composed of basic logic cell s,multipliers an d memory inside FPGA.By using an multiplication decomposition method and design ing reliable time-sequence,the filter is realized,which is easy to be transpl anted and of low cost.Furthermore,all the signal demodulation algorithms of t he laser gyr oscope can be integrated in only one FPGA,which reduces the cost and complexity of the system.展开更多
The security of cryptographic algorithms based on integer factorization and discrete logarithm will be threatened by quantum computers in future.Since December 2016,the National Institute of Standards and Technology(N...The security of cryptographic algorithms based on integer factorization and discrete logarithm will be threatened by quantum computers in future.Since December 2016,the National Institute of Standards and Technology(NIST)has begun to solicit post-quantum cryptographic(PQC)algorithms worldwide.CRYSTALS-Kyber was selected as the standard of PQC algorithm after 3 rounds of evaluation.Meanwhile considering the large resource consumption of current implementation,this paper presents a lightweight architecture for ASICs and its implementation on FPGAs for prototyping.In this implementation,a novel compact modular multiplication unit(MMU)and compression/decompression module is proposed to save hardware resources.We put forward a specially optimized schoolbook polynomial multiplication(SPM)instead of number theoretic transform(NTT)core for polynomial multiplication,which can reduce about 74%SLICE cost.We also use signed number representation to save memory resources.In addition,we optimize the hardware implementation of the Hash module,which cuts off about 48%of FF consumption by register reuse technology.Our design can be implemented on Kintex-7(XC7K325T-2FFG900I)FPGA for prototyping,which occupations of 4777/4993 LUTs,2661/2765 FFs,1395/1452 SLICEs,2.5/2.5 BRAMs,and 0/0 DSP respective of client/server side.The maximum clock frequency can reach at 244 MHz.As far as we know,our design consumes the least resources compared with other existing designs,which is very friendly to resource-constrained devices.展开更多
Piezoelectric resonators are widely used in frequency reference devices, mass sensors, resonant sensors(such as gyros and accelerometers), etc. Piezoelectric resonators usually work in a special resonant mode. Obtaini...Piezoelectric resonators are widely used in frequency reference devices, mass sensors, resonant sensors(such as gyros and accelerometers), etc. Piezoelectric resonators usually work in a special resonant mode. Obtaining working resonant mode with high quality is key to improve the performance of piezoelectric resonators. In this paper, the resonance characteristics of a rectangular lead zirconium titanate(PZT) piezoelectric resonator are studied. On the basis of the field-programmable gate array(FPGA) embedded system, direct digital synthesizer(DDS) and automatic gain controller(AGC) are used to generate the driving signals with precisely adjustable frequency and amplitude. The driving signals are used to excite the piezoelectric resonator to the working vibration mode. The influence of the connection of driving electrodes and voltage amplitude on the vibration of the resonator is studied. The quality factor and vibration linearity of the resonator are studied with various driving methods mentioned in this paper. The resonator reaches resonant mode at 330 kHz by different driving methods.The relationship between resonant amplitude and driving signal amplitude is linear. The quality factor reaches over 150 by different driving methods. The results provide a theoretical reference for the efficient excitation of the piezoelectric resonator.展开更多
As a coprocessor, field-programmable gate array (FPGA) is the hardware computing processor accelerating the computing capacity of coraputers. To efficiently manage the hardware free resources for the placing of task...As a coprocessor, field-programmable gate array (FPGA) is the hardware computing processor accelerating the computing capacity of coraputers. To efficiently manage the hardware free resources for the placing of tasks on FPGA and take full advantage of the partially reconfigurable units, good utilization of chip resources is an important and necessary work. In this paper, a new method is proposed to find the complete set of maximal free resource rectangles based on the cross point of edge lines of running tasks on FPGA area, and the prove process is provided to make sure the correctness of this method.展开更多
Emerging applications widely use field-programmable gate array(FPGA)prototypes as a tool to verify modern very-large-scale integration(VLSI)circuits,imposing many problems,including routing failure caused by the limit...Emerging applications widely use field-programmable gate array(FPGA)prototypes as a tool to verify modern very-large-scale integration(VLSI)circuits,imposing many problems,including routing failure caused by the limited number of connections among blocks of FPGAs therein.Such a shortage of connections can be alleviated through time-division multiplexing(TDM),by which multiple signals sharing an identical routing channel can be transmitted.In this context,the routing quality dominantly decides the performance of such systems,proposing the requirement of minimizing the signal delay between FPGA pairs.This paper proposes algorithms for the routing problem in a multi-FPGA system with TDM support,aiming to minimize the maximum TDM ratio.The algorithm consists of two major stages:(1)A method is proposed to set the weight of an edge according to how many times it is shared by the routing requirements and consequently to compute a set of approximate minimum Steiner trees.(2)A ratio assignment method based on the edge-demand framework is devised for assigning ratios to the edges respecting the TDM ratio constraints.Experiments were conducted against the public benchmarks to evaluate our proposed approach as compared with all published works,and the results manifest that our method achieves a better TDM ratio in comparison.展开更多
Multiuser detection can be described as a quadratic optimization problem with binary constraint. Many techniques are available to find approximate solution to this problem. These tech- niques can be characterized in t...Multiuser detection can be described as a quadratic optimization problem with binary constraint. Many techniques are available to find approximate solution to this problem. These tech- niques can be characterized in terms of complexity and detection performance. The "efficient frontier" of known techniques include the decision-feedback, branch-and-bound and probabilistic data association detectors. The presented iterative multiuser detection technique is based on joint deregularized and box-constrained so- lution to quadratic optimization with iterations similar to that used in the nonstationary Tikhonov iterated algorithm. The deregulari- zation maximizes the energy of the solution, this is opposite to the Tikhonov regularization where the energy is minimized. However, combined with box-constraints, the deregularization forces the solution to be close to the binary set. We further exploit the box- constrained dichotomous coordinate descent (DCD) algorithm and adapt it to the nonstationary iterative Tikhonov regularization to present an efficient detector. As a result, the worst-case and aver- age complexity are reduced down to K28 and K2~ floating point operation per second, respectively. The development improves the "efficient frontier" in multiuser detection, which is illustrated by simulation results. Finally, a field programmable gate array (FPGA) design of the detector is presented. The detection performance obtained from the fixed-point FPGA implementation shows a good match to the floating-point implementation.展开更多
In accordance with the application requirements of high definition(HD) video surveillance systems,a real-time 5/3 lifting wavelet HD-video de-noising system is proposed with frame rate conversion(FRC) based on a field...In accordance with the application requirements of high definition(HD) video surveillance systems,a real-time 5/3 lifting wavelet HD-video de-noising system is proposed with frame rate conversion(FRC) based on a field-programmable gate array(FPGA),which uses a 3-level pipeline paralleled 5/3 lifting wavelet transformation and reconstruction structure,as well as a fast BayesS hrink adaptive threshold filtering module.The proposed system demonstrates de-noising performance,while also balancing system resources and achieving real-time processing.The experiments show that the proposed system's maximum operating frequency(through logic synthesis and layout using Quartus 13.1 software) can reach 178 MHz,based on the Altera Company's Stratix III EP3SE80 series FPGA.The proposed system can also satisfy real-time de-noising requirements of 1920 × 1080 at60 fps HD-video sources,while also significantly improving the peak signal to noise rate of the denoising images.Compared with similar systems,the system has the advantages of high operating frequency,and the ability to support multiple source formats for real-time processing.展开更多
This is a paper about laser gyro sign a l processing circuit which is designed based on field-programmable gate array(FPGA) and digital signal processor(DSP).Through a pre-amplifier circuit,FPGA and DSP,a weak current...This is a paper about laser gyro sign a l processing circuit which is designed based on field-programmable gate array(FPGA) and digital signal processor(DSP).Through a pre-amplifier circuit,FPGA and DSP,a weak current signal is converted and transferred,then sent to the computer to display the final results.Through the laser gyro performance te sting,the obtained results coincide with those of the existing methods.Thus th e d esigned circuit realizes the function of laser gyro signal processing.展开更多
Chaotic systems are an effective tool for various applications, including information security and internet of things. Many chaotic systems may have the weaknesses of incomplete output distributions, discontinuous cha...Chaotic systems are an effective tool for various applications, including information security and internet of things. Many chaotic systems may have the weaknesses of incomplete output distributions, discontinuous chaotic regions, and simple chaotic behaviors.These may result in many negative influences in practical applications utilizing chaos. To deal with these issues, this study introduces a modular chaotification model(MCM) to increase the dynamic properties of current one-dimensional(1 D) chaotic maps. To exhibit the effect of the MCM, three 1 D chaotic maps are improved using the MCM as examples. Studies of the resulting properties show the robust and complex dynamics of these improved chaotic maps. Moreover, we implement these improved chaotic maps of MCM in a field-programmable gate array hardware platform and apply them to the application of PRNG. Performance analyses verify that these chaotic maps improved by the MCM have more complicated chaotic behaviors and wider chaotic ranges than the existing and several new chaotic maps.展开更多
In this era of pervasive computing, low-resource devices have been deployed in various fields. PRINCE is a lightweight block cipher designed for low latency, and is suitable for pervasive computing applications. In th...In this era of pervasive computing, low-resource devices have been deployed in various fields. PRINCE is a lightweight block cipher designed for low latency, and is suitable for pervasive computing applications. In this paper, we propose new circuit structures for PRINCE components by sharing and simplifying logic circuits, to achieve the goal of using a smaller number of logic gates to obtain the same result. Based on the new circuit structures of components and the best sharing among components,we propose three new hardware architectures for PRINCE. The architectures are simulated and synthesized on different programmable gate array devices. The results on Virtex-6 show that compared with existing architectures, the resource consumption of the unrolled, low-cost, and two-cycle architectures is reduced by 73, 119, and 380 slices, respectively. The low-cost architecture costs only 137 slices. The unrolled architecture costs 409 slices and has a throughput of 5.34 Gb/s. To our knowledge, for the hardware implementation of PRINCE, the new low-cost architecture sets new area records, and the new unrolled architecture sets new throughput records. Therefore, the newly proposed architectures are more resource-efficient and suitable for lightweight,latency-critical applications.展开更多
文摘A low density parity check(LDPC)encoder with the codes of(8176,7154)and encoding rate of 7/8 under CCSDS standard for near space communication is designed.Based on LDPC encoding theory,the FPGA-based coding algorithm is designed.Based on the characteristics of LDPC generating matrix,the cyclic shift register is introduced as the core of the encoding circuit,and the shift-register-Adder-Accumulator(SRAA)structure is adopted to realize the fast calculation of matrix multiplication,so as to construct the encoding module with partial parallel encoding circuit as the core.In addition,the serial port input and output module,RAM storage module and control module are also designed,which together constitute the encoder system.The design scheme is implemented by FPGA hardware and verified by simulation and experiment.The results show that the test results of the designed LDPC encoder are consistent with the theoretical results.Therefore,the coding system is practical,and the design method is simple and efficient.
文摘This paper introduces a switched hyperchaotic system that changes its behavior randomly from one subsystem to another via two switch functions, and its characteristics of symmetry, dissipation, equilibrium, bifurcation diagram, basic dynamics have been analyzed. The hardware implementation of the system is based on Field Programmable Gate Array (FPGA). It is shown that the experimental results are identical with numerical simulations, and the chaotic trajectories are much more complex.
文摘This paper proposes the fractal patterns classifier for multiple cardiac arrhythmias on field-programmable gate array (FPGA) device. Fractal dimension transformation (FDT) is employed to adjoin the fractal features of QRS-complex, including the supraventricular ectopic beat, bundle branch ectopic beat, and ventricular ectopic beat. FDT with fractal dimension (FD) is addressed for constructing various symptomatic patterns, which can produce family functions and enhance features, making clear differences between normal and unhealthy subjects. The probabilistic neural network (PNN) is proposed for recognizing multiple cardiac arrhythmias. Numerical experiments verify the efficiency and higher accuracy with the software simulation in order to formulate the mathematical model logical circuits. FDT results in data self-similarity for the same arrhythmia category, the number of dataset requirement and PNN architecture can be reduced. Its simplified model can be easily embedded in the FPGA chip. The prototype classifier is tested using the MIT-BIH arrhythmia database, and the tests reveal its practicality for monitoring ECG signals.
基金financially supported by the National Council for Scientific and Technological Development(CNPq,Brazil),Swedish-Brazilian Research and Innovation Centre(CISB),and Saab AB under Grant No.CNPq:200053/2022-1the National Council for Scientific and Technological Development(CNPq,Brazil)under Grants No.CNPq:312924/2017-8 and No.CNPq:314660/2020-8.
文摘Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.
基金The 8th Graduate Technology Fund of North University of China(No.201111)
文摘After research on the motion estimation algorithm in video coding, a motion estimator algorithm with 1/4 pixel ac- curacy is implemented based on fie/d-programmable gate array (FPGA). The motion estimation algorithm module is made up of the 1[4 pixel interpolation module with serial input and parallel output, the three step search module and the block match- ing module, which can use relatively less Wiener filters for interpolation operation. Experiment results show that the hard- ware design has less consumption of the logical resource, higher stability and lower power consumption.
基金supported by the Natural Science Foundation of Sichuan Province of China under Grant No.2022NSFSC0500the National Natural Science Foundation of China under Grant No.62072076.
文摘As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories and autonomous driving.Due to the requirement for a large amount of storage space and computing resources,DNNs are unfavorable for resource-constrained edge computing devices,especially for mobile terminals with scarce energy supply.Binarization of DNN has become a promising technology to achieve a high performance with low resource consumption in edge computing.Field-programmable gate array(FPGA)-based acceleration can further improve the computation efficiency to several times higher compared with the central processing unit(CPU)and graphics processing unit(GPU).This paper gives a brief overview of binary neural networks(BNNs)and the corresponding hardware accelerator designs on edge computing environments,and analyzes some significant studies in detail.The performances of some methods are evaluated through the experiment results,and the latest binarization technologies and hardware acceleration methods are tracked.We first give the background of designing BNNs and present the typical types of BNNs.The FPGA implementation technologies of BNNs are then reviewed.Detailed comparison with experimental evaluation on typical BNNs and their FPGA implementation is further conducted.Finally,certain interesting directions are also illustrated as future work.
基金The work was supported by the HeGaoJi Program of China under Grant Nos.2017ZX01028103-002 and 2017ZX01038104-002the National Natural Science Foundation of China under Grant No.61472432.
文摘Neuromorphic computing is considered to be the future of machine learning,and it provides a new way of cognitive computing.Inspired by the excellent performance of spiking neural networks(SNNs)on the fields of low-power consumption and parallel computing,many groups tried to simulate the SNN with the hardware platform.However,the efficiency of training SNNs with neuromorphic algorithms is not ideal enough.Facing this,Michael et al.proposed a method which can solve the problem with the help of DNN(deep neural network).With this method,we can easily convert a well-trained DNN into an SCNN(spiking convolutional neural network).So far,there is a little of work focusing on the hardware accelerating of SCNN.The motivation of this paper is to design an SNN processor to accelerate SNN inference for SNNs obtained by this DNN-to-SNN method.We propose SIES(Spiking Neural Network Inference Engine for SCNN Accelerating).It uses a systolic array to accomplish the task of membrane potential increments computation.It integrates an optional hardware module of max-pooling to reduce additional data moving between the host and the SIES.We also design a hardware data setup mechanism for the convolutional layer on the SIES with which we can minimize the time of input spikes preparing.We implement the SIES on FPGA XCVU440.The number of neurons it supports is up to 4000 while the synapses are 256000.The SIES can run with the working frequency of 200 MHz,and its peak performance is 1.5625 TOPS.
文摘Based on the characteristics of field-programmable gate array(FPGA) such as multi-task,high-speed,parallelity,etc,a filtering algorithm is presented to filter the output signal of laser gyr o with high-speed and high accuracy.The filter is composed of basic logic cell s,multipliers an d memory inside FPGA.By using an multiplication decomposition method and design ing reliable time-sequence,the filter is realized,which is easy to be transpl anted and of low cost.Furthermore,all the signal demodulation algorithms of t he laser gyr oscope can be integrated in only one FPGA,which reduces the cost and complexity of the system.
基金supported in part by the Shaanxi Province Key R&D Program(2019ZDLGY12-09)in part by the Higher Education Discipline Innovation 111 project(B16037)+1 种基金in part by the Shaanxi innovation team project(2018TD-007)in part by the China National Natural Science Foundation(62102298).
文摘The security of cryptographic algorithms based on integer factorization and discrete logarithm will be threatened by quantum computers in future.Since December 2016,the National Institute of Standards and Technology(NIST)has begun to solicit post-quantum cryptographic(PQC)algorithms worldwide.CRYSTALS-Kyber was selected as the standard of PQC algorithm after 3 rounds of evaluation.Meanwhile considering the large resource consumption of current implementation,this paper presents a lightweight architecture for ASICs and its implementation on FPGAs for prototyping.In this implementation,a novel compact modular multiplication unit(MMU)and compression/decompression module is proposed to save hardware resources.We put forward a specially optimized schoolbook polynomial multiplication(SPM)instead of number theoretic transform(NTT)core for polynomial multiplication,which can reduce about 74%SLICE cost.We also use signed number representation to save memory resources.In addition,we optimize the hardware implementation of the Hash module,which cuts off about 48%of FF consumption by register reuse technology.Our design can be implemented on Kintex-7(XC7K325T-2FFG900I)FPGA for prototyping,which occupations of 4777/4993 LUTs,2661/2765 FFs,1395/1452 SLICEs,2.5/2.5 BRAMs,and 0/0 DSP respective of client/server side.The maximum clock frequency can reach at 244 MHz.As far as we know,our design consumes the least resources compared with other existing designs,which is very friendly to resource-constrained devices.
文摘Piezoelectric resonators are widely used in frequency reference devices, mass sensors, resonant sensors(such as gyros and accelerometers), etc. Piezoelectric resonators usually work in a special resonant mode. Obtaining working resonant mode with high quality is key to improve the performance of piezoelectric resonators. In this paper, the resonance characteristics of a rectangular lead zirconium titanate(PZT) piezoelectric resonator are studied. On the basis of the field-programmable gate array(FPGA) embedded system, direct digital synthesizer(DDS) and automatic gain controller(AGC) are used to generate the driving signals with precisely adjustable frequency and amplitude. The driving signals are used to excite the piezoelectric resonator to the working vibration mode. The influence of the connection of driving electrodes and voltage amplitude on the vibration of the resonator is studied. The quality factor and vibration linearity of the resonator are studied with various driving methods mentioned in this paper. The resonator reaches resonant mode at 330 kHz by different driving methods.The relationship between resonant amplitude and driving signal amplitude is linear. The quality factor reaches over 150 by different driving methods. The results provide a theoretical reference for the efficient excitation of the piezoelectric resonator.
基金Project supported by the Shanghai Leading Academic Discipline Project(Grant No.J50103)the Natural Science Foundation of Jiangxi Province(Grant No.2010GZS0031)the Science Technology Project of Jiangxi Province(Grant No.2010BGB00604)
文摘As a coprocessor, field-programmable gate array (FPGA) is the hardware computing processor accelerating the computing capacity of coraputers. To efficiently manage the hardware free resources for the placing of tasks on FPGA and take full advantage of the partially reconfigurable units, good utilization of chip resources is an important and necessary work. In this paper, a new method is proposed to find the complete set of maximal free resource rectangles based on the cross point of edge lines of running tasks on FPGA area, and the prove process is provided to make sure the correctness of this method.
基金supported by the Natural Science Foundation of Fujian Province(No.2020J01845)the National Natural Science Foundation of China(Nos.61772005 and 11871280)+1 种基金the Outstanding Youth Innovation Team Project for Universities of Shandong Province(No.2020KJN008)Qinglan Project.
文摘Emerging applications widely use field-programmable gate array(FPGA)prototypes as a tool to verify modern very-large-scale integration(VLSI)circuits,imposing many problems,including routing failure caused by the limited number of connections among blocks of FPGAs therein.Such a shortage of connections can be alleviated through time-division multiplexing(TDM),by which multiple signals sharing an identical routing channel can be transmitted.In this context,the routing quality dominantly decides the performance of such systems,proposing the requirement of minimizing the signal delay between FPGA pairs.This paper proposes algorithms for the routing problem in a multi-FPGA system with TDM support,aiming to minimize the maximum TDM ratio.The algorithm consists of two major stages:(1)A method is proposed to set the weight of an edge according to how many times it is shared by the routing requirements and consequently to compute a set of approximate minimum Steiner trees.(2)A ratio assignment method based on the edge-demand framework is devised for assigning ratios to the edges respecting the TDM ratio constraints.Experiments were conducted against the public benchmarks to evaluate our proposed approach as compared with all published works,and the results manifest that our method achieves a better TDM ratio in comparison.
基金supported by the National Council for Technological and Scientific Development of Brazil (RN82/2008)
文摘Multiuser detection can be described as a quadratic optimization problem with binary constraint. Many techniques are available to find approximate solution to this problem. These tech- niques can be characterized in terms of complexity and detection performance. The "efficient frontier" of known techniques include the decision-feedback, branch-and-bound and probabilistic data association detectors. The presented iterative multiuser detection technique is based on joint deregularized and box-constrained so- lution to quadratic optimization with iterations similar to that used in the nonstationary Tikhonov iterated algorithm. The deregulari- zation maximizes the energy of the solution, this is opposite to the Tikhonov regularization where the energy is minimized. However, combined with box-constraints, the deregularization forces the solution to be close to the binary set. We further exploit the box- constrained dichotomous coordinate descent (DCD) algorithm and adapt it to the nonstationary iterative Tikhonov regularization to present an efficient detector. As a result, the worst-case and aver- age complexity are reduced down to K28 and K2~ floating point operation per second, respectively. The development improves the "efficient frontier" in multiuser detection, which is illustrated by simulation results. Finally, a field programmable gate array (FPGA) design of the detector is presented. The detection performance obtained from the fixed-point FPGA implementation shows a good match to the floating-point implementation.
基金Supported by the Spark Program of China(No.2013GA780007)Key Scientific Research Project of Guandong Agriculture Industry Business Polytechnic(No.xyzd1604)
文摘In accordance with the application requirements of high definition(HD) video surveillance systems,a real-time 5/3 lifting wavelet HD-video de-noising system is proposed with frame rate conversion(FRC) based on a field-programmable gate array(FPGA),which uses a 3-level pipeline paralleled 5/3 lifting wavelet transformation and reconstruction structure,as well as a fast BayesS hrink adaptive threshold filtering module.The proposed system demonstrates de-noising performance,while also balancing system resources and achieving real-time processing.The experiments show that the proposed system's maximum operating frequency(through logic synthesis and layout using Quartus 13.1 software) can reach 178 MHz,based on the Altera Company's Stratix III EP3SE80 series FPGA.The proposed system can also satisfy real-time de-noising requirements of 1920 × 1080 at60 fps HD-video sources,while also significantly improving the peak signal to noise rate of the denoising images.Compared with similar systems,the system has the advantages of high operating frequency,and the ability to support multiple source formats for real-time processing.
文摘This is a paper about laser gyro sign a l processing circuit which is designed based on field-programmable gate array(FPGA) and digital signal processor(DSP).Through a pre-amplifier circuit,FPGA and DSP,a weak current signal is converted and transferred,then sent to the computer to display the final results.Through the laser gyro performance te sting,the obtained results coincide with those of the existing methods.Thus th e d esigned circuit realizes the function of laser gyro signal processing.
基金supported by the National Natural Science Foundation of China (Grant No. 62071142)the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology (Grant No. HIT.NSRIF.2020077)。
文摘Chaotic systems are an effective tool for various applications, including information security and internet of things. Many chaotic systems may have the weaknesses of incomplete output distributions, discontinuous chaotic regions, and simple chaotic behaviors.These may result in many negative influences in practical applications utilizing chaos. To deal with these issues, this study introduces a modular chaotification model(MCM) to increase the dynamic properties of current one-dimensional(1 D) chaotic maps. To exhibit the effect of the MCM, three 1 D chaotic maps are improved using the MCM as examples. Studies of the resulting properties show the robust and complex dynamics of these improved chaotic maps. Moreover, we implement these improved chaotic maps of MCM in a field-programmable gate array hardware platform and apply them to the application of PRNG. Performance analyses verify that these chaotic maps improved by the MCM have more complicated chaotic behaviors and wider chaotic ranges than the existing and several new chaotic maps.
基金Project supported by the Scientific Research Fund of Hunan Provincial Education Department,China (Nos. 19A072 and 20C0268)the Science and Technology Innovation Program of Hunan Province,China (No. 2016TP1020)+2 种基金the Application-Oriented Special Disciplines,Double First-Class University Project of Hunan Province,China (No. Xiangjiaotong [2018] 469)the Scienceof Hengyang Normal University,China (No. 18D23)the Postgraduate Scientific Research Innovation Project of Hunan Province,China (No. CX20190980)。
文摘In this era of pervasive computing, low-resource devices have been deployed in various fields. PRINCE is a lightweight block cipher designed for low latency, and is suitable for pervasive computing applications. In this paper, we propose new circuit structures for PRINCE components by sharing and simplifying logic circuits, to achieve the goal of using a smaller number of logic gates to obtain the same result. Based on the new circuit structures of components and the best sharing among components,we propose three new hardware architectures for PRINCE. The architectures are simulated and synthesized on different programmable gate array devices. The results on Virtex-6 show that compared with existing architectures, the resource consumption of the unrolled, low-cost, and two-cycle architectures is reduced by 73, 119, and 380 slices, respectively. The low-cost architecture costs only 137 slices. The unrolled architecture costs 409 slices and has a throughput of 5.34 Gb/s. To our knowledge, for the hardware implementation of PRINCE, the new low-cost architecture sets new area records, and the new unrolled architecture sets new throughput records. Therefore, the newly proposed architectures are more resource-efficient and suitable for lightweight,latency-critical applications.