In order to obtain variable characteristics,the digital filter's type,number of taps and coefficients should be changed constantly such that the desired frequency-domain characteristics can be obtained.This paper ...In order to obtain variable characteristics,the digital filter's type,number of taps and coefficients should be changed constantly such that the desired frequency-domain characteristics can be obtained.This paper proposes a method for self-programmable variable digital filter(VDF) design based on field programmable gate array(FPGA).We implement a digital filter system by using custom embedded micro-processor,programmable finite impulse response(P-FIR) macro module,coefficient-loader,clock manager and analog/digital(A/D) or digital/analog(D/A) controller and other modules.The self-programmable VDF can provide the best solution for realization of digital filter algorithms,which are the low-pass,high-pass,band-pass and band-stop filter algorithms with variable frequency domain characteristics.The design examples with minimum 1 to maximum 32 taps FIR filter,based on Modelsim post-routed simulation and onboard running on XUPV5-LX110T,are provided to demonstrate the effectiveness of the proposed method.展开更多
A high-performance digital servo system built on the platform of a field programmable gate array (FPGA),a fully digitized hardware design scheme of a direct torque control (DTC) and a low speed permanent magnet synchr...A high-performance digital servo system built on the platform of a field programmable gate array (FPGA),a fully digitized hardware design scheme of a direct torque control (DTC) and a low speed permanent magnet synchronous motor (PMSM) is proposed. The DTC strategy of PMSM is described with Verilog hardware description language and is employed on-chip FPGA in accordance with the electronic design automation design methodology. Due to large torque ripples in low speed PMSM,the hysteresis controller in a conventional PMSM DTC was replaced by a fuzzy controller. This FPGA scheme integrates the direct torque controller strategy,the time speed measurement algorithm,the fuzzy regulating technique and the space vector pulse width modulation principle. Experimental results indicate the fuzzy controller can provide a controllable speed at 20 r min-1 and torque at 330 N m with satisfactory dynamic and static performance. Furthermore,the results show that this new control strategy decreases the torque ripple drastically and enhances control performance.展开更多
In this paper,analyzed is the symbol synchronization algorithm in orthogonal frequency division multiplex(OFDM)system,and accomplished are the hardware circuit design of coarse and elaborate synchronization algorithms...In this paper,analyzed is the symbol synchronization algorithm in orthogonal frequency division multiplex(OFDM)system,and accomplished are the hardware circuit design of coarse and elaborate synchronization algorithms.Based on the analysis of coarse and elaborate synchronization algorithms,multiplexed are,the module accumulator,division and output judgement,which can evidently save the hardware resource cost.The analysis of circuit sequence and wave form simulation of the design scheme shows that the proposed method efficiently reduce system resources and power consumption.展开更多
This paper describes a maximum time difference pipelined arithmetic chip,the 36-bit adder and subtractor based on 1.5 μm CMOS gate array The chipcan operate at 60MHz, and consumes less than 0.5Wat. The results are al...This paper describes a maximum time difference pipelined arithmetic chip,the 36-bit adder and subtractor based on 1.5 μm CMOS gate array The chipcan operate at 60MHz, and consumes less than 0.5Wat. The results are alsostudied, and a more precise model of delay time dmerence is proposed.展开更多
Emerging applications widely use field-programmable gate array(FPGA)prototypes as a tool to verify modern very-large-scale integration(VLSI)circuits,imposing many problems,including routing failure caused by the limit...Emerging applications widely use field-programmable gate array(FPGA)prototypes as a tool to verify modern very-large-scale integration(VLSI)circuits,imposing many problems,including routing failure caused by the limited number of connections among blocks of FPGAs therein.Such a shortage of connections can be alleviated through time-division multiplexing(TDM),by which multiple signals sharing an identical routing channel can be transmitted.In this context,the routing quality dominantly decides the performance of such systems,proposing the requirement of minimizing the signal delay between FPGA pairs.This paper proposes algorithms for the routing problem in a multi-FPGA system with TDM support,aiming to minimize the maximum TDM ratio.The algorithm consists of two major stages:(1)A method is proposed to set the weight of an edge according to how many times it is shared by the routing requirements and consequently to compute a set of approximate minimum Steiner trees.(2)A ratio assignment method based on the edge-demand framework is devised for assigning ratios to the edges respecting the TDM ratio constraints.Experiments were conducted against the public benchmarks to evaluate our proposed approach as compared with all published works,and the results manifest that our method achieves a better TDM ratio in comparison.展开更多
High performance computer is often required by model predictive control(MPC) systems due to the heavy online computation burden.To extend MPC to more application cases with low-cost computation facilities, the impleme...High performance computer is often required by model predictive control(MPC) systems due to the heavy online computation burden.To extend MPC to more application cases with low-cost computation facilities, the implementation of MPC controller on field programmable gate array(FPGA) system is studied.For the dynamic matrix control(DMC) algorithm,the main design idea and the implemental strategy of DMC controller are introduced based on a FPGA’s embedded system.The performance tests show that both the computation efficiency and the accuracy of the proposed controller can be satisfied due to the parallel computing capability of FPGA.展开更多
In this paper, the feasibility of embedding the direct torque control (DTC) of an induction machine into field programmable gate arrays (FPGA) is investigated. DTC of an induction machine is simulated in a MATLAB/...In this paper, the feasibility of embedding the direct torque control (DTC) of an induction machine into field programmable gate arrays (FPGA) is investigated. DTC of an induction machine is simulated in a MATLAB/Simulink environment using a Xilinx system generator. The resulting design has a flexible and modular structure where the designer can customize the hardware blocks by changing the number of inputs, outputs, and algorithm when it is compared to the designs implemented using classical microcontrollers and digital signal processors. With its flexibility, other control algorithms can easily be programmed and embedded into the FPGA. The above system has been implemented on Xilinx Spartan 3A DSP FPGA controller. Simulation and experimentation have been performed to prove the validity of the proposed methodology.展开更多
We present a novel method to implement the radix-2 fast Fourier transform (FFT) algorithm on field programmable gate arrays (FPGA).The FFT architecture exploits parallelism by having more pipelined units in the stages...We present a novel method to implement the radix-2 fast Fourier transform (FFT) algorithm on field programmable gate arrays (FPGA).The FFT architecture exploits parallelism by having more pipelined units in the stages,and more parallel units within a stage.It has the noticeable advantages of high speed and more efficient resource utilization by employing four ganged butterfly engines (GBEs),and can be well matched to the placement of the resources on the FPGA.We adopt the decimation-infrequency (DIF) radix-2 FFT algorithm and implement the FFT processor on a state-of-the-art FPGA.Experimental results show that the processor can compute 1024-point complex radix-2 FFT in about 11 μs with a clock frequency of 200 MHz.展开更多
Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for...Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.展开更多
The drive towards shorter design cycles for analog integrated circuits has given impetus to the development of Field Programmable Analog Arrays(FPAAs),which are the analogue counterparts of Field Programmable Gate Arr...The drive towards shorter design cycles for analog integrated circuits has given impetus to the development of Field Programmable Analog Arrays(FPAAs),which are the analogue counterparts of Field Programmable Gate Arrays(FPGAs).In this paper,we present a new design methodology which using FPAA as a powerful analog front-end processing platform in the smart sensory microsystem.The proposed FPAA contains 16 homogeneous mixed-grained Configurable Analog Blocks(CABs) which house a variety of processing elements especially the proposed fine-grained Core Configurable Amplifiers(CCAs).The high flexible CABs allow the FPAA operating in both continuous-time and discrete-time approaches suitable to support variety of sensors.To reduce the nonideal parasitic effects and save area,the fat-tree interconnection network is adopted in this FPAA.The functionality of this FPAA is demonstrated through embedding of voltage and capacitive sensor signal readout circuits and a configurable band pass filter.The minimal detectable voltage and capacitor achieves 38 uV and 8.3 aF respectively within 100 Hz sensor bandwidth.The power consumption comparison of CCA in three applications shows that the FPAA has high power efficiency.And the simulation results also show that the FPAA has good tolerance with wide PVT variations.展开更多
With the development of silicon photomultiplier(SiPM)technology,front-end electronics for SiPM signal processing have been highly sought after in various fields.A compact 64-channel front-end electronics(FEE)system ac...With the development of silicon photomultiplier(SiPM)technology,front-end electronics for SiPM signal processing have been highly sought after in various fields.A compact 64-channel front-end electronics(FEE)system achieved by fieldprogrammable gate array-based charge-to-digital converter(FPGA-QDC)technology was built and developed.The FEE consists of an analog board and FPGA board.The analog board incorporates commercial amplifiers,resistors,and capacitors.The FPGA board is composed of a low-cost FPGA.The electronics performance of the FEE was evaluated in terms of noise,linearity,and uniformity.A positron emission tomography(PET)detector with three different readout configurations was designed to validate the readout capability of the FEE for SiPM-based detectors.The PET detector was made of a 15×15 lutetium–yttrium oxyorthosilicate(LYSO)crystal array directly coupled with a SiPM array detector.The experimental results show that FEE can process dual-polarity charge signals from the SiPM detectors.In addition,it shows a good energy resolution for 511-keV gamma photons under the dual-end readout for the LYSO crystal array irradiated by a Na-22 source.Overall,the FEE based on FPGA-QDC shows promise for application in SiPM-based radiation detectors.展开更多
As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories ...As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories and autonomous driving.Due to the requirement for a large amount of storage space and computing resources,DNNs are unfavorable for resource-constrained edge computing devices,especially for mobile terminals with scarce energy supply.Binarization of DNN has become a promising technology to achieve a high performance with low resource consumption in edge computing.Field-programmable gate array(FPGA)-based acceleration can further improve the computation efficiency to several times higher compared with the central processing unit(CPU)and graphics processing unit(GPU).This paper gives a brief overview of binary neural networks(BNNs)and the corresponding hardware accelerator designs on edge computing environments,and analyzes some significant studies in detail.The performances of some methods are evaluated through the experiment results,and the latest binarization technologies and hardware acceleration methods are tracked.We first give the background of designing BNNs and present the typical types of BNNs.The FPGA implementation technologies of BNNs are then reviewed.Detailed comparison with experimental evaluation on typical BNNs and their FPGA implementation is further conducted.Finally,certain interesting directions are also illustrated as future work.展开更多
In order to improve the bias stability of the micro-electro mechanical system(MEMS) gyroscope and reduce the impact on the bias from environmental temperature,a digital signal processing method is described for impr...In order to improve the bias stability of the micro-electro mechanical system(MEMS) gyroscope and reduce the impact on the bias from environmental temperature,a digital signal processing method is described for improving the accuracy of the drive phase in the gyroscope drive mode.Through the principle of bias signal generation,it can be concluded that the deviation of the drive phase is the main factor affecting the bias stability.To fulfill the purpose of precise drive phase control,a digital signal processing circuit based on the field-programmable gate array(FPGA) with the phase-lock closed-loop control method is described and a demodulation method for phase error suppression is given.Compared with the analog circuit,the bias drift is largely reduced in the new digital circuit and the bias stability is improved from 60 to 19 °/h.The new digital control method can greatly increase the drive phase accuracy,and thus improve the bias stability.展开更多
The design of an FPGA( field programmable gate array) based programmable SONET (synchronous optical network) OC-192 10 Gbit/s PRBS (pseudo-random binary sequence) generator and a bit interleaved polarity 8 (BI...The design of an FPGA( field programmable gate array) based programmable SONET (synchronous optical network) OC-192 10 Gbit/s PRBS (pseudo-random binary sequence) generator and a bit interleaved polarity 8 (BIP-8) error detector is presented. Implemented in a parallel feedback configuration, this tester features PRBS generation of sequences with bit lengths of 2^7 - 1,2^10- 1,2^15 - 1,2^23 - land 2^31 - 1 for up to 10 Gbit/s applications with a 10 Gbit/s optical transceiver, via the SFI-4 (OC-192 serdes-framer interface). In the OC-192 frame alignment circuit, a dichotomy search algorithm logic which performs the functions of word alignment and STM-64/OC192 de-frame speeds up the frame sync logic and reduces circuit complexity greatly. The system can be used as a low cost tester to evaluate the performance of OC-192 devices and components, taking the replacement of precious commercial PRBS testers.展开更多
The intellectual property (IP) core for inter-integrated circuit (IIC) bus controller is designed using finite state machine (FSM) based on field programmable gate array (FPGA). Not only the data from AT 24C02...The intellectual property (IP) core for inter-integrated circuit (IIC) bus controller is designed using finite state machine (FSM) based on field programmable gate array (FPGA). Not only the data from AT 24C02C can be read automatically after power on, but also the data from upper computer can be written into AT24C02C immediately under the control of the IIC bus controller. When it is applied to blast wave overpressure test system, the IIC bus controller can read and store working parameters automatically. In a laboratory environment, the IP core simulation is carried out and the result is accurate. In the explosion field test, by analyzing the obtained valid data, it can be concluded that the designed IP core has good reliability.展开更多
基金Science &Technology Plan Foundation of Hunan Province,China(No.2010F3102)Science Research Foundation of Hunan Province,China(No.08C392)
文摘In order to obtain variable characteristics,the digital filter's type,number of taps and coefficients should be changed constantly such that the desired frequency-domain characteristics can be obtained.This paper proposes a method for self-programmable variable digital filter(VDF) design based on field programmable gate array(FPGA).We implement a digital filter system by using custom embedded micro-processor,programmable finite impulse response(P-FIR) macro module,coefficient-loader,clock manager and analog/digital(A/D) or digital/analog(D/A) controller and other modules.The self-programmable VDF can provide the best solution for realization of digital filter algorithms,which are the low-pass,high-pass,band-pass and band-stop filter algorithms with variable frequency domain characteristics.The design examples with minimum 1 to maximum 32 taps FIR filter,based on Modelsim post-routed simulation and onboard running on XUPV5-LX110T,are provided to demonstrate the effectiveness of the proposed method.
基金the Natural Science Foundation of Hubei Province (No.2005ABA301)
文摘A high-performance digital servo system built on the platform of a field programmable gate array (FPGA),a fully digitized hardware design scheme of a direct torque control (DTC) and a low speed permanent magnet synchronous motor (PMSM) is proposed. The DTC strategy of PMSM is described with Verilog hardware description language and is employed on-chip FPGA in accordance with the electronic design automation design methodology. Due to large torque ripples in low speed PMSM,the hysteresis controller in a conventional PMSM DTC was replaced by a fuzzy controller. This FPGA scheme integrates the direct torque controller strategy,the time speed measurement algorithm,the fuzzy regulating technique and the space vector pulse width modulation principle. Experimental results indicate the fuzzy controller can provide a controllable speed at 20 r min-1 and torque at 330 N m with satisfactory dynamic and static performance. Furthermore,the results show that this new control strategy decreases the torque ripple drastically and enhances control performance.
基金Guangdong Province Science and Technology Guiding Project(2005B10101013)
文摘In this paper,analyzed is the symbol synchronization algorithm in orthogonal frequency division multiplex(OFDM)system,and accomplished are the hardware circuit design of coarse and elaborate synchronization algorithms.Based on the analysis of coarse and elaborate synchronization algorithms,multiplexed are,the module accumulator,division and output judgement,which can evidently save the hardware resource cost.The analysis of circuit sequence and wave form simulation of the design scheme shows that the proposed method efficiently reduce system resources and power consumption.
文摘This paper describes a maximum time difference pipelined arithmetic chip,the 36-bit adder and subtractor based on 1.5 μm CMOS gate array The chipcan operate at 60MHz, and consumes less than 0.5Wat. The results are alsostudied, and a more precise model of delay time dmerence is proposed.
基金supported by the Natural Science Foundation of Fujian Province(No.2020J01845)the National Natural Science Foundation of China(Nos.61772005 and 11871280)+1 种基金the Outstanding Youth Innovation Team Project for Universities of Shandong Province(No.2020KJN008)Qinglan Project.
文摘Emerging applications widely use field-programmable gate array(FPGA)prototypes as a tool to verify modern very-large-scale integration(VLSI)circuits,imposing many problems,including routing failure caused by the limited number of connections among blocks of FPGAs therein.Such a shortage of connections can be alleviated through time-division multiplexing(TDM),by which multiple signals sharing an identical routing channel can be transmitted.In this context,the routing quality dominantly decides the performance of such systems,proposing the requirement of minimizing the signal delay between FPGA pairs.This paper proposes algorithms for the routing problem in a multi-FPGA system with TDM support,aiming to minimize the maximum TDM ratio.The algorithm consists of two major stages:(1)A method is proposed to set the weight of an edge according to how many times it is shared by the routing requirements and consequently to compute a set of approximate minimum Steiner trees.(2)A ratio assignment method based on the edge-demand framework is devised for assigning ratios to the edges respecting the TDM ratio constraints.Experiments were conducted against the public benchmarks to evaluate our proposed approach as compared with all published works,and the results manifest that our method achieves a better TDM ratio in comparison.
基金the National Science Foundation of China(Nos.60934007 and 61074060)the Postdoctoral Science Foundation of China(No.20090460627)+2 种基金the Postdoctoral Scientific Program of Shanghai (No.10R21414600)the Specialized Research Fund for the Doctoral Program of Higher Education (No.20070248004)the China Postdoctoral Science Foundation Special Support(No.201003272)
文摘High performance computer is often required by model predictive control(MPC) systems due to the heavy online computation burden.To extend MPC to more application cases with low-cost computation facilities, the implementation of MPC controller on field programmable gate array(FPGA) system is studied.For the dynamic matrix control(DMC) algorithm,the main design idea and the implemental strategy of DMC controller are introduced based on a FPGA’s embedded system.The performance tests show that both the computation efficiency and the accuracy of the proposed controller can be satisfied due to the parallel computing capability of FPGA.
文摘In this paper, the feasibility of embedding the direct torque control (DTC) of an induction machine into field programmable gate arrays (FPGA) is investigated. DTC of an induction machine is simulated in a MATLAB/Simulink environment using a Xilinx system generator. The resulting design has a flexible and modular structure where the designer can customize the hardware blocks by changing the number of inputs, outputs, and algorithm when it is compared to the designs implemented using classical microcontrollers and digital signal processors. With its flexibility, other control algorithms can easily be programmed and embedded into the FPGA. The above system has been implemented on Xilinx Spartan 3A DSP FPGA controller. Simulation and experimentation have been performed to prove the validity of the proposed methodology.
文摘We present a novel method to implement the radix-2 fast Fourier transform (FFT) algorithm on field programmable gate arrays (FPGA).The FFT architecture exploits parallelism by having more pipelined units in the stages,and more parallel units within a stage.It has the noticeable advantages of high speed and more efficient resource utilization by employing four ganged butterfly engines (GBEs),and can be well matched to the placement of the resources on the FPGA.We adopt the decimation-infrequency (DIF) radix-2 FFT algorithm and implement the FFT processor on a state-of-the-art FPGA.Experimental results show that the processor can compute 1024-point complex radix-2 FFT in about 11 μs with a clock frequency of 200 MHz.
基金financially supported by the National Council for Scientific and Technological Development(CNPq,Brazil),Swedish-Brazilian Research and Innovation Centre(CISB),and Saab AB under Grant No.CNPq:200053/2022-1the National Council for Scientific and Technological Development(CNPq,Brazil)under Grants No.CNPq:312924/2017-8 and No.CNPq:314660/2020-8.
文摘Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.
基金Supported by the CAS/SAFEA International Partnership Program for Creative Research Teams,National High Technology Research and Develop Program of China(2012AA012301)National Science and Technology Major Project of China(2013ZX03006004)
文摘The drive towards shorter design cycles for analog integrated circuits has given impetus to the development of Field Programmable Analog Arrays(FPAAs),which are the analogue counterparts of Field Programmable Gate Arrays(FPGAs).In this paper,we present a new design methodology which using FPAA as a powerful analog front-end processing platform in the smart sensory microsystem.The proposed FPAA contains 16 homogeneous mixed-grained Configurable Analog Blocks(CABs) which house a variety of processing elements especially the proposed fine-grained Core Configurable Amplifiers(CCAs).The high flexible CABs allow the FPAA operating in both continuous-time and discrete-time approaches suitable to support variety of sensors.To reduce the nonideal parasitic effects and save area,the fat-tree interconnection network is adopted in this FPAA.The functionality of this FPAA is demonstrated through embedding of voltage and capacitive sensor signal readout circuits and a configurable band pass filter.The minimal detectable voltage and capacitor achieves 38 uV and 8.3 aF respectively within 100 Hz sensor bandwidth.The power consumption comparison of CCA in three applications shows that the FPAA has high power efficiency.And the simulation results also show that the FPAA has good tolerance with wide PVT variations.
基金supported by the Natural Science Foundation of Shandong Province (No. ZR2022QA039)the Program of Qilu Young Scholars of Shandong University
文摘With the development of silicon photomultiplier(SiPM)technology,front-end electronics for SiPM signal processing have been highly sought after in various fields.A compact 64-channel front-end electronics(FEE)system achieved by fieldprogrammable gate array-based charge-to-digital converter(FPGA-QDC)technology was built and developed.The FEE consists of an analog board and FPGA board.The analog board incorporates commercial amplifiers,resistors,and capacitors.The FPGA board is composed of a low-cost FPGA.The electronics performance of the FEE was evaluated in terms of noise,linearity,and uniformity.A positron emission tomography(PET)detector with three different readout configurations was designed to validate the readout capability of the FEE for SiPM-based detectors.The PET detector was made of a 15×15 lutetium–yttrium oxyorthosilicate(LYSO)crystal array directly coupled with a SiPM array detector.The experimental results show that FEE can process dual-polarity charge signals from the SiPM detectors.In addition,it shows a good energy resolution for 511-keV gamma photons under the dual-end readout for the LYSO crystal array irradiated by a Na-22 source.Overall,the FEE based on FPGA-QDC shows promise for application in SiPM-based radiation detectors.
基金supported by the Natural Science Foundation of Sichuan Province of China under Grant No.2022NSFSC0500the National Natural Science Foundation of China under Grant No.62072076.
文摘As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories and autonomous driving.Due to the requirement for a large amount of storage space and computing resources,DNNs are unfavorable for resource-constrained edge computing devices,especially for mobile terminals with scarce energy supply.Binarization of DNN has become a promising technology to achieve a high performance with low resource consumption in edge computing.Field-programmable gate array(FPGA)-based acceleration can further improve the computation efficiency to several times higher compared with the central processing unit(CPU)and graphics processing unit(GPU).This paper gives a brief overview of binary neural networks(BNNs)and the corresponding hardware accelerator designs on edge computing environments,and analyzes some significant studies in detail.The performances of some methods are evaluated through the experiment results,and the latest binarization technologies and hardware acceleration methods are tracked.We first give the background of designing BNNs and present the typical types of BNNs.The FPGA implementation technologies of BNNs are then reviewed.Detailed comparison with experimental evaluation on typical BNNs and their FPGA implementation is further conducted.Finally,certain interesting directions are also illustrated as future work.
基金The National Natural Science Foundation of China (No.60974116)the Research Fund of Aeronautics Science (No. 20090869007)Specialized Research Fund for the Doctoral Program of Higher Education(No. 200802861063)
文摘In order to improve the bias stability of the micro-electro mechanical system(MEMS) gyroscope and reduce the impact on the bias from environmental temperature,a digital signal processing method is described for improving the accuracy of the drive phase in the gyroscope drive mode.Through the principle of bias signal generation,it can be concluded that the deviation of the drive phase is the main factor affecting the bias stability.To fulfill the purpose of precise drive phase control,a digital signal processing circuit based on the field-programmable gate array(FPGA) with the phase-lock closed-loop control method is described and a demodulation method for phase error suppression is given.Compared with the analog circuit,the bias drift is largely reduced in the new digital circuit and the bias stability is improved from 60 to 19 °/h.The new digital control method can greatly increase the drive phase accuracy,and thus improve the bias stability.
文摘The design of an FPGA( field programmable gate array) based programmable SONET (synchronous optical network) OC-192 10 Gbit/s PRBS (pseudo-random binary sequence) generator and a bit interleaved polarity 8 (BIP-8) error detector is presented. Implemented in a parallel feedback configuration, this tester features PRBS generation of sequences with bit lengths of 2^7 - 1,2^10- 1,2^15 - 1,2^23 - land 2^31 - 1 for up to 10 Gbit/s applications with a 10 Gbit/s optical transceiver, via the SFI-4 (OC-192 serdes-framer interface). In the OC-192 frame alignment circuit, a dichotomy search algorithm logic which performs the functions of word alignment and STM-64/OC192 de-frame speeds up the frame sync logic and reduces circuit complexity greatly. The system can be used as a low cost tester to evaluate the performance of OC-192 devices and components, taking the replacement of precious commercial PRBS testers.
文摘The intellectual property (IP) core for inter-integrated circuit (IIC) bus controller is designed using finite state machine (FSM) based on field programmable gate array (FPGA). Not only the data from AT 24C02C can be read automatically after power on, but also the data from upper computer can be written into AT24C02C immediately under the control of the IIC bus controller. When it is applied to blast wave overpressure test system, the IIC bus controller can read and store working parameters automatically. In a laboratory environment, the IP core simulation is carried out and the result is accurate. In the explosion field test, by analyzing the obtained valid data, it can be concluded that the designed IP core has good reliability.