Funding: Science and Technology Key Project of Guangzhou (2007Z3-D3101); Production and Research Project of Zhuhai (PC20082002); Technology Innovation Project of Guangdong Province (2008778113).
Abstract: This paper introduces an FPGA verification method for application-specific integrated circuit (ASIC) design, based on the Xilinx xc5vlx220 field-programmable gate array (FPGA). First, the basic principles of FPGA verification are introduced. Then the structure of the FPGA board and the verification methods are illustrated. Finally, the workflow of FPGA verification for an audio video coding standard (AVS) decoder and the method of restoring images are described in detail. The FPGA resource occupancy is shown and analyzed. The results show that FPGA verification can validate an ASIC design rapidly and effectively, shortening the development cycle.
Abstract: This paper realizes an MP3 player using a low-speed processor combined with FPGA hardware accelerator SoC units, together with some peripheral devices. The experimental results show that the system implements the basic functions of an MP3 player, with advantages in increased decoding speed and reduced system consumption. The system is convenient to redesign for more functions in the future and therefore has wide application prospects.
Funding: Supported by the National Natural Science Foundation of China (11705191), the Anhui Provincial Natural Science Foundation (1808085QF180), and the Natural Science Foundation of Shanghai (18ZR1443600).
Abstract: This paper proposes a parallel cyclic shift structure for the address decoder to realize a high-throughput encoding and decoding method for irregular quasi-cyclic low-density parity-check (IR-QC-LDPC) codes with a dual-diagonal parity structure. A normalized min-sum algorithm (NMSA) is employed for decoding. The encoding and decoding algorithms are first verified by simulation in Matlab, with code rates of 5/6 and 2/3 selected for initial bit error ratios of 6% and 1.04%, respectively. The simulation results show that multiple code rates can be supported with different basis matrices. The simulated encoder and decoder algorithms are then migrated to and implemented on a field-programmable gate array (FPGA). The FPGA implementation achieves an encoder throughput of 183.36 Mbps and an average decoding throughput of 27.85 Mbps at an initial bit error ratio of 6%.
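As an illustration of the decoding step named in this abstract, the following is a minimal sketch of the check-node update of a normalized min-sum decoder. It is the generic textbook form, not the paper's FPGA architecture; the function name and the normalization factor `alpha=0.75` are assumptions chosen for the example.

```python
def nms_check_node_update(llrs, alpha=0.75):
    """Return the check-to-variable messages for one check node.

    llrs  : incoming variable-to-check log-likelihood ratios (at least two)
    alpha : normalization factor (assumed value, not taken from the paper)
    """
    out = []
    for i in range(len(llrs)):
        # Each outgoing message excludes the edge it is sent on.
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        magnitude = min(abs(v) for v in others)
        # Min-sum magnitude scaled by the normalization factor.
        out.append(alpha * sign * magnitude)
    return out
```

For example, `nms_check_node_update([2.0, -3.0, 0.5])` returns `[-0.375, 0.375, -1.5]`: each output takes the product of the other signs and the smallest of the other magnitudes, scaled by `alpha`.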
Abstract: Hardware implementation of the normalized minimum sum (NMS) decoding algorithm for low-density parity-check (LDPC) codes is difficult because of the uncertainty of the scale factor. To address this, an NMS decoding algorithm with a variable scale factor is proposed for the near-earth space LDPC code (8177, 7154) in the Consultative Committee for Space Data Systems (CCSDS) standard. The shift characteristics of the field-programmable gate array (FPGA) are used to optimize the quantized data of the check nodes, and the function of the LDPC decoder is finally realized. Simulation and experimental results show that, by adopting the scale factor in the NMS decoding algorithm, the designed FPGA-based LDPC decoder improves decoding performance, simplifies the hardware structure, accelerates convergence, and improves error correction ability.
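One common way to apply a scale factor with shifts instead of a multiplier is sketched below. This is an assumption about how shift-based scaling can look in general, not the paper's actual quantization scheme; the function name and the choice of factors of the form 1 - 2^(-shift) are hypothetical.

```python
def scale_by_shift(magnitude, shift):
    """Approximate magnitude * (1 - 2**-shift) for a non-negative integer.

    Uses only a right shift and a subtraction, which map directly onto
    FPGA logic. shift=2 gives a factor of 0.75, shift=3 gives 0.875.
    """
    return magnitude - (magnitude >> shift)
```

A variable scale factor can then be selected by changing `shift`, for instance per iteration; the selection rule itself is not given in the abstract and is left open here.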
Funding: Supported by the Fundamental Research Funds for the Central Universities (FRF-TP20-062A1) and the Guangdong Basic and Applied Basic Research Foundation (2021A1515110070).
Abstract: This paper presents a software turbo decoder on graphics processing units (GPUs). Unlike previous works, the proposed decoding architecture for turbo codes mainly focuses on the Consultative Committee for Space Data Systems (CCSDS) standard. However, the information frame lengths of the CCSDS turbo codes are not suitable for flexible sub-frame parallelism design. To mitigate this issue, we propose a padding method that inserts several bits before the information frame header. To obtain low latency and high resource utilization, two-level intra-frame parallelism and an efficient data structure are considered. The presented Max-Log-MAP decoder can be adapted to decode Long Term Evolution (LTE) turbo codes with only small modifications. At 10 iterations on an NVIDIA RTX 3070, the proposed CCSDS turbo decoder achieves throughputs of about 150 Mbps and 50 Mbps for code rates 1/6 and 1/2, respectively.
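The padding idea can be sketched as follows: bits are prepended so that the frame length becomes a multiple of the number of sub-frames. The function name, the zero fill value, and the sub-frame count of 32 are assumptions for illustration; the abstract does not state how many bits are inserted or what values they take.

```python
def pad_frame_front(bits, n_subframes, fill=0):
    """Prepend fill bits so len(result) is a multiple of n_subframes.

    fill=0 is an assumed value; the padded bits are known to the decoder
    and would be discarded after decoding.
    """
    pad = (-len(bits)) % n_subframes
    return [fill] * pad + list(bits)


# Example: a 1784-bit information frame split into 32 equal sub-frames.
frame = [1] * 1784
padded = pad_frame_front(frame, 32)
subframe_len = len(padded) // 32
```

With these example numbers, 8 bits are prepended, giving 1792 bits and 32 sub-frames of 56 bits each.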
Funding: Financially supported in part by the National Key R&D Program of China (No. 2018YFB1801402) and in part by Huawei Technologies Co., Ltd.
Abstract: In this paper, we associate mutual information with frame error rate (FER) performance and propose novel quantized decoders for polar codes. Based on the optimal quantizer for binary-input discrete memoryless channels (BDMCs), the proposed decoders quantize the virtual subchannels of polar codes to maximize the mutual information (MMI) between source bits and quantized symbols. The nested structure of polar codes ensures that the MMI quantization can be implemented stage by stage. Simulation results show that the proposed MMI decoders with 4 quantization bits outperform the existing nonuniform quantized decoders that minimize mean-squared error (MMSE) with 4 quantization bits, and yield even better performance than uniform MMI quantized decoders with 5 quantization bits. Furthermore, the proposed 5-bit quantized MMI decoders approach the floating-point decoders with negligible performance loss.
Funding: Funded and supported by the Comprehensive Research Facility for Fusion Technology Program of China (No. 2018-000052-73-01-001228), the HFIPS Director's Fund (No. YZJJKX202301), the Anhui Provincial Major Science and Technology Project (No. 2023z020004), and Task JB22001 from the Anhui Provincial Department of Economic and Information Technology.
Abstract: Electron density in fusion plasma is usually diagnosed using laser-aided interferometers. The phase difference signal obtained after phase demodulation is wrapped, which is also called a fringe jump. A method has been developed to unwrap the phase difference signal in real time using an FPGA, specifically designed to handle fringe jumps in the hydrogen cyanide (HCN) laser interferometer on the EAST superconducting tokamak. The method is designed for a phase demodulator that uses the fast Fourier transform (FFT) method at the front end. Compared with complex mathematical analysis algorithms, it is better suited to hardware implementation, for example on a field-programmable gate array (FPGA). It has been applied to process the phase measurement results of the HCN laser interferometer on EAST in real time. The electron density results give good confidence in the fringe-jump unwrapping method. Possible further application to other laser interferometers, such as the POlarimeter-INTerferometer (POINT) system on the EAST tokamak, is also discussed.
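For readers unfamiliar with phase unwrapping, the sketch below shows the generic sample-by-sample approach: whenever consecutive wrapped samples differ by more than π, a multiple of 2π is added or removed. This is the textbook technique under the assumption that the true phase changes by less than π between samples; it is not the paper's FPGA method, and the function name is hypothetical.

```python
import math


def unwrap_stream(wrapped):
    """Unwrap phases given in radians within (-pi, pi], one sample at a time."""
    out = []
    offset = 0.0   # accumulated multiple of 2*pi
    prev = None    # previous wrapped sample
    for p in wrapped:
        if prev is not None:
            delta = p - prev
            if delta > math.pi:
                offset -= 2 * math.pi   # jumped up across the branch cut
            elif delta < -math.pi:
                offset += 2 * math.pi   # jumped down across the branch cut
        out.append(p + offset)
        prev = p
    return out
```

Only one comparison and one addition are needed per sample, which is why this kind of streaming form is easy to map onto hardware.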
Funding: Supported by the Natural Science Foundation of China under Grant Nos. 62362008, 61973163, 61972345, and U1911401.
Abstract: The globalization of hardware designs and supply chains, as well as the integration of third-party intellectual property (IP) cores, has led to an increased focus from malicious attackers on computing hardware. However, existing defense or detection approaches often require additional circuitry to perform security verification, and are thus constrained by time and resource limitations. Considering the scale of actual engineering tasks and tight project schedules, it is usually difficult to implement protection designs for all modules in field-programmable gate array (FPGA) circuits. Some studies have pointed out that the failure of key modules tends to cause greater damage to the network. Therefore, under limited conditions, protection designs should be prioritized for key modules to improve protection efficiency. We have conducted research on FPGA designs, including single-FPGA and multi-FPGA systems, to identify key modules in FPGA systems. For single-FPGA designs, considering the topological structure, network characteristics, and directionality of FPGA designs, we propose a node importance evaluation method based on the technique for order preference by similarity to an ideal solution (TOPSIS). For multi-FPGA designs, considering the influence of nodes within and between layers, the designs are modeled as an interdependent network, and we propose a method based on connection strength to identify the important modules. Finally, we conduct empirical research using actual FPGA designs as examples. The results indicate that, compared with other traditional indexes, the node importance indexes proposed for the different designs better characterize the importance of nodes.
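The TOPSIS step mentioned in this abstract can be illustrated with a minimal generic sketch. Rows stand for nodes (modules) and columns for importance criteria; which criteria and weights the paper uses is not stated in the abstract, so the inputs here are hypothetical. All criteria are assumed to be benefit criteria (larger is better), each column is assumed to have at least one non-zero entry, and the rows are assumed not to be all identical.

```python
import math


def topsis(matrix, weights):
    """Return a closeness score in [0, 1] for each row (higher is better)."""
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Ideal and anti-ideal solutions per criterion.
    best = [max(r[j] for r in weighted) for j in range(n_crit)]
    worst = [min(r[j] for r in weighted) for j in range(n_crit)]
    scores = []
    for r in weighted:
        d_best = math.sqrt(sum((r[j] - best[j]) ** 2 for j in range(n_crit)))
        d_worst = math.sqrt(sum((r[j] - worst[j]) ** 2 for j in range(n_crit)))
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

Nodes are then ranked by descending score; a node that is best on every criterion scores 1 and one that is worst on every criterion scores 0.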
Funding: Supported by the National Natural Science Foundation of China (No. 42374226), the Natural Science Foundation of Jiangxi Province (Nos. 20232BAB201043 and 20232BCJ23006), a sub-project of the nuclear energy development project of the China National Defense Science and Industry Bureau, "n-γ fusion logging method theory research" (No. 20201192-01), and the Fundamental Science on Radioactive Geology and Exploration Technology Laboratory (No. 2022RGET20).
Abstract: A passive neutron multiplicity measurement device, FH-NCM/S1, based on field-programmable gate arrays (FPGAs), is developed specifically for measuring the mass of plutonium-240 (²⁴⁰Pu) in mixed oxide fuel. FH-NCM/S1 adopts an integrated approach, combining the shift register analysis mode with the pulse-position timestamp mode using an FPGA. The optimal effective length of the ³He neutron detector was determined to be 30 cm, and the thickness of the graphite reflector was ascertained to be 15 cm through MCNP simulations. After fabricating the device, calibration measurements were performed using a ²⁵²Cf neutron source; a detection efficiency of 43.07% and a detector die-away time of 55.79 μs were observed. Nine samples of plutonium oxide were measured under identical conditions using the FH-NCM/S1 in shift register analysis mode and a plutonium waste multiplicity counter. The obtained double rates underwent corrections for detection efficiency (ε) and double gate fraction (f_d), resulting in corrected double rates (D_c), which were used to validate the accuracy of the shift register analysis mode. Furthermore, the device exhibited fluctuations in the measurement results; within a single 20 s measurement, these fluctuations remained below 10%. After 30 cycles, the relative error in the mass of ²⁴⁰Pu was less than 5%. Finally, correlation calculations confirmed the robust consistency of both measurement modes. This study holds specific significance for the subsequent design and development of neutron multiplicity devices.
Funding: Financially supported by the National Council for Scientific and Technological Development (CNPq, Brazil), the Swedish-Brazilian Research and Innovation Centre (CISB), and Saab AB under Grant No. CNPq: 200053/2022-1, and by the National Council for Scientific and Technological Development (CNPq, Brazil) under Grants No. CNPq: 312924/2017-8 and No. CNPq: 314660/2020-8.
Abstract: Unmanned aerial vehicles (UAVs) have been widely used in military, medical, wireless communications, aerial surveillance, and other applications. One key topic involving UAVs is pose estimation in autonomous navigation. A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system (GNSS) signal. However, some factors can interfere with the GNSS signal, such as ionospheric scintillation, jamming, or spoofing. One alternative that avoids the GNSS signal is an image processing approach that matches UAV images with georeferenced images, but image edge extraction requires considerable effort. Here a support vector regression (SVR) model is proposed to reduce this computational load and processing time. Dynamic partial reconfiguration (DPR) of part of the SVR datapath is implemented to accelerate the process, reduce the area, and analyze its granularity by increasing the grain size of the reconfigurable region. Results show that the hardware implementation is 68 times faster than the software implementation. The architecture with DPR also achieves a low power consumption of 4 mW, a 57% reduction compared with the design without DPR. This is also the lowest power consumption among current machine learning hardware implementations. In addition, the circuit area is 41 times smaller. SVR with a Gaussian kernel shows a success rate of 99.18% and a minimum square error of 0.0146 when tested with the planned trajectory. This system is useful for adaptive applications where the user or designer can modify or reconfigure the hardware layout during operation, thus contributing to lower power consumption, smaller hardware area, and shorter execution time.
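To make the SVR datapath concrete, the sketch below evaluates the standard prediction function of a trained SVR with a Gaussian (RBF) kernel, f(x) = b + Σ αᵢ·exp(−γ‖x − sᵢ‖²). The function name and all parameter values are hypothetical; the paper's trained model and feature definitions are not given in the abstract.

```python
import math


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate a Gaussian-kernel SVR at the feature vector x.

    support_vectors : list of feature vectors kept after training
    dual_coefs      : one signed coefficient per support vector
    bias, gamma     : model offset and kernel width parameter
    """
    total = bias
    for sv, coef in zip(support_vectors, dual_coefs):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, sv))
        total += coef * math.exp(-gamma * sq_dist)
    return total
```

The loop body (a squared distance, an exponential, and a multiply-accumulate per support vector) is the repeated unit of work that a hardware datapath would implement.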
Abstract: This paper presents a modified FPGA scheme for the convolutional encoder and Viterbi decoder, based on the IEEE 802.11a WLAN standard, for OFDM baseband processing systems. The proposed design supports a generic, robust, and configurable Viterbi decoder with a constraint length of 7, a code rate of 1/2, and a decoding depth of 36 symbols. The Viterbi decoder uses a fully parallel structure to improve the computational speed of the add-compare-select (ACS) modules, adopts an optimal data storage mechanism to avoid overflow, and employs three distributed RAM blocks to complete cyclic trace-back. Its core parts include the state path metric computation, the preservation and transfer of the survivor path, and trace-back decoding. Compared with a general Viterbi decoder, this design reduces chip logic elements by 10% and power consumption by 5%, and improves encoder and decoder performance in the hardware implementation. Lastly, simulation results in Verilog HDL are verified on a Xilinx Virtex-II FPGA using ISE 7.1i. The results show that the Viterbi decoder decodes (2, 1, 7) convolutional codes accurately with a throughput of 80 Mbps.
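The encoder side of a (2, 1, 7) convolutional code can be sketched in a few lines. The generator polynomials 133 and 171 (octal) are those specified by IEEE 802.11a for its rate-1/2, constraint-length-7 code; the abstract itself does not list them, so they are an assumption here, as is the function name. This is a software model only, not the paper's Verilog design.

```python
G0 = 0o133  # generator for output A: taps x[n], x[n-2], x[n-3], x[n-5], x[n-6]
G1 = 0o171  # generator for output B: taps x[n], x[n-1], x[n-2], x[n-3], x[n-6]
K = 7       # constraint length


def conv_encode(bits):
    """Encode a list of 0/1 bits at rate 1/2; returns [A0, B0, A1, B1, ...]."""
    window = 0  # K-bit window, newest input bit in the most significant position
    out = []
    for b in bits:
        window = (b << (K - 1)) | (window >> 1)
        # Each output bit is the parity of the tapped window bits.
        out.append(bin(window & G0).count("1") & 1)
        out.append(bin(window & G1).count("1") & 1)
    return out
```

Feeding a single 1 followed by six 0s reads the two generators out bit by bit, which is a quick sanity check of the tap positions.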