The SubBytes (S-box) transformation is the most crucial operation in the AES algorithm and significantly affects the implementation performance of AES chips. To design a high-performance S-box, this paper proposes a segmented optimization implementation of the S-box based on the composite field inverse operation. The proposed S-box implementation is modeled in Verilog and, once the simulation results are confirmed correct, synthesized using Design Compiler. The synthesis results show that, compared with several current S-box implementation schemes, the proposed implementation significantly reduces the area overhead and critical path delay and therefore achieves higher hardware efficiency. This provides strong support for realizing efficient and compact S-box ASIC designs.
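As a point of reference for what an S-box circuit has to compute, the sketch below builds the standard AES S-box in software: a multiplicative inverse in GF(2^8) followed by the AES affine transform. It inverts directly in GF(2^8) by exponentiation and does not reproduce the composite-field decomposition or the segmented optimization proposed in the paper; the function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:          # reduce when the degree reaches 8
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse via a^254 (0 is mapped to 0, as in AES)."""
    if a == 0:
        return 0
    result, base, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exponent >>= 1
    return result

def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox(x):
    """AES SubBytes for one byte: field inverse, then the affine transform."""
    inv = gf_inv(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63

print(hex(sbox(0x53)))  # 0xed, the worked example in the AES specification
```

A hardware design would typically replace `gf_inv` with combinational logic; the composite-field approach named in the abstract is one way to shrink that logic.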
Chaotic systems have been intensively studied for their roles in many applications, such as cryptography, secure communications, and nonlinear control. However, the limited complexity of existing chaotic systems weakens chaos-based practical applications, so designing chaotic maps with high complexity is attractive. This paper proposes the exponential sine chaotification model (ESCM), a method that uses the exponential sine function as a nonlinear transform model to enhance the complexity of chaotic maps. To verify the performance of the ESCM, we first demonstrate it through theoretical analysis. Then, to exhibit the high efficiency and usability of the ESCM, we apply it to one-dimensional (1D) and multidimensional (MD) chaotic systems. The effects are examined using the Lyapunov exponent, and the enhanced chaotic maps are found to have much more complicated dynamic behaviors than their originals. To validate the simplicity of the ESCM in hardware implementation, we simulate three enhanced chaotic maps using a digital signal processor (DSP). To explore the ESCM in practical application, we apply it to image encryption. The results verify that the ESCM can make previous chaotic maps competitive for use in image encryption.
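For readers unfamiliar with the Lyapunov exponent used above as the measure of complexity, the following minimal sketch estimates it for the classic logistic map. This is not the ESCM, whose formula is not given in the abstract; the choice of map, the function names, and the parameter values are all illustrative assumptions.

```python
import math

def logistic(x, r=4.0):
    """Classic logistic map x -> r*x*(1-x), used here only as a stand-in map."""
    return r * x * (1.0 - x)

def lyapunov_logistic(r=4.0, x0=0.2, transient=1000, steps=100_000):
    """Estimate the Lyapunov exponent as the orbit average of ln|f'(x)|."""
    x = x0
    for _ in range(transient):                      # discard the initial transient
        x = logistic(x, r)
    total = 0.0
    for _ in range(steps):
        derivative = abs(r * (1.0 - 2.0 * x))       # |f'(x)| for the logistic map
        total += math.log(max(derivative, 1e-300))  # guard against log(0)
        x = logistic(x, r)
    return total / steps

print(lyapunov_logistic())  # should be close to ln 2 for r = 4
```

A positive estimate indicates sensitive dependence on initial conditions, which is the sense in which a larger Lyapunov exponent is read as more complex dynamics.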
This article proposes an approach to the formalization of tasks and conditions for the hardware implementation of quasi-continuous observation devices with discrete receivers in remote sensing systems. Observation devices with a matrix are used in medicine, ecology, aerospace photography, and geodesy, among other fields. In discrete receivers, sampling an image into pixels in the matrix receiver decreases the spatial information about the object. To a large extent, these disadvantages can be avoided by using a photosensitive matrix with a regularly changing (controlled) density of elementary receivers (RCDOER-matrix). Currently, there is no substantiation of the tasks and conditions for the hardware implementation of the RCDOER-matrix. The algorithmic formation of a quasi-continuous image for observation devices with the RCDOER-matrix is proposed. The algorithm uses a formal pixel-by-pixel description of the signals in the image. It formalizes the requirements for creating a photosensitive RCDOER-matrix of a certain size, as well as for changing the mechanism for forming and saving a frame with observation results. Applying the developed method will allow the pixel size of the image to be multiplied relative to the pixel size of the RCDOER-matrix. The developed algorithms for the RCDOER-matrix are supplemented by formalizing the tasks that arise when creating prototypes. In addition, conditions for hardware implementation are proposed that ensure complete registration of the observation picture and avoid excessive pixel measurements. Thus, the results of this research bring the RCDOER-matrix closer to practical application.
This article implements maximum power point tracking (MPPT) based on an improved hill-climbing algorithm for photovoltaic (PV) systems feeding resistive loads. A direct current-to-direct current boost converter is inserted between the PV system and the load to achieve matching. The converter is managed using MPPT based on the hill-climbing algorithm. The objective of this paper is to optimize the program code to achieve the best compromise between accuracy and speed by implementing this algorithm on a microcontroller. Two PV systems are tested under identical meteorological conditions. The first uses an improved hill-climbing MPPT controller, whereas the second uses the conventional version. The experimental results show a significant enhancement in speed for the improved algorithm, with a response time of 0.4 s and a power oscillation of 3%; these values remain satisfactory in terms of precision compared with the conventional system studied and with the algorithm from the literature used for comparison.
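The conventional hill-climbing idea referred to above can be sketched in a few lines: perturb the converter duty cycle, measure the power, and reverse direction whenever the power drops. The sketch below is the fixed-step baseline only; the abstract does not describe the paper's improvement, and the power curve `pv_power`, the starting duty cycle, and the step size are hypothetical values chosen for illustration.

```python
def pv_power(duty):
    """Hypothetical PV output power (W) versus duty cycle, peaking at duty = 0.6."""
    return 100.0 - 400.0 * (duty - 0.6) ** 2

def hill_climb_mppt(power_fn, duty=0.3, step=0.01, iterations=100):
    """Fixed-step hill climbing: keep moving while power rises, reverse when it falls."""
    previous_power = power_fn(duty)
    direction = 1
    for _ in range(iterations):
        duty = min(max(duty + direction * step, 0.0), 1.0)  # keep duty within [0, 1]
        power = power_fn(duty)
        if power < previous_power:      # overshot the peak, so turn around
            direction = -direction
        previous_power = power
    return duty

print(round(hill_climb_mppt(pv_power), 2))  # settles within one step of 0.6
```

With a fixed step the operating point keeps oscillating around the maximum, and a larger step converges faster but oscillates more. That is the accuracy versus speed trade-off the abstract mentions.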
For polar codes, the performance of successive cancellation list (SCL) decoding is capable of approaching that of maximum likelihood decoding. However, existing hardware architectures for SCL decoding suffer from high hardware complexity because they calculate L decoding paths simultaneously, which makes them unfriendly to devices with limited logic resources, such as field programmable gate arrays (FPGAs). In this paper, we propose a low-complexity list-serial pipelined hardware architecture for SCL decoding, in which serial calculation and pipelined operation are combined to strike a balance between complexity and latency. Moreover, we employ only one successive cancellation (SC) decoder core without L×L crossbars, and reduce the number of inputs of the metric sorter from 2L to L+2. Finally, the FPGA implementations show that hardware resource consumption is significantly reduced with negligible loss in decoding performance.
Memristors are extensively used to estimate external electromagnetic stimulation and synapses for neurons. In this paper, two distinct scenarios, i.e., an ideal memristor serving as external electromagnetic stimulation and a locally active memristor serving as a synapse, are formulated to investigate the impact of a memristor on a two-dimensional Hindmarsh-Rose neuron model. Numerical simulations show that the neuronal models in the different scenarios have multiple burst firing patterns. The introduction of the memristor makes the neuronal model exhibit complex dynamical behaviors. Finally, the simulation circuit and DSP hardware implementation results validate the physical mechanism, as well as the reliability of the biological neuron model.
Rapid single flux quantum (RSFQ) circuits are a kind of superconducting digital circuit with the properties of a naturally gate-level-pipelined synchronous sequential circuit, offering high energy efficiency and high throughput. We find that the high-throughput and high-speed performance of RSFQ circuits can benefit hardware implementations of encryption algorithms, although RSFQ circuits are rarely applied in this field. Among the available encryption algorithms, the advanced encryption standard (AES) is currently the most widely used symmetric cryptography algorithm. In this work, we aim to demonstrate the SubBytes operation of the AES-128 algorithm using RSFQ circuits based on the SIMIT Nb03 process. We design an AES S-box circuit in RSFQ logic and compare its operational frequency, power dissipation, and throughput with those of a CMOS-based circuit post-simulated in the same structure. The complete RSFQ S-box circuit costs a total of 42237 Josephson junctions, with nearly 130 Gbps throughput at the maximum simulated frequency of 16.28 GHz. Our analysis shows that the frequency and throughput of the RSFQ-based S-box are about four times higher than those of the CMOS-based S-box. Further, we design and fabricate a few typical modules of the S-box. Subsequent measurements demonstrate the correct functioning of the modules at both low and high frequencies up to 28.8 GHz.
This paper describes two single-chip implementations, on complex programmable logic devices/field programmable gate arrays (CPLD/FPGA), of the new advanced encryption standard (AES) algorithm, based on the basic iteration architecture (design [A]) and the hybrid pipelining architecture (design [B]). Design [A] is an encryption-and-decryption implementation based on the basic iteration architecture. This design not only supports 128-bit, 192-bit, and 256-bit keys, but also saves hardware resources through the iteration architecture and sharing technology. Design [B] uses the 2×2 hybrid pipelining architecture. Based on the AES interleaved mode of operation, the design successfully implements the algorithm operating in the feedback mode (cipher block chaining). It not only guarantees the security of encryption/decryption, but also achieves a high data throughput of 1.05 Gb/s. The two designs have been realized on Altera's EP20k300EBC652-1 devices.
The Advanced Encryption Standard (AES) cryptographic algorithm is implemented in cryptographic circuits to ensure a high level of security for any system that requires confidentiality and secure information exchange. Fault attacks, which can extract secret data, are among the most effective physical attacks against hardware implementations of AES. Several AES fault detection schemes against fault injection attacks have been proposed to date. In this paper, to ensure a high level of security against fault injection attacks, a new efficient fault detection scheme based on a modification of the AES architecture is proposed. To this end, the AES 32-bit round is divided into two half rounds, and input and pipeline registers are implemented between them. The proposed scheme is independent of the way the AES is implemented; thus, it can be used to secure both pipelined and iterative architectures. To evaluate the robustness of the proposed fault detection scheme against fault injection attacks, we conduct transient and permanent fault attacks and then determine the fault detection capability, which is about 99.88585% and 99.9069% for transient and permanent faults, respectively. We have modeled the AES fault detection scheme in the VHDL hardware description language and implemented it on an FPGA. The FPGA results demonstrate that our scheme can efficiently protect the AES hardware implementation against fault attacks and can be implemented simply with low complexity. In addition, the FPGA implementation results show the low area overhead, high efficiency, and high working frequency of the proposed AES detection scheme.
To satisfy the ever-increasing energy appetite of massive numbers of battery-powered and batteryless communication devices, radio frequency (RF) signals have been relied upon to transfer wireless power to them. The joint coordination of wireless power transfer (WPT) and wireless information transfer (WIT) yields simultaneous wireless information and power transfer (SWIPT) as well as the data and energy integrated communication network (DEIN). However, although DEIN is a promising technique, little effort has been invested in its hardware implementation. To make DEIN a reality, this paper focuses on the hardware implementation of a DEIN. It first provides a brief tutorial on SWIPT, while summarising the latest hardware designs of WPT transceivers and the existing commercial solutions. Then, a DEIN prototype design with a full protocol stack is elaborated, followed by its performance evaluation.
In this paper, a novel Medium Access Control (MAC) protocol for industrial Wireless Local Area Networks (WLANs) is proposed and studied. The main challenge in industrial automation systems is achieving ultra-low network latency, with a target upper bound on the order of 1 ms, while maintaining high network reliability and availability. The novelty of the proposed wireless MAC protocol lies in its latency performance, which is similar to that of its counterpart in wired industrial LANs. First, the functional design of the MAC protocol is introduced. Then, performance results obtained from a hardware implementation (SystemC and VHDL) on an FPGA platform are presented. Finally, a real-time communication module that achieves the ultra-low latency required in industrial automation is described.
Labeling of connected components is the key operation in target recognition and segmentation in remote sensing images. Conventional connected-component labeling (CCL) algorithms for ordinary optical images are considered time-consuming when processing remote sensing images because of their larger size. This paper proposes a dynamic run-length based CCL algorithm (DyRLC) for large-size, big-granularity sparse remote sensing images, such as space debris images and ship images. In addition, the equivalence matrix method is proposed to help design a pre-processing method that accelerates the resolution of equivalent labels. The results show that our algorithm outperforms the other algorithms by 22.86% in execution time on the space debris image dataset. The proposed algorithm can also be implemented on a field programmable gate array (FPGA) to enable real-time on-board processing.
The Chinese hash algorithm SM3 has been verified to be sufficiently secure, but improper hardware implementation may lead to leakage. A masking scheme for the SM3 algorithm is proposed to ensure the security of the SM3-based Message Authentication Code (MAC). Our scheme is implemented in hardware and utilizes hardware-oriented secure conversion techniques between Boolean and arithmetic masking. A security evaluation on the SAKURA-G FPGA board has been carried out with 2000 power traces from 2000 random plaintexts with random plaintext masks and random key masks. It has been verified that the masked SM3 hardware implementation shows no intermediate value leakage, as expected. Our masked SM3 hardware can resist first-order correlation power attacks (CPA) and collision correlation attacks.
Recently, the trimming soft-output Viterbi algorithm (T-SOVA) has been proposed to reduce the complexity of SOVA for Turbo codes. In its first stage, a dynamic algorithm, the lazy Viterbi algorithm, is used to indicate the minimal metric differences, which poses an obstacle to hardware implementation. This paper proposes a Viterbi algorithm (VA) based T-SOVA to facilitate hardware implementation. In the first stage of our scheme, a modified VA with a regular structure is used to find the maximum likelihood (ML) path and calculate the metric differences. Further, local sorting is introduced to trim the metric differences, which reduces the complexity of the trimming operation. Simulation results and complexity analysis show that the VA based T-SOVA performs as well as the lazy VA based T-SOVA and is easier to implement in hardware.
Chaotic systems are an effective tool for various applications, including information security and the internet of things. Many chaotic systems may have the weaknesses of incomplete output distributions, discontinuous chaotic regions, and simple chaotic behaviors, which may have many negative effects on practical applications that use chaos. To deal with these issues, this study introduces a modular chaotification model (MCM) to improve the dynamic properties of current one-dimensional (1D) chaotic maps. To exhibit the effect of the MCM, three 1D chaotic maps are improved using the MCM as examples. Studies of the resulting properties show the robust and complex dynamics of these improved chaotic maps. Moreover, we implement these improved chaotic maps on a field-programmable gate array hardware platform and apply them to a pseudorandom number generator (PRNG). Performance analyses verify that the chaotic maps improved by the MCM have more complicated chaotic behaviors and wider chaotic ranges than existing chaotic maps and several new ones.
In video applications, real-time image scaling techniques are often required. In this paper, an efficient implementation of a scaling engine based on 4×4 cubic convolution is proposed. Cubic convolution performs better than other traditional interpolation kernels and can also be realized in hardware. The engine is designed to support arbitrary scaling ratios for image resolutions smaller than 2560×1920 pixels and can scale up or down in the horizontal or vertical direction. It is composed of four functional units and five line buffers, which makes it more competitive than conventional architectures. A strict fixed-point strategy is applied to minimize the quantization errors of the hardware realization. Experimental results show that the engine provides better image quality and a comparatively lower hardware cost than reference implementations.
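To make the cubic convolution step concrete, the sketch below evaluates the widely used Keys cubic kernel and applies it along one dimension with four taps; a 4×4 scaler applies the same four-tap filter horizontally and then vertically. The kernel parameter a = -0.5, the floating-point arithmetic, and the border clamping are assumptions for illustration. The abstract does not state the paper's coefficients, and the engine itself uses fixed-point arithmetic.

```python
import math

def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is an assumed, commonly used value."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0

def interpolate_1d(samples, position):
    """Four-tap cubic interpolation of a 1-D sample list at a fractional position."""
    base = math.floor(position)
    t = position - base
    value = 0.0
    for k in range(-1, 3):                                # taps at base-1 .. base+2
        index = min(max(base + k, 0), len(samples) - 1)   # clamp at the borders
        value += samples[index] * cubic_kernel(t - k)
    return value

row = [0.0, 1.0, 2.0, 3.0]
print(interpolate_1d(row, 1.5))  # 1.5: a linear ramp is reproduced exactly
```

An arbitrary scaling ratio then amounts to stepping `position` by the inverse of the ratio for each output pixel.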
We propose a novel high-performance hardware architecture of a processor for elliptic curve scalar multiplication based on the Lopez-Dahab algorithm over GF(2^163) in polynomial basis representation. The processor can perform all the operations using an efficient modular arithmetic logic unit, which includes an addition unit, a squaring unit, and a carefully designed multiplication unit. In the proposed architecture, multiplication, addition, and squaring can be performed in parallel by decomposing the computation. The point addition and point doubling iteration can be performed in six multiplications through optimization and resolution of data dependencies. The implementation results on a Xilinx Virtex-II XC2V6000 FPGA show that the proposed design can perform a random elliptic curve scalar multiplication over GF(2^163) in 34.11 μs, occupying 2821 registers and 13376 LUTs.
This research investigates a digital-to-analog converter (DAC) free architecture for the digital reconfigurable intelligent surface (RIS) system, in which transmission lines are used for reflection coefficient (RC) control to reduce power consumption. In the proposed architecture, a radio frequency (RF) switch based phase shifter is considered. By using a single-pole four-throw (SP4T) switch to simultaneously control the RCs of a group of elements, a 2-bit phase shifter is realized for passive beam steering. A novel modulation scheme is developed to explore cost effectiveness; it approaches the performance of traditional quadrature amplitude modulation (QAM). Specifically, to overcome the limitation of the phase shift bits, joint frequency-shift and phase-rotation operations are applied to the constellation points. The simulation and experimental results demonstrate that the proposed architecture is capable of providing an ideal transmission performance. Moreover, 64- and 256-QAM modulation schemes could be implemented by expanding the elements and phase bits.
An optimization method for error detection and correction (EDAC) circuit design is proposed. The method involves selecting or constructing EDAC codes with low hardware cost, together with an operation scheduling implementation based on a 2-input XOR gate structure and two actions for reducing hardware cells, which can effectively reduce the delay penalties and area costs of the EDAC circuit. A 32-bit EDAC circuit hardware implementation is selected as a prototype, based on a 180 nm process, and its delay penalties and area costs are evaluated. The results show that, for EDAC codes with the same error correction and detection capability, the time penalty and area cost of the EDAC circuitry vary with the parity-check matrix and the hardware implementation. This method can be used as a guide for low-cost radiation-hardened microprocessor EDAC circuit design and for more advanced technologies.
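To illustrate why XOR-gate count and parity-check matrix choice matter in EDAC circuits, the sketch below implements a small Hamming(7,4) single-error-correcting code in which every parity and syndrome bit is a chain of 2-input XORs. It is a toy stand-in, not the 32-bit code or the scheduling method from the paper, and the function names are illustrative.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(codeword):
    """Return (data bits, error position); position 0 means no error was detected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3     # syndrome read as a 1-based bit position
    if position:
        c[position - 1] ^= 1            # flip the single erroneous bit
    return [c[2], c[4], c[5], c[6]], position

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                            # inject a single-bit fault at position 5
print(hamming74_decode(word))           # ([1, 0, 1, 1], 5)
```

Each row of the parity-check matrix becomes one XOR tree in hardware, so the number of ones per row sets both the gate count and the tree depth. This is why two codes with the same correction capability can differ in area and delay.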
The application of cellular neural networks (CNNs) to solving partial differential equations (PDEs) is investigated in this paper. Two kinds of PDEs, the heat conduction equation and Poisson's equation, are considered as typical examples. They can be computed in real time using the CNN, whose hardware is implemented with integrated op-amps. The experimental results show that the hardware performance agrees with that given by computer simulation. Therefore, the CNN is a new and powerful tool for solving PDEs.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 51507023), the Chongqing Municipal Natural Science Foundation (Grant No. cstc2020jcyjmsxm X0726), and the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K202100506).
Funding: Supported in part by the National Key R&D Program of China (No. 2019YFB1803400).
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62061014), Technological Innovation Projects in the Field of Artificial Intelligence in Liaoning Province (Grant No. 2023JH26/10300011), and Basic Scientific Research Projects in the Department of Education of Liaoning Province (Grant No. JYTZD2023021).
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 92164101), the National Natural Science Foundation of China (Grant No. 62171437), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA18000000), the Shanghai Science and Technology Committee (Grant No. 21DZ1101000), and the National Key R&D Program of China (Grant No. 2021YFB0300400).
Abstract: Rapid single flux quantum (RSFQ) circuits are superconducting digital circuits that behave as naturally gate-level-pipelined synchronous sequential circuits, offering high energy efficiency and high throughput. We find that the high-throughput and high-speed performance of RSFQ circuits can benefit hardware implementations of encryption algorithms, yet RSFQ circuits are rarely applied in this field. Among the available encryption algorithms, the advanced encryption standard (AES) is currently the most widely used symmetric cryptography algorithm. In this work, we aim to demonstrate the SubBytes operation of the AES-128 algorithm using RSFQ circuits based on the SIMIT Nb03 process. We design an AES S-box circuit in RSFQ logic, and compare its operational frequency, power dissipation, and throughput with those of a CMOS-based circuit post-simulated with the same structure. The complete RSFQ S-box circuit uses a total of 42237 Josephson junctions and reaches nearly 130 Gbps throughput at the maximum simulated frequency of 16.28 GHz. Our analysis shows that the frequency and throughput of the RSFQ-based S-box are about four times higher than those of the CMOS-based S-box. Further, we design and fabricate a few typical modules of the S-box. Subsequent measurements demonstrate the correct functioning of the modules at both low and high frequencies, up to 28.8 GHz.
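For readers unfamiliar with the SubBytes operation mentioned in this abstract, the following is a minimal software sketch of the standard AES S-box as defined in FIPS 197: a multiplicative inverse in GF(2^8) followed by an affine transform. It is only a reference model for checking values, not the RSFQ circuit structure described by the authors; the function names are illustrative. As a rough consistency check on the reported figures, one 8-bit result per clock cycle at 16.28 GHz would give 8 × 16.28 ≈ 130 Gbps, assuming one byte is produced per cycle.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:          # reduce as soon as the degree reaches 8
            a ^= 0x11B
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254; 0 maps to 0 by AES convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x: int, n: int) -> int:
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox(a: int) -> int:
    """AES SubBytes for one byte: field inverse, then the affine transform."""
    inv = gf_inv(a)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


if __name__ == "__main__":
    print(hex(sbox(0x53)))  # FIPS 197 worked example: 0x53 -> 0xed
```

Hardware designs usually avoid the 254-step exponentiation used here and compute the inverse with composite-field arithmetic or lookup logic instead; the loop is kept only because it is easy to verify.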
Abstract: This paper describes two single-chip complex programmable logic device/field programmable gate array (CPLD/FPGA) implementations of the new advanced encryption standard (AES) algorithm, based on the basic iteration architecture (design [A]) and the hybrid pipelining architecture (design [B]). Design [A] is an encryption-and-decryption implementation based on the basic iteration architecture. This design not only supports 128-bit, 192-bit, and 256-bit keys, but also saves hardware resources because of the iteration architecture and sharing technology. Design [B] uses a 2×2 hybrid pipelining architecture. Based on the AES interleaved mode of operation, the design successfully implements the algorithm operating in feedback mode (cipher block chaining). It not only guarantees the security of encryption/decryption, but also achieves a high data throughput of 1.05 Gb/s. The two designs have been realized on Altera's EP20k300EBC652-1 devices.
Abstract: The Advanced Encryption Standard (AES) cryptographic algorithm is implemented in cryptographic circuits to ensure a high security level for any system that requires confidentiality and secure information exchange. Fault attacks, which can extract secret data, are among the most effective physical attacks against hardware implementations of AES. Several AES fault detection schemes against fault injection attacks have been proposed so far. In this paper, to ensure a high level of security against fault injection attacks, a new efficient fault detection scheme based on a modification of the AES architecture is proposed. The AES 32-bit round is divided into two half rounds, and input and pipeline registers are implemented between them. The proposed scheme is independent of how the AES is implemented, so it can be used to secure both pipelined and iterative architectures. To evaluate the robustness of the proposed fault detection scheme against fault injection attacks, we conduct transient and permanent fault attacks and then determine the fault detection capability; it is about 99.88585% and 99.9069% for transient and permanent faults, respectively. We have modeled the AES fault detection scheme in the VHDL hardware description language and implemented it on an FPGA. The FPGA results demonstrate that our scheme can efficiently protect the AES hardware implementation against fault attacks, and that it can be implemented simply, with low complexity. In addition, the FPGA implementation results show the low area overhead, high efficiency, and high working frequency of the proposed AES detection scheme.
Funding: Financial support of the National Natural Science Foundation of China (NSFC), Nos. U1705263 and 61971102; the GF Innovative Research Program; and the Sichuan Science and Technology Program, No. 2019YJ0194.
Abstract: To satisfy the ever-increasing energy appetite of massive numbers of battery-powered and batteryless communication devices, radio frequency (RF) signals have been relied upon for transferring wireless power to them. The joint coordination of wireless power transfer (WPT) and wireless information transfer (WIT) yields simultaneous wireless information and power transfer (SWIPT) as well as the data and energy integrated communication network (DEIN). However, although DEIN is a promising technique, little effort has been invested in its hardware implementation. To make DEIN a reality, this paper focuses on the hardware implementation of a DEIN. It first provides a brief tutorial on SWIPT, while summarising the latest hardware designs of WPT transceivers and the existing commercial solutions. Then, a DEIN prototype design with a full protocol stack is elaborated, followed by its performance evaluation.
Funding: Funding from the German Federal Ministry for Education and Research (2015-2017) under grant agreement No. 16KIS0179, also referred to as DEAL.
Abstract: In this paper, a novel Medium Access Control (MAC) protocol for industrial Wireless Local Area Networks (WLANs) is proposed and studied. The main challenge in industrial automation systems is achieving ultra-low network latency, with a target upper bound on the order of 1 ms, while maintaining high network reliability and availability. The novelty of the proposed wireless MAC protocol lies in a latency performance similar to that of its counterpart in wired industrial LANs. First, the functional design of the MAC protocol is introduced. Then its performance results, obtained from a hardware implementation (SystemC and VHDL) on an FPGA platform, are presented. Finally, a real-time communication module that achieves the ultra-low latency required in industrial automation is described.
Abstract: Labeling of connected components is the key operation in target recognition and segmentation for remote sensing images. Conventional connected-component labeling (CCL) algorithms for ordinary optical images are considered time-consuming when processing remote sensing images because of their larger size. A dynamic run-length based CCL algorithm (DyRLC) is proposed in this paper for large, coarse-granularity, sparse remote sensing images, such as space debris images and ship images. In addition, the equivalence matrix method is proposed to help design a pre-processing method that accelerates the resolution of equivalent labels. The results show that our algorithm outperforms the other algorithms by 22.86% in execution time on the space debris image dataset. The proposed algorithm can also be implemented on a field-programmable gate array (FPGA) to enable real-time on-board processing.
Funding: Supported by the National Major Program "Core of Electronic Devices, High-End General Chips, and Basis of Software Products" of the Ministry of Industry and Information Technology of China (Nos. 2014ZX01032205, 2014ZX01032401001-Z05), the National Natural Science Foundation of China (No. 61402252), and the "12th Five-Year Plan" National Development Foundation for Cryptological Research (No. MMJJ201401009).
Abstract: The Chinese hash algorithm SM3 has been verified to be secure enough, but an improper hardware implementation may lead to leakage. A masking scheme for the SM3 algorithm is proposed to ensure the security of the SM3-based Message Authentication Code (MAC). Our scheme was implemented in hardware and uses hardware-oriented secure conversion techniques between Boolean and arithmetic masking. A security evaluation on the SAKURA-G FPGA board was carried out with 2000 power traces from 2000 random plaintexts with random plaintext masks and random key masks. It has been verified that the masked SM3 hardware implementation shows no intermediate value leakage, as expected. Our masked SM3 hardware can resist the first-order correlation power attack (CPA) and the collision correlation attack.
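To illustrate what a secure conversion between Boolean and arithmetic masking involves, here is a minimal sketch of Goubin's classic Boolean-to-arithmetic conversion on 32-bit words (the word size SM3 uses). This is a well-known software algorithm used here as a stand-in; the abstract does not say which conversion technique the authors implemented in hardware, and the function name is illustrative. Given a Boolean share x' = x XOR r, it returns A such that x = (A + r) mod 2^32, without ever computing x itself.

```python
import secrets

MASK32 = 0xFFFFFFFF  # work on 32-bit words


def boolean_to_arithmetic(x_masked: int, r: int) -> int:
    """Goubin's conversion: from x_masked = x ^ r to A with (A + r) mod 2^32 == x.

    Every intermediate value is blinded by r or by the fresh random gamma,
    so the unmasked x never appears.
    """
    gamma = secrets.randbits(32)      # fresh randomness for this conversion
    t = x_masked ^ gamma
    t = (t - gamma) & MASK32
    t ^= x_masked
    gamma ^= r
    a = x_masked ^ gamma
    a = (a - gamma) & MASK32
    a ^= t
    return a


if __name__ == "__main__":
    x, r = 0xDEADBEEF, secrets.randbits(32)
    a = boolean_to_arithmetic(x ^ r, r)
    print((a + r) & MASK32 == x)  # the arithmetic shares recombine to x
```

The conversion works because, for a fixed x', the map r -> ((x' XOR r) - r) mod 2^32 is affine over GF(2), so it can be evaluated on randomized inputs and recombined with XORs.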
Funding: Supported by NSAF (Grant No. U1530117) and the National Natural Science Foundation of China (No. 61471022).
Abstract: Recently, the trimming soft-output Viterbi algorithm (T-SOVA) has been proposed to reduce the complexity of SOVA for Turbo codes. In its first stage, a dynamic algorithm, the lazy Viterbi algorithm, is used to indicate the minimal metric differences, which poses an obstacle to hardware implementation. This paper proposes a Viterbi algorithm (VA) based T-SOVA to facilitate hardware implementation. In the first stage of our scheme, a modified VA with a regular structure is used to find the maximum likelihood (ML) path and calculate the metric differences. Further, local sorting is introduced to trim the metric differences, which reduces the complexity of the trimming operation. Simulation results and complexity analysis show that the VA based T-SOVA performs as well as the lazy VA based T-SOVA and is easier to implement in hardware.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62071142) and the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology (Grant No. HIT.NSRIF.2020077).
Abstract: Chaotic systems are an effective tool for various applications, including information security and the internet of things. Many chaotic systems suffer from incomplete output distributions, discontinuous chaotic regions, and simple chaotic behaviors, which can have negative effects in practical applications that use chaos. To deal with these issues, this study introduces a modular chaotification model (MCM) to improve the dynamic properties of current one-dimensional (1D) chaotic maps. To demonstrate the effect of the MCM, three 1D chaotic maps are improved using the MCM as examples. Studies of the resulting properties show the robust and complex dynamics of these improved chaotic maps. Moreover, we implement the improved chaotic maps on a field-programmable gate array hardware platform and apply them to a pseudorandom number generator (PRNG). Performance analyses verify that the chaotic maps improved by the MCM have more complicated chaotic behaviors and wider chaotic ranges than the existing maps and several new chaotic maps.
Funding: Supported by the National High-Tech R&D Program (863) of China (No. 2009AA011706) and the Fundamental Research Funds for the Central Universities (No. KYJD09012).
Abstract: In video applications, real-time image scaling techniques are often required. In this paper, an efficient implementation of a scaling engine based on 4×4 cubic convolution is proposed. Cubic convolution performs better than other traditional interpolation kernels and can also be realized in hardware. The engine is designed to support arbitrary scaling ratios for image resolutions smaller than 2560×1920 pixels and can scale up or down, in the horizontal or vertical direction. It is composed of four functional units and five line buffers, which makes it more competitive than conventional architectures. A strict fixed-point strategy is applied to minimize the quantization errors of the hardware realization. Experimental results show that the engine provides better image quality and a comparatively lower hardware cost than reference implementations.
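As a rough illustration of 4×4 cubic convolution, the sketch below uses the Keys cubic kernel and applies it separably over a 4×4 pixel neighborhood. The kernel parameter a = -0.5, the floating-point arithmetic, and the edge clamping are assumptions made for this example; the abstract does not state the coefficient, and the actual engine uses a fixed-point strategy. Function names are illustrative.

```python
import math


def cubic_kernel(s: float, a: float = -0.5) -> float:
    """Keys cubic convolution kernel; a = -0.5 is an assumed, common choice."""
    s = abs(s)
    if s <= 1.0:
        return (a + 2.0) * s**3 - (a + 3.0) * s**2 + 1.0
    if s < 2.0:
        return a * s**3 - 5.0 * a * s**2 + 8.0 * a * s - 4.0 * a
    return 0.0


def tap_weights(t: float) -> list:
    """Weights of the four taps at offsets -1, 0, +1, +2 for fraction t in [0, 1)."""
    return [cubic_kernel(t + 1.0), cubic_kernel(t), cubic_kernel(t - 1.0), cubic_kernel(t - 2.0)]


def interp_2d(img: list, y: float, x: float) -> float:
    """Interpolate img (list of rows) at (y, x) from a 4x4 neighborhood."""
    iy, ix = math.floor(y), math.floor(x)
    wy, wx = tap_weights(y - iy), tap_weights(x - ix)
    height, width = len(img), len(img[0])
    total = 0.0
    for j in range(4):
        row = min(max(iy - 1 + j, 0), height - 1)      # clamp at image borders
        for i in range(4):
            col = min(max(ix - 1 + i, 0), width - 1)
            total += wy[j] * wx[i] * img[row][col]
    return total


if __name__ == "__main__":
    ramp = [[r + 2 * c for c in range(6)] for r in range(6)]
    print(interp_2d(ramp, 2.5, 2.25))  # a linear ramp is reproduced away from the borders
```

Because the kernel is separable, a hardware engine can interpolate vertically across buffered lines and then horizontally, which is one plausible reason a design like this needs only a handful of line buffers.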
Funding: Supported by the Hi-Tech Research and Development Program (863) of China (No. 2006AA01Z226) and the Research Foundation of Huazhong University of Science and Technology, China (No. 2006Z001B).
Abstract: We propose a novel high-performance hardware architecture of a processor for elliptic curve scalar multiplication based on the Lopez-Dahab algorithm over GF(2^163) in polynomial basis representation. The processor can perform all the operations using an efficient modular arithmetic logic unit, which includes an addition unit, a squaring unit, and a carefully designed multiplication unit. In the proposed architecture, multiplication, addition, and squaring can be performed in parallel by decomposing the computation. The point addition and point doubling operations of each iteration can be performed in six multiplications through optimization and resolution of data dependencies. The implementation results based on a Xilinx Virtex-II XC2V6000 FPGA show that the proposed design can perform a random elliptic curve scalar multiplication over GF(2^163) in 34.11 μs, occupying 2821 registers and 13 376 LUTs.
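The field operations that such a modular arithmetic logic unit provides can be sketched in software as follows. This is a minimal bit-serial model of GF(2^163) in polynomial basis, assuming the NIST reduction polynomial x^163 + x^7 + x^6 + x^3 + 1; the abstract does not name the polynomial, and the function names are illustrative. It does not reproduce the authors' multiplier design or the Lopez-Dahab point formulas.

```python
M = 163
# Assumed reduction polynomial (NIST B-163/K-163): x^163 + x^7 + x^6 + x^3 + 1
POLY = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^m) is a bitwise XOR (no carries)."""
    return a ^ b


def gf_reduce(c: int) -> int:
    """Reduce a polynomial of any degree modulo POLY."""
    while c.bit_length() > M:
        shift = c.bit_length() - 1 - M   # align POLY with the leading term
        c ^= POLY << shift
    return c


def gf_mul(a: int, b: int) -> int:
    """Carry-less shift-and-add multiplication followed by reduction."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a <<= 1
        b >>= 1
    return gf_reduce(acc)


def gf_sqr(a: int) -> int:
    """Squaring; dedicated hardware does this far more cheaply than a multiply."""
    return gf_mul(a, a)


def gf_inv(a: int) -> int:
    """Inverse via Fermat: a^(2^m - 2) = product of a^(2^i) for i = 1..m-1."""
    result, power = 1, a
    for _ in range(M - 1):
        power = gf_sqr(power)
        result = gf_mul(result, power)
    return result


if __name__ == "__main__":
    a = 0x1234567890ABCDEF1234567890ABCDEF12345678
    print(gf_mul(a, gf_inv(a)) == 1)  # a nonzero element times its inverse is 1
```

In projective-coordinate scalar multiplication, inversion is typically needed only once, when converting the result back to affine coordinates, so multiplication throughput tends to dominate the overall latency.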
Funding: Project supported by the National Key R&D Program of China (No. 2019YFB1803400).
Abstract: This research investigates a digital-to-analog converter (DAC) free architecture for the digital reconfigurable intelligent surface (RIS) system, in which transmission lines are used for reflection coefficient (RC) control to reduce power consumption. In the proposed architecture, a radio frequency (RF) switch based phase shifter is considered. By using a single-pole four-throw (SP4T) switch to simultaneously control the RCs of a group of elements, a 2-bit phase shifter is realized for passive beam steering. A novel, cost-effective modulation scheme is developed, which approaches the performance of traditional quadrature amplitude modulation (QAM). Specifically, to overcome the limited number of phase shift bits, joint frequency-shift and phase-rotation operations are applied to the constellation points. The simulation and experimental results demonstrate that the proposed architecture is capable of providing an ideal transmission performance. Moreover, 64- and 256-QAM modulation schemes could be implemented by increasing the number of elements and phase bits.
Abstract: An optimization method for error detection and correction (EDAC) circuit design is proposed. The method involves selecting or constructing EDAC codes with low hardware cost, together with an operation scheduling implementation based on a 2-input XOR gate structure and two measures for reducing hardware cells, which effectively reduce the delay penalties and area costs of the EDAC circuit. A 32-bit EDAC circuit hardware implementation is chosen as a prototype, based on a 180 nm process, and its delay penalties and area costs are evaluated. The results show that, for EDAC codes with the same error detection and correction capability, the delay penalty and area cost of the EDAC circuitry depend on the parity-check matrix and on the hardware implementation chosen. This method can be used as a guide for low-cost radiation-hardened microprocessor EDAC circuit design and for more advanced technologies.
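To make the link between a parity-check matrix and XOR-gate cost concrete, here is a toy sketch using the Hamming(7,4) single-error-correcting code. It is far smaller than the 32-bit EDAC circuit in the abstract and is not the authors' code or scheduling method; the gate count is a naive estimate that assumes no sharing of intermediate XOR terms, and all names are illustrative.

```python
# Parity-check matrix of Hamming(7,4): column j (1-based) is the binary form of j.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(data: list) -> list:
    """Encode 4 data bits into the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def syndrome(word: list) -> list:
    """Each syndrome bit is the XOR of the word bits selected by one row of H."""
    return [sum(h & b for h, b in zip(row, word)) % 2 for row in H]


def correct(word: list) -> list:
    """Fix a single-bit error: the syndrome spells the 1-based error position."""
    s = syndrome(word)
    position = s[0] + 2 * s[1] + 4 * s[2]
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed


def xor_gate_count(matrix: list) -> int:
    """Naive count of 2-input XOR gates for syndrome generation (no sharing)."""
    return sum(sum(row) - 1 for row in matrix)


if __name__ == "__main__":
    codeword = encode([1, 0, 1, 1])
    corrupted = list(codeword)
    corrupted[4] ^= 1                     # inject a single-bit upset
    print(correct(corrupted) == codeword)
    print(xor_gate_count(H))
```

Two parity-check matrices with the same correction capability can differ in the number of ones per row, which changes both the gate count and the depth of the XOR tree; that is the kind of trade-off the abstract refers to.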
Abstract: The application of cellular neural networks (CNNs) to solving partial differential equations (PDEs) is investigated in this paper. Two kinds of PDEs, the heat conduction equation and Poisson's equation, are considered as typical examples. They can be computed in real time by using the CNN, whose hardware is implemented with integrated operational amplifiers (op amps). The experimental results show that the hardware performance agrees with that given by computer simulation. Therefore, the CNN is a powerful new tool for solving PDEs.
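The reason a cellular array suits the heat conduction equation is that each cell only needs its own state and those of its nearest neighbors, which is exactly what a finite-difference Laplacian requires. The sketch below is a discrete-time software analogue of that idea for the 1D heat equation with fixed boundary values. It is an assumption-laden illustration, not the authors' continuous-time op-amp circuit or their templates; alpha stands for D·dt/h² and must not exceed 0.5 for this explicit scheme to be stable.

```python
def heat_step(u: list, alpha: float) -> list:
    """One explicit update: each interior cell moves toward its neighbors' mean.

    alpha = D * dt / h^2 (assumed notation); the two end cells are held fixed.
    """
    interior = [
        u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        for i in range(1, len(u) - 1)
    ]
    return [u[0]] + interior + [u[-1]]


def relax(u: list, alpha: float, steps: int) -> list:
    """Apply heat_step repeatedly, as a cell array would evolve over time."""
    for _ in range(steps):
        u = heat_step(u, alpha)
    return u


if __name__ == "__main__":
    # Rod held at 0 on the left and 1 on the right, initially cold inside.
    profile = relax([0.0, 0.0, 0.0, 0.0, 1.0], alpha=0.25, steps=2000)
    print([round(v, 3) for v in profile])  # settles to a straight-line profile
```

The steady state of this iteration satisfies the discrete Laplace equation, a special case of Poisson's equation with zero source term, which hints at why the same cell structure can handle both problems named in the abstract.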