Abstract: The SubBytes (S-box) transformation is the most crucial operation in the AES algorithm and significantly affects the implementation performance of AES chips. To design a high-performance S-box, this paper proposes a segmented optimization of the S-box implementation based on composite-field inversion. The proposed S-box is modeled in Verilog and, once simulation confirms its correctness, synthesized with Design Compiler. The synthesis results show that, compared with several current S-box implementation schemes, the proposed implementation significantly reduces area overhead and critical path delay and thus achieves higher hardware efficiency. This provides strong support for realizing efficient and compact S-box ASIC designs.
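For orientation, the S-box that such designs implement can be written as a GF(2^8) inversion followed by an affine transformation. The sketch below is a plain software reference model, not the segmented composite-field circuit proposed in the paper. It inverts directly in GF(2^8) by exponentiation instead of mapping to a composite field, and the function names (gf_mul, gf_inv, sbox) are chosen here for illustration only.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:       # reduce as soon as the degree reaches 8
            a ^= 0x11B
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse computed as a^254; 0 maps to 0 by AES convention."""
    result, base, exp = 1, a, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exp >>= 1
    return result


def rotl8(x: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox(x: int) -> int:
    """AES SubBytes for one byte: field inversion, then the affine transform."""
    inv = gf_inv(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


if __name__ == "__main__":
    print(hex(sbox(0x53)))  # 0xed, the worked example in FIPS 197
```

A model like this is mainly useful as a golden reference when checking a hardware S-box in simulation; composite-field designs replace gf_inv with smaller subfield operations to cut area and delay.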
Funding: Supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under Grant No. 2219-1/2013.
Abstract: The recently developed single-reference second-order Brillouin-Wigner perturbation theory, which eliminates the size-extensivity error, has been generalized to a state-specific multi-reference (SS-MR) BWPT2, providing a size-extensive correction to the electron correlation problem for systems that demand a multi-reference function. Illustrative numerical tests of the size-extensivity corrections are made for widely used molecules in their ground states that have pronounced multi-reference character. We have implemented two-reference and three-reference cases for CH2, BH, and the bond-breaking process in the ground state of the HF molecule. The results are compared with rigorously size-extensive methods such as Møller-Plesset perturbation theory (MP2), full configuration interaction (Full-CI), and allied methods using the same basis sets.
Abstract: This paper discusses the surface distribution and seasonal variation of alkalinity and specific alkalinity in the Kuroshio area of the East China Sea and their application to water mass tracing. The results show a distinct seasonal variation of alkalinity, which is related to vertical mixing. Specific alkalinity differs among the various water masses. Based on the differences in specific alkalinity and the distribution of alkalinity, two water fronts were found in summer: front (Ⅰ) at 27°-30°N, 124°-127°E, and front (Ⅱ) in the northern waters about one degree of latitude from Taiwan Island. One front was found in winter, about one degree of longitude from the coast of mainland China at 26°-30°N. In summer, a comparison of the May and August data shows that front (Ⅰ) shifts eastward by about 1-2 degrees of longitude. The high alkalinity of the northern East China Sea in summer may be caused by Huanghe River runoff flowing southward with the Huanghai Sea Coastal Current.
Funding: Supported by the Industrial Internet Innovation and Development Project of the Ministry of Industry and Information Technology (No. GHBJ2004).
Abstract: A Taylor series expansion (TSE) based design for minimum mean-square error (MMSE) detection and QR decomposition (QRD) in multi-input multi-output (MIMO) systems is proposed on an application-specific instruction set processor (ASIP). It uses a TSE algorithm in place of resource-consuming reciprocal and reciprocal square root (RSR) operations. The aim is a high-performance implementation of both MMSE and QRD on one programmable platform. Furthermore, the instruction set architecture (ISA) and the allocation of data paths in a single instruction multiple data, very long instruction word (SIMD-VLIW) architecture are provided, offering more data parallelism and instruction parallelism for matrices of different dimensions and for different operation types. Multiple levels of numerical precision can be achieved through a flexible table size and expansion order in the TSE ISA. The ASIP has been implemented in a 28 nm CMOS process and reaches a frequency of 800 MHz. Experimental results show that the proposed design provides perfect numerical precision within the fixed bit-width of the ASIP, a matrix processing rate that exceeds the requirements of 5G systems, and rate-area efficiency comparable with ASIC implementations.
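The idea of replacing a reciprocal with a table lookup plus a short Taylor series can be sketched in a few lines. The code below is a floating-point illustration under assumed conventions (inputs normalized to [1, 2), table entries at segment midpoints). The function names and the table layout are hypothetical; the abstract does not describe the ASIP's fixed-point format or its actual TSE instructions.

```python
def make_reciprocal_table(bits: int) -> list[float]:
    """Seed values 1/x0 at the midpoints x0 of 2**bits segments of [1, 2)."""
    n = 1 << bits
    return [1.0 / (1.0 + (i + 0.5) / n) for i in range(n)]


def tse_reciprocal(x: float, table: list[float], order: int) -> float:
    """Approximate 1/x for x in [1, 2) without a division at run time.

    Uses 1/x = (1/x0) * sum_k (-d)^k with d = (x - x0)/x0, truncated
    after the term of degree `order`.
    """
    n = len(table)
    idx = int((x - 1.0) * n)      # which table segment x falls in
    x0 = 1.0 + (idx + 0.5) / n    # expansion point (segment midpoint)
    r0 = table[idx]               # looked-up seed 1/x0
    d = (x - x0) * r0
    acc, term = 0.0, 1.0
    for _ in range(order + 1):
        acc += term
        term *= -d
    return r0 * acc


if __name__ == "__main__":
    table = make_reciprocal_table(bits=4)
    for x in (1.0, 1.3, 1.97):
        approx = tse_reciprocal(x, table, order=2)
        print(x, approx, abs(approx - 1.0 / x))
```

Table size and expansion order trade off against each other: a larger table shrinks d, so fewer series terms are needed for the same precision, which matches the abstract's remark that precision is tuned through both parameters.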
Abstract: An application-specific integrated circuit (ASIC) design of a 1024-point floating-point fast Fourier transform (FFT) processor is presented. It meets the requirement for high-accuracy FFT results in related fields. Several novel design techniques for the floating-point adder and multiplier are introduced in detail to increase the speed of the system while also decreasing power consumption. The hardware area is effectively reduced by an improved butterfly processor. A pipelined architecture substantially increases the performance of the design, and its regularity makes very-large-scale integration (VLSI) implementation straightforward. Validation results on a field programmable gate array (FPGA) are shown at the end. With the system clock set to 50 MHz, one FFT computation takes 204.8 μs.
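The reported timing can be turned into a cycle count with simple arithmetic. The snippet below uses only the three figures stated in the abstract; the comparison with the number of radix-2 stages is an observation added here, not a claim from the paper about its architecture.

```python
import math

clock_hz = 50e6        # system clock stated in the abstract
fft_time_s = 204.8e-6  # time for one 1024-point FFT stated in the abstract
points = 1024

cycles = round(clock_hz * fft_time_s)  # total clock cycles per transform
cycles_per_point = cycles / points     # average cycles spent per input point
stages = int(math.log2(points))        # stage count of a radix-2 FFT, for comparison

print(cycles, cycles_per_point, stages)
```

The transform therefore takes 10240 cycles, or 10 cycles per point, which equals the number of stages a radix-2 FFT of this size would have.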
Abstract: This paper provides some insights into the application of the Field Programmable Gate Array (FPGA) in process tomography. It investigates the performance of the technology in various tomography systems and compares it with similar technologies, including the Application Specific Integrated Circuit (ASIC), the Graphics Processing Unit (GPU), and the microcontroller. The FPGA is used primarily in the Data Acquisition System (DAQ) because of its better performance and better trade-offs compared with competing technologies. Its drawback is that it is relatively more expensive.
Funding: Supported by the National Natural Science Foundation of China (No. 60236020) and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20050003083).
Abstract: The rapid development of multimedia techniques has increased the demands on multimedia processors. This paper presents a new method for quickly designing high-performance processors for new multimedia applications. In this approach, a configurable processor based on the very long instruction word (VLIW) architecture serves as the basic core, which designers can easily configure into new processor cores for multimedia algorithms. Specific instructions designed for multimedia applications efficiently improve the performance of the target processor. Functions not implemented in the digital signal processor (DSP) core can be easily integrated into the target processor as user-defined hardware to increase performance. Several examples based on the architecture are given. The results show that processor performance improves approximately 4 times on the H.263 codec and that the processor outperforms both DSPs and single instruction multiple data (SIMD) multimedia extension architectures by up to 8 times when computing the 2-D IDCT.
Funding: Supported by the National Natural Science Foundation of China (60676053).
Abstract: A low-power, low-cost advanced encryption standard (AES) coprocessor is proposed for Zigbee system-on-a-chip (SoC) design. The cost and power consumption of the proposed AES coprocessor are reduced considerably by optimizing the architectures of SubBytes/InvSubBytes and MixColumns/InvMixColumns, by integrating the encryption and decryption procedures through resource sharing, and by using a hierarchical power management strategy based on finite state machine (FSM) and clock gating (CG) technologies. Implemented in SMIC 0.18 μm complementary metal oxide semiconductor (CMOS) technology, the AES coprocessor occupies only about 10.5 k gates, consumes 69.1 μW/MHz, and achieves a throughput of 32 Mb/s, which is sufficient for a Zigbee system. Compared with other designs, the proposed architecture consumes less power and fewer hardware resources, which suits Zigbee systems and other portable devices.
Funding: Supported by the U.S. National Science Foundation (PHY-1417326, PHY-1719914) and the National Natural Science Foundation of China (11465018).
Abstract: As part of a recent analysis of exclusive two-photon production of W^+W^- pairs at the LHC, the CMS experiment used di-lepton data to obtain an "effective" photon-photon luminosity. We show how the CMS analysis of their 8 TeV data, along with some assumptions about the likelihood that events in which the proton breaks up pass the selection criteria, can be used to significantly constrain the photon parton distribution functions, such as those from the CTEQ, MRST, and NNPDF collaborations. We compare the data with predictions using these photon distributions, as well as the new LUXqed photon distribution. We study the impact of including these data in the NNPDF2.3QED, NNPDF3.0QED, and CT14QEDinc fits. We find that these data provide a useful and complementary cross-check on the photon distribution, which is consistent with the LUXqed prediction while suggesting that the NNPDF photon error band should be significantly reduced. Additionally, we propose a simple model for describing the two-photon production of W^+W^- at the LHC. Using this model, we constrain the number of inelastic photons that remain after the experimental cuts are applied.