Funding: The Academic Colleges and Universities Innovation Program 2.0 (No. BP0719013).
Abstract: Convolutional layers are computationally demanding, and the limited resources of inference hardware make such networks difficult to deploy. To address this, a look-up table (LUT)-based convolution architecture is built on a field-programmable gate array (FPGA) using integer multipliers and addition trees. The Winograd algorithm is used to optimize convolution and multiplication and so reduce computational complexity. The LUT-based operator is further optimized to construct a processing element (PE). Storage streams are optimized at the same time to improve memory access efficiency and ease bandwidth constraints, and the data toggle rate is reduced to lower power consumption. The experimental results show that building the basic processing units with the Winograd algorithm significantly reduces the number of multipliers and accelerates hardware deployment, while time-division multiplexing of the processing units improves resource utilization. Under the experimental conditions, compared with the traditional convolution method, the architecture optimizes computing resources by 2.25 times and improves peak throughput by 19.3 times. The LUT-based Winograd accelerator can effectively solve the deployment problem caused by limited hardware resources.
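The multiplier saving that the Winograd algorithm provides can be illustrated with the smallest standard case, F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. The sketch below is a plain software illustration under that assumption, not the accelerator's LUT-based design; the function names are hypothetical. The same scheme nested in two dimensions, F(2x2, 3x3), needs 16 multiplications instead of 36, a ratio of 2.25, which coincides with the figure quoted in the abstract, although the abstract does not say how that figure was obtained.

```python
def direct_conv_1d(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """The same two outputs via Winograd F(2,3): 4 multiplications."""
    # Filter transform: depends only on the filter, so hardware can
    # precompute it once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Input transform: additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]
    # The only four general multiplications.
    m0, m1, m2, m3 = v0 * u0, v1 * u1, v2 * u2, v3 * u3
    # Output transform: additions and subtractions only.
    return [m0 + m1 + m2, m1 - m2 - m3]
```

Both functions return the same pair of outputs for the same 4-element input tile and 3-tap filter; the halving in the filter transform means the Winograd version returns floats.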
Funding: Supported by the National Natural Science Foundation of China under contract No. 40506036, the High Tech Research and Development "863" Program of China under contract No. 2003AA131160-04, and the Science and Technology Plan of Zhejiang Province of China under contract Nos. 2004E60054 and 2004C13027.
Abstract: The exact Rayleigh scattering calculation currently used in ocean color remote sensing relies on a look-up table (LUT) that is usually created for a specific remote sensor and cannot be applied to other sensors. For practical application, a general-purpose Rayleigh scattering LUT that can be applied to all ocean color remote sensors is generated. An adding-doubling method for solving the vector radiative transfer equation in a plane-parallel atmosphere is derived in detail. Comparison with the exact Rayleigh scattering radiance derived from the MODIS exact Rayleigh scattering LUT shows that the relative error of the adding-doubling calculation is less than 0.25%, which meets the accuracy required for atmospheric correction of ocean color remote sensing. The adding-doubling method can therefore be used to generate the exact Rayleigh scattering LUT for ocean color remote sensors. Finally, the general-purpose exact Rayleigh scattering LUT is generated using the adding-doubling method. The Rayleigh scattering radiance calculated from the general-purpose LUT is tested against the LUTs of MODIS, SeaWiFS and other ocean color sensors; the relative errors are all less than 0.5%, so the general-purpose LUT can be applied to all ocean color remote sensors.
Funding: Financially supported by the National Natural Science Foundation of China (11202081, 11272124, and 11472109) and the State Key Lab of Subtropical Building Science, South China University of Technology (2014ZC17).
Abstract: The recently proposed global look-up table strategy has proven to be an efficient way to accelerate interpolation, the most time-consuming part of iterative sub-pixel digital image correlation (DIC) algorithms. In this paper, a global look-up table strategy with cubic B-spline interpolation is developed for the DIC method based on the inverse compositional Gauss-Newton (IC-GN) algorithm. The performance of this strategy, including accuracy, precision, and computation efficiency, is evaluated through a theoretical and experimental study, using the strategy with the widely employed bicubic interpolation as a benchmark. Compared with the bicubic variant, the global look-up table strategy with cubic B-spline interpolation significantly improves the accuracy of the IC-GN algorithm-based DIC method, at a trivial cost in computation efficiency.
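The idea of a global look-up table for interpolation can be sketched in one dimension: the polynomial coefficients of every sample interval are computed once, before any iteration, so that each later sub-sample query costs only a table read and a polynomial evaluation. The sketch below is an illustration under stated assumptions, not the paper's method: it works on a 1D signal rather than a 2D image, and it swaps in Catmull-Rom cubic coefficients (the 1D form of the common cubic convolution kernel) instead of cubic B-splines, which would also require a prefiltering step. All names are hypothetical.

```python
def build_coefficient_table(samples):
    """Precompute cubic coefficients for every interior interval [i, i+1].

    Interval i uses the neighbours samples[i-1 .. i+2], so the table covers
    i = 1 .. len(samples) - 3 (assumption: no boundary padding).
    """
    table = {}
    for i in range(1, len(samples) - 2):
        p0, p1, p2, p3 = samples[i - 1 : i + 3]
        table[i] = (
            p1,
            0.5 * (p2 - p0),
            p0 - 2.5 * p1 + 2.0 * p2 - 0.5 * p3,
            -0.5 * p0 + 1.5 * p1 - 1.5 * p2 + 0.5 * p3,
        )
    return table


def interpolate(table, x):
    """Evaluate the signal at x using only the precomputed coefficients."""
    i = int(x)  # floor, valid because x is assumed non-negative
    t = x - i   # fractional offset inside interval i
    a0, a1, a2, a3 = table[i]
    # Horner evaluation of a0 + a1*t + a2*t^2 + a3*t^3.
    return a0 + t * (a1 + t * (a2 + t * a3))
```

The table is built once per signal; `interpolate` accepts positions from 1 up to, but not including, `len(samples) - 2`, and raises `KeyError` outside that range. An iterative solver that queries the same signal thousands of times then pays the coefficient cost only once, which is the saving the strategy targets.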
Funding: Supported by the National Natural Science Foundation of China (41175037).
Abstract: To retrieve atmospheric CO2 precisely, a retrieval method based on both near infrared (NIR) and thermal infrared (TIR) bands is first established. A look-up-table (LUT)-based fast line-by-line radiative transfer model (RTM) is then integrated into the retrieval procedure to accelerate radiative transfer calculations. The LUT stores gas absorption cross-sections as a function of temperature, pressure and wavenumber, and can greatly reduce the calculation time of radiative transfer compared with the direct line-by-line method. Retrieval is then simulated using the NIR band, the TIR band, and both bands. The retrieved CO2 profiles suggest that the joint approach reconstructs the CO2 profile better than retrievals using NIR or TIR alone. Joint retrieval using both bands simultaneously could provide a better constraint on the CO2 vertical distribution throughout the troposphere.
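A table of cross-sections indexed by temperature, pressure and wavenumber can be queried as sketched below. This is a minimal illustration under assumptions the abstract does not state: the table is a nested list `lut[i][j][k]` on fixed temperature and pressure grids, the wavenumber is addressed by its grid index `k`, and values between grid points are obtained by bilinear interpolation in temperature and pressure. The function and parameter names are hypothetical.

```python
from bisect import bisect_right


def bracket(grid, value):
    """Return (i, w) with value == grid[i] * (1 - w) + grid[i + 1] * w.

    The index is clamped to the grid, so values outside the grid are
    linearly extrapolated from the nearest interval.
    """
    i = min(max(bisect_right(grid, value) - 1, 0), len(grid) - 2)
    w = (value - grid[i]) / (grid[i + 1] - grid[i])
    return i, w


def lookup_cross_section(lut, temps, pressures, t, p, k):
    """Bilinear interpolation in temperature and pressure at wavenumber index k."""
    i, wt = bracket(temps, t)
    j, wp = bracket(pressures, p)
    # Interpolate along pressure at the two bracketing temperatures...
    low = lut[i][j][k] * (1 - wp) + lut[i][j + 1][k] * wp
    high = lut[i + 1][j][k] * (1 - wp) + lut[i + 1][j + 1][k] * wp
    # ...then along temperature.
    return low * (1 - wt) + high * wt
```

Each query reads four table entries and performs a handful of multiplications, which is where the time saving over recomputing line shapes for every atmospheric layer would come from. The grids `temps` and `pressures` must be sorted in ascending order.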
Abstract: DSP operations in biomedical therapeutic hardware need to be performed with high accuracy and at high speed. Portable DSP hardware such as pulse/heartbeat detectors must operate at reduced power because conventional power sources are not available. This work proposes a hybrid biomedical hardware chip in which the speed and power utilization factors are greatly improved. Multipliers are the core operational unit of any DSP SoC. This work proposes a LUT-based unsigned multiplication that is shown to be efficient in terms of operating speed. For n bit input multiplication, an n*n memory array of 2 n bit size is required to store all the possible input and output combinations. Various works in the literature claim to achieve high-speed multiplication with reduced LUT size by integrating a barrel shifter mechanism. This paper addresses this problem by reworking the multiplier architecture with a parallel pre-processing unit, which changes the order of the multiplier and multiplicand according to the number of computational addition and subtraction stages required. Along with the LUT multiplier, a low-power bus encoding scheme is integrated to limit the power consumed by the on-chip DSP unit. This paper addresses both speed and power optimization techniques, which are tested on various FPGA device families.
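The table-size problem mentioned in the abstract can be made concrete with a small sketch. A full look-up table for two n-bit unsigned operands needs 2^n x 2^n entries, each 2n bits wide, so it grows quickly with n. One generic way to shrink it, shown below, is to split each operand into halves and combine four small look-ups with shifts and additions. This is an illustrative software model under that assumption; it is not the pre-processing or barrel-shifter architecture the paper proposes, and the names are hypothetical.

```python
N_BITS = 4
MASK = (1 << N_BITS) - 1

# Full table for 4-bit x 4-bit unsigned products: 16 * 16 = 256 entries,
# each fitting in 8 bits.
PRODUCT_LUT = [[a * b for b in range(1 << N_BITS)] for a in range(1 << N_BITS)]


def lut_multiply_4bit(a, b):
    """Multiply two 4-bit unsigned values with a single table read."""
    return PRODUCT_LUT[a][b]


def lut_multiply_8bit(a, b):
    """Multiply two 8-bit unsigned values using four 4-bit table reads."""
    a_hi, a_lo = a >> N_BITS, a & MASK
    b_hi, b_lo = b >> N_BITS, b & MASK
    # (a_hi*16 + a_lo) * (b_hi*16 + b_lo), expanded into partial products.
    return (
        (PRODUCT_LUT[a_hi][b_hi] << (2 * N_BITS))
        + ((PRODUCT_LUT[a_hi][b_lo] + PRODUCT_LUT[a_lo][b_hi]) << N_BITS)
        + PRODUCT_LUT[a_lo][b_lo]
    )
```

A direct 8-bit table would hold 65,536 entries; the split version reuses the 256-entry table at the cost of three additions and two shifts per product, which is the kind of size-versus-stages trade-off the abstract refers to.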
Abstract: The widespread application of new technologies, while empowering women with new opportunities, might also put them at a disadvantage. For example, the application of AI might be more likely to cost women their jobs than men. Meanwhile, women are missing out on the opportunity to participate in the policy-making process; they are absent from the table. If current policies do not change, we will miss the goal of achieving gender equality, the fifth of the 17 Sustainable Development Goals set for the UN's 2030 Agenda for Sustainable Development, warned elite women scientists with the Organization for Women in Science for the Developing World (OWSD). What should be done now? How can we make a difference? They are taking action to help.