Funding: Supported in part by the Major Program of the Ministry of Science and Technology of China under Grant 2019YFB2205102, and in part by the National Natural Science Foundation of China under Grants 61974164, 62074166, 61804181, 62004219, 62004220, and 62104256.
Abstract: With the continuous development of deep learning, the Deep Convolutional Neural Network (DCNN) has attracted wide attention in industry because of its high accuracy in image classification. Compared with other DCNN hardware deployment platforms, the Field Programmable Gate Array (FPGA) offers programmability, low power consumption, parallelism, and low cost. However, the enormous computational load of a DCNN and the limited logic capacity of an FPGA restrict the energy efficiency of a DCNN accelerator. The traditional sequential sliding window method can improve accelerator throughput through data multiplexing, but its data multiplexing rate is low because it repeatedly reads the data shared between rows. This paper proposes a fast data readout strategy based on a circular sliding window data reading method, which improves the multiplexing rate of data between rows by optimizing the memory access order of the input data. In addition, the multiplication bit width of the DCNN accelerator is much smaller than that of the Digital Signal Processing (DSP) blocks on the FPGA, so resources are wasted if each multiplication occupies a whole DSP block. A multiplier sharing strategy is therefore proposed: the multiplier of the accelerator is customized so that a single DSP block can complete multiple groups of 4-, 6-, and 8-bit signed multiplications in parallel. Finally, based on these two strategies, an FPGA-optimized accelerator is proposed. The accelerator is written in Verilog and deployed on a Xilinx VCU118. When the accelerator classifies the CIFAR-10 dataset, its energy efficiency is 39.98 GOPS/W, a 1.73× improvement over previous DCNN FPGA accelerators. When it classifies the ImageNet dataset, its energy efficiency is 41.12 GOPS/W, which is 1.28× to 3.14× that of other accelerators.
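The multiplier sharing idea can be illustrated with a small software model. The abstract does not give the packing scheme, so the sketch below shows one well-known approach, not necessarily the authors' design: two signed operands `a` and `b` that share a common factor `c` are packed into one wide word, multiplied once, and the two products are separated afterwards. The function name `packed_multiply` and the choice of a field width of twice the operand width are assumptions made for illustration.

```python
def packed_multiply(a, b, c, bits=8):
    """Compute a*c and b*c with a single wide multiplication.

    a, b, c are signed integers that fit in `bits` bits.
    Assumption: each product is given a field of 2*bits bits, which is
    always wide enough for the product of two signed `bits`-bit values.
    """
    lo_limit, hi_limit = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for value in (a, b, c):
        if not lo_limit <= value <= hi_limit:
            raise ValueError("operand does not fit in the given bit width")

    shift = 2 * bits
    packed = (a << shift) + b          # one wide operand holding a and b
    product = packed * c               # the single shared multiplication

    # The low field holds b*c as a two's-complement number.
    low = product & ((1 << shift) - 1)
    if low >= 1 << (shift - 1):
        low -= 1 << shift

    # Removing the low product leaves a*c in the high field.
    high = (product - low) >> shift
    return high, low


if __name__ == "__main__":
    print(packed_multiply(-3, 5, -7))  # a*c and b*c from one multiplication
```

In hardware the same correction step (subtracting the sign-extended low field before taking the high field) is what keeps a negative low product from corrupting the high product; the wide operand must also fit within the DSP block's input ports.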
Funding: Supported by the National Natural Science Foundation of China (11005107) and the Independent Projects of the State Key Laboratory of Particle Detection and Electronics (201301).
Abstract: With increasing physical event rates and numbers of electronic channels, traditional readout schemes struggle to improve readout speed because of the limited bandwidth of the crate backplane. In this paper, a high-speed data readout method based on Ethernet is presented that makes each readout module capable of transmitting data to the DAQ. Explicitly parallel data transmission and a distributed network architecture allow the readout system to adapt to the varying requirements of particle physics experiments. Furthermore, to guarantee readout performance and flexibility, a standalone embedded CPU system is used for network protocol stack processing. To receive the customized data format and protocol from the front-end electronics, a field programmable gate array (FPGA) is used for logic reconfiguration. To optimize the interface and improve the data throughput between the CPU and the FPGA, a method based on SRAM is presented. To evaluate this high-speed readout method, a simplified readout module was designed and implemented. Test results show that this module supports up to 70 Mbps of data throughput from the readout module to the DAQ.
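When each readout module sends event data to the DAQ over a network byte stream, the receiver needs some way to find event boundaries. The abstract does not describe the data format, so the following is only a minimal sketch of length-prefixed framing under an invented layout: a 32-bit event number followed by a 16-bit payload length, both in network byte order. The names `HEADER`, `pack_event`, and `unpack_events` are hypothetical.

```python
import struct

# Hypothetical header: 32-bit event number, 16-bit payload length,
# network (big-endian) byte order, no padding.
HEADER = struct.Struct("!IH")


def pack_event(event_id, payload):
    """Prefix one event payload with its header."""
    return HEADER.pack(event_id, len(payload)) + payload


def unpack_events(stream):
    """Split a received byte stream into complete events.

    Returns (events, remainder), where remainder holds the bytes of an
    incomplete trailing frame to be kept until more data arrives.
    """
    events = []
    offset = 0
    while offset + HEADER.size <= len(stream):
        event_id, length = HEADER.unpack_from(stream, offset)
        start = offset + HEADER.size
        if start + length > len(stream):
            break  # payload not fully received yet
        events.append((event_id, stream[start:start + length]))
        offset = start + length
    return events, stream[offset:]
```

A DAQ-side receiver would append each newly received chunk to the remainder and call `unpack_events` again, so that frames split across network packets are reassembled.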
Abstract: Polychromatic reconstruction (PCR) is a novel nondestructive readout method that uses a spectrally broad light source for the probe beam. The stored image can be completely reconstructed even when the probe wavelength is very different from the recording wavelength. The large spectral width of the polychromatic probe beam has an adverse effect on the storage density, but this can be overcome by inserting an additional optical component in the imaging system. PCR therefore has great potential to achieve nondestructive readout and large storage density simultaneously. In addition, PCR enables the design of a simple memory system because its readout tolerance is quite large compared with conventional monochromatic readout. The unique and attractive features of the polychromatic reconstruction method were demonstrated theoretically and experimentally.
Funding: Supported by the National Key Scientific Instrument and Equipment Development Projects of the National Natural Science Foundation of China (No. 41327802) and the Fundamental Research Funds for the Central Universities (WK2030040066).
Abstract: Readout electronics is developed for a prototype spectrometer for in situ measurement of low-energy ions of 30 eV/e to 20 keV/e in the solar wind plasma. A low-noise preamplifier/discriminator (A111F) is employed for each channel to process the signal from micro-channel plate (MCP) detectors. A high-voltage (HV) supply solution based on an HV module and an HV optocoupler is adopted to generate a fast sweeping HV and a fixed HV. Because of the limited telemetry bandwidth in space communication, an algorithm is implemented in an FPGA (field programmable gate array) to compress the raw data. Test results show that the electronics achieves a 1 MHz event rate and a large input dynamic range of 95 pC. A slew rate of 0.8 V/μs and an integral nonlinearity of 0.7 LSB are obtained for the sweeping HV, and a precision better than 0.8% for the fixed HV. A vacuum beam test shows an energy resolution of 12 ± 0.7% full width at half maximum (FWHM), with noise counts below 10 per second, indicating that the performance meets the physical requirement.
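The energy resolution above is quoted as a full width at half maximum. For readers who work with standard deviations instead, the short sketch below converts between the two, assuming a Gaussian peak shape (an assumption; the abstract does not state the peak shape). For a Gaussian, FWHM = 2·sqrt(2·ln 2)·σ, roughly 2.355 σ. The function names are illustrative.

```python
import math

# Conversion factor for a Gaussian peak: FWHM = 2 * sqrt(2 * ln 2) * sigma.
GAUSSIAN_FWHM_FACTOR = 2.0 * math.sqrt(2.0 * math.log(2.0))


def fwhm_from_sigma(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return GAUSSIAN_FWHM_FACTOR * sigma


def sigma_from_fwhm(fwhm):
    """Standard deviation of a Gaussian with the given FWHM."""
    return fwhm / GAUSSIAN_FWHM_FACTOR


if __name__ == "__main__":
    # Relative resolution of 12% FWHM expressed as a relative sigma.
    print(sigma_from_fwhm(0.12))
```

Under the Gaussian assumption, a 12% FWHM resolution corresponds to a relative standard deviation of about 5.1%.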
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61178059 and 61137002) and the Key Program of the Science and Technology Commission of Shanghai Municipality, China (Grant No. 11jc1413300).
Abstract: Four different states of Si15Sb85 and Ge2Sb2Te5 phase change memory thin films are obtained by modulating the degree of crystallization through laser initialization at different powers or annealing at different temperatures. The polarization characteristics of these two four-level phase change recording media are analyzed systematically. A simple and effective readout scheme is then proposed, and the readout signal is numerically simulated. The results show that a high-contrast polarization readout can be obtained over a wide wavelength range for four-level phase change recording media made from common phase change materials. This study aids in-depth understanding of the physical mechanisms and provides technical approaches to multilevel phase change recording.