Funding: This work was supported by the National Natural Science Foundation of China (Nos. 62034006, 92264201, and 91964105), the Natural Science Foundation of Shandong Province (Nos. ZR2020JQ28 and ZR2020KF016), and the Program of Qilu Young Scholars of Shandong University.
Abstract: With the rapid development of machine learning, the demand for high-efficiency computing is becoming increasingly urgent. To break the bottleneck of the traditional von Neumann architecture, computing-in-memory (CIM) has attracted increasing attention in recent years. In this work, to provide a feasible CIM solution for large-scale neural networks (NNs) that require continuous weight updating during online training, a flash-based CIM scheme with high endurance (10^9 cycles) and ultrafast programming speed is investigated. On the one hand, the proposed programming scheme, which combines channel hot electron injection (CHEI) and hot hole injection (HHI), demonstrates high linearity and symmetric potentiation and depression, which help to improve training speed and accuracy. On the other hand, the low-damage programming scheme and memory window (MW) optimizations effectively suppress cell degradation and improve computing accuracy. Even after 10^9 cycles, the leakage current (I_off) of the cells remains below 10 pA, ensuring the large-scale computing capability of the memory. Read disturb is further characterized to demonstrate robust reliability. On CIFAR-10 tasks, ~90% accuracy is achieved after 10^9 cycles with both ResNet50 and VGG16. These results suggest that flash-based CIM has great potential to overcome the limitations of traditional von Neumann architectures and enable high-performance online NN training, paving the way for further development of artificial intelligence (AI) accelerators.
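To illustrate why a low off-state leakage matters for large arrays, the following is a minimal sketch of current summation on a single bitline, assuming a simplified model in which each selected cell contributes its on-current and each unselected cell contributes I_off. The array size (`N_CELLS`) and on-current (`I_ON`) are hypothetical values chosen for illustration, not figures from the paper; only the 10 pA leakage bound comes from the abstract.

```python
def bitline_current(inputs, cell_currents, i_off):
    """Sum the currents of all cells sharing one bitline.

    inputs        : 0/1 word-line activations, one per cell
    cell_currents : on-current of each cell in amperes (encodes the weight)
    i_off         : leakage of an unselected cell in amperes
    """
    total = 0.0
    for x, i_on in zip(inputs, cell_currents):
        total += i_on if x else i_off
    return total

N_CELLS = 1024   # hypothetical number of cells per bitline
I_ON = 1e-6      # hypothetical on-current of a selected cell (1 uA)
I_OFF = 10e-12   # upper bound on leakage stated in the abstract (10 pA)

# Worst case for leakage: one selected cell, all others unselected.
inputs = [1] + [0] * (N_CELLS - 1)
currents = [I_ON] * N_CELLS

ideal = I_ON                                   # signal with zero leakage
actual = bitline_current(inputs, currents, I_OFF)
leak_error = (actual - ideal) / ideal          # relative error from leakage

print(f"relative leakage error: {leak_error:.2%}")
```

Under these assumed numbers the accumulated leakage stays near 1% of a single cell's signal. In this simple model the error scales linearly with both I_off and the number of cells per bitline.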
Funding: Supported by the National Natural Science Foundation of China (Nos. 62034006, 91964105, and 61874068), the China Key Research and Development Program (No. 2016YFA0201802), the Natural Science Foundation of Shandong Province (No. ZR2020JQ28), and the Program of Qilu Young Scholars of Shandong University.
Abstract: The "memory wall" of traditional von Neumann computing systems severely restricts the efficiency of data-intensive tasks, and in-memory computing (IMC) architecture is a promising approach to breaking this bottleneck. Although variations and instability in ultra-scaled memory cells seriously degrade calculation accuracy in IMC architectures, stochastic computing (SC) can compensate for these shortcomings because of its low sensitivity to cell disturbances. Furthermore, massively parallel computation can be performed to improve the speed and efficiency of the system. In this paper, by designing logic functions in NOR flash arrays, SC in IMC is realized for image edge detection, demonstrating ultra-low computational complexity and power consumption (25.5 fJ/pixel at a 2-bit sequence length). More impressively, the noise immunity is 6 times higher than that of the traditional binary method, showing good tolerance to cell variation and reliability degradation when massively parallel computation is implemented in the array.
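The low sensitivity of SC to bit errors can be seen in a small software sketch of textbook unipolar stochastic computing. It is not the NOR flash implementation described in the abstract, and the helper names below are hypothetical. A value in [0, 1] is encoded as the fraction of 1s in a bitstream. A bitwise AND of independent streams approximates multiplication, and a bitwise XOR of streams generated from the same random numbers approximates the absolute difference, which is the basic operation behind gradient-style edge detectors.

```python
import random

def encode(p, n, rng):
    """Unipolar stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def encode_correlated(pa, pb, n, rng):
    """Two streams driven by the same random numbers (maximally correlated)."""
    rs = [rng.random() for _ in range(n)]
    return [int(r < pa) for r in rs], [int(r < pb) for r in rs]

def decode(bits):
    """Recover the encoded value as the fraction of 1s."""
    return sum(bits) / len(bits)

def sc_multiply(a, b):
    """AND of independent streams approximates pa * pb."""
    return [x & y for x, y in zip(a, b)]

def sc_abs_diff(a, b):
    """XOR of correlated streams approximates |pa - pb|."""
    return [x ^ y for x, y in zip(a, b)]

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 4096                 # stream length (illustrative choice)

a = encode(0.5, n, rng)
b = encode(0.4, n, rng)
product = decode(sc_multiply(a, b))       # approximates 0.5 * 0.4

ca, cb = encode_correlated(0.7, 0.2, n, rng)
diff = decode(sc_abs_diff(ca, cb))        # approximates |0.7 - 0.2|

# A single flipped bit shifts the decoded value by only 1/n.
flipped = a.copy()
flipped[0] ^= 1
flip_error = abs(decode(flipped) - decode(a))

print(product, diff, flip_error)
```

Every bit in a stochastic stream carries the same weight, so one disturbed cell changes the result by 1/n, whereas a flipped most-significant bit in a binary word changes it by half of full scale.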
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62034006, 92264201, and 62104134) and the Natural Science Foundation of Shandong Province of China (Grant Nos. ZR2023QF076 and ZR2023QF054).
Abstract: Cold-source field-effect transistors (CS-FETs) have been developed to overcome the major challenge of power dissipation in modern integrated circuits. Cold metals suitable for n-type CS-FETs have been proposed as ideal electrodes to filter high-energy electrons and break the thermal limit on subthreshold swing (SS). In this work, for p-type CS-FETs, we propose TcX2 and ReX2 (X = S, Se) as the injection source to realize sub-thermal switching for holes. First-principles calculations unveil the cold-metal characteristics of monolayer TcX2 and ReX2, which possess a sub-gap below the Fermi level and a density of states (DOS) that decreases with energy. Quantum device simulations demonstrate that TcX2 and ReX2 can enable cold-source effects in WSe2 p-type FETs, achieving a steep SS of 29-38 mV/dec and high on/off ratios of (2.3-5.6)×10^7. Moreover, multilayer ReS2 retains the cold-metal characteristic, thus ensuring CS-FET performance similar to that of the monolayer source. This work underlines the significance of cold metals for the design of p-type CS-FETs.
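For context, the thermal limit on SS for a conventional FET is (k_B·T/q)·ln 10. The short calculation below evaluates it at an assumed temperature of 300 K (the abstract does not state the simulation temperature) and compares it with the reported 29-38 mV/dec range. The function name is hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K (exact SI value)
Q = 1.602176634e-19   # elementary charge in C (exact SI value)

def thermal_ss_limit_mv_per_dec(temperature_k):
    """Thermionic lower bound on subthreshold swing, in mV/decade."""
    return (K_B * temperature_k / Q) * math.log(10) * 1e3

limit = thermal_ss_limit_mv_per_dec(300.0)   # assumed room temperature
reported = (29.0, 38.0)                      # SS range from the abstract

print(f"thermal limit at 300 K: {limit:.1f} mV/dec")
for ss in reported:
    print(f"reported {ss:.0f} mV/dec is {ss / limit:.0%} of the limit")
```

At 300 K the limit is just under 60 mV/dec, so the reported values are roughly half to two-thirds of the thermionic bound.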