Funding: the National Key Research and Development Plan of MOST of China (2019YFB2205100, 2016YFA0203800); the National Natural Science Foundation of China (Nos. 61874164, 61841404, 51732003, 61674061); Hubei Engineering Research Center on Microelectronics.
Abstract: Memristors are becoming a prominent candidate to serve as the building blocks of non-von Neumann in-memory computing architectures. By mapping analog numerical matrices into memristor crossbar arrays, efficient multiply-accumulate operations can be performed in a massively parallel fashion using the physical mechanisms of Ohm's law and Kirchhoff's law. In this brief review, we present the recent progress in two niche applications: neural network accelerators and numerical computing units, mainly focusing on the advances in hardware demonstrations. The former is regarded as soft computing since it can tolerate some degree of device and array imperfections. The acceleration of multilayer perceptrons, convolutional neural networks, generative adversarial networks, and long short-term memory neural networks is described. The latter is hard computing because solving numerical problems requires high-precision devices. Several breakthroughs in memristive equation solvers with improved computation accuracies are highlighted. In addition, other nonvolatile devices with the capability of analog computing are briefly introduced. Finally, we conclude the review with discussions on the challenges and opportunities for future research toward realizing memristive analog computing machines.
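The crossbar multiply-accumulate operation mentioned in the abstract can be illustrated with a minimal sketch. It assumes an ideal array (no wire resistance, no device variation, perfectly linear devices), which real hardware does not provide; the function name `crossbar_mac` and the example values are hypothetical and chosen only for illustration. Each device contributes a current G[i][j] * V[i] (Ohm's law), and the currents on each column wire add up (Kirchhoff's current law), so the column currents form a vector-matrix product.

```python
def crossbar_mac(conductances, voltages):
    """Column currents of an ideal crossbar: I[j] = sum_i G[i][j] * V[i].

    conductances: list of rows, each a list of device conductances in siemens.
    voltages: list of input voltages in volts, one per row.
    Idealized model (an assumption): no wire resistance or device noise.
    """
    rows = len(conductances)
    cols = len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("need exactly one input voltage per row")
    currents = [0.0] * cols
    for i in range(rows):
        for j in range(cols):
            # Ohm's law gives the current through device (i, j);
            # Kirchhoff's current law sums it onto column wire j.
            currents[j] += conductances[i][j] * voltages[i]
    return currents


# Hypothetical 3x2 array: conductances in siemens, inputs in volts.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6],
     [5e-6, 6e-6]]
V = [0.1, 0.2, 0.3]
I = crossbar_mac(G, V)
print(I)  # column currents in amperes
```

In hardware, all of these products and sums occur at once in the analog domain; the nested loop here only emulates the result sequentially.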
Funding: Project (No. 60873112) supported by the National Natural Science Foundation of China.
Abstract: This study presents a new 4-pipelined, high-performance split multiply-accumulator (MAC) architecture that supports multiple precisions and was developed for media processors. To speed up the design further, a novel partial product compression circuit based on interleaved adders and a modified hybrid partial product reduction tree (PPRT) scheme are proposed. The MAC can perform 1-way 32-bit and 4-way 16-bit signed/unsigned multiply or multiply-accumulate operations, as well as 2-way parallel multiply add (PMADD) operations, at a frequency of 1.25 GHz under worst-case conditions and 1.67 GHz under typical-case conditions. Compared with the MAC in the 32-bit microprocessor without interlocked pipelined stages (MIPS), the proposed design shows a great advantage in speed. Moreover, an improvement of up to 32% in throughput is achieved. The MAC design has been fabricated with Taiwan Semiconductor Manufacturing Company (TSMC) 90-nm CMOS standard cell technology and has passed a functional test.
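As a rough behavioral illustration of the 4-way 16-bit mode described above, the following sketch models a lane-wise multiply-accumulate in software. It is not the paper's circuit: the packing of four 16-bit lanes into one 64-bit operand word, the names `pack_lanes`, `unpack_lanes`, and `split_mac`, and the use of unbounded Python integers as accumulators are all assumptions made for illustration. A hardware MAC would use fixed-width accumulators and a shared partial product tree instead.

```python
LANES = 4        # assumed number of parallel lanes (the 4-way mode)
LANE_BITS = 16   # width of each lane operand
LANE_MASK = (1 << LANE_BITS) - 1


def pack_lanes(values):
    """Pack four 16-bit values (lane 0 in the low bits) into one integer."""
    word = 0
    for k, v in enumerate(values):
        word |= (v & LANE_MASK) << (k * LANE_BITS)
    return word


def unpack_lanes(word, signed):
    """Split a packed word into four lanes, sign-extending if requested."""
    lanes = []
    for k in range(LANES):
        v = (word >> (k * LANE_BITS)) & LANE_MASK
        if signed and v & (1 << (LANE_BITS - 1)):
            v -= 1 << LANE_BITS  # two's-complement sign extension
        lanes.append(v)
    return lanes


def split_mac(a_word, b_word, acc, signed=True):
    """Lane-wise multiply-accumulate: acc[k] + a[k] * b[k] for each lane."""
    a = unpack_lanes(a_word, signed)
    b = unpack_lanes(b_word, signed)
    return [acc_k + x * y for acc_k, x, y in zip(acc, a, b)]


a = pack_lanes([1, -2, 3, -4])
b = pack_lanes([5, 6, -7, -8])
acc = split_mac(a, b, [0, 0, 0, 0])   # first multiply-accumulate
acc = split_mac(a, b, acc)            # accumulate the same products again
print(acc)
```

Passing `signed=False` treats the same bit patterns as unsigned lanes, which mirrors the signed/unsigned selection mentioned in the abstract.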
Funding: the National Key Research and Development Program of China (2016YFA0203900, 2018YFB2202500); Innovation Program of Shanghai Municipal Education Commission (2021-01-07-00-07-E00077); Shanghai Municipal Science and Technology Commission (18JC1410300, 21DZ1100900); Research Grant Council of Hong Kong (15205619); the National Natural Science Foundation of China (61925402, 61934008, and 6210030233); the Natural Science Foundation of Shanghai (21ZR1405700).