Abstract
Computing-in-memory (CIM) has great potential for accelerating the convolution operations of artificial intelligence (AI) neural networks. Multi-bit CIM based on memristor arrays offers fast write speed and compatibility with complementary metal oxide semiconductor (CMOS) processes, and is therefore expected to be an effective way to overcome the "memory wall". However, current multi-bit CIM circuit architectures suffer from high output delay and high energy consumption, mainly because of the performance limits of conventional sense amplifiers. This paper proposes a Low-delay Low-power Multi-bit Current-mode Sense Amplifier (LLM-CSA). It optimizes function by reducing the number of operating states of the conventional CSA circuit and simplifying its operating timing, and it adopts a new low-bit detection module, so that output delay and energy consumption are reduced systematically at multiple levels. The function and the delay and energy performance of the proposed LLM-CSA are verified by simulation on the Cadence circuit design platform using the Semiconductor Manufacturing International Corporation (SMIC) 40 nm low-leakage logic process (SMIC40 nm LL). Comparative analysis shows that the output delay of LLM-CSA is 1.42 times lower and its energy consumption is 1.56 times lower than those of the conventional CSA. The LLM-CSA is further evaluated in a memristor-array multi-bit CIM architecture with 4 bit input, 4 bit weight, and 11 bit output: compared with the CIM system based on the conventional CSA, the new architecture has 1.18 times lower latency and 1.03 times lower energy consumption. The proposed LLM-CSA has theoretical and practical significance for advancing sense amplifier design and memristor-array CIM architectures.
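The "N times lower" figures in the abstract can be read as percentage reductions. The sketch below is illustrative only and rests on one assumption that the abstract does not state explicitly: that each factor is the ratio of the conventional-CSA value to the LLM-CSA value (old / new). The function name `reduction_percent` and the dictionary labels are hypothetical and do not come from the paper.

```python
def reduction_percent(factor: float) -> float:
    """Convert an 'N times lower' factor into a percentage reduction.

    Assumption (not stated in the abstract): factor = old_value / new_value.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1.0 - 1.0 / factor) * 100.0


# Factors quoted in the abstract; the labels are illustrative.
reported_factors = {
    "LLM-CSA output delay": 1.42,
    "LLM-CSA energy consumption": 1.56,
    "CIM architecture latency": 1.18,
    "CIM architecture energy consumption": 1.03,
}

for name, factor in reported_factors.items():
    percent = reduction_percent(factor)
    print(f"{name}: {factor:.2f} times lower is about a {percent:.1f}% reduction")
```

Under this reading, the amplifier-level gains (about 30% in delay and 36% in energy) are considerably larger than the system-level gains (about 15% in latency and 3% in energy). That would be consistent with the sense amplifier being only one of several contributors to the delay and energy of the full architecture.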
Authors
TANG Chengfeng (唐成峰)
HU Wei (胡炜)
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China
Source
Microelectronics & Computer (《微电子学与计算机》)
2024, No. 2, pp. 58-66 (9 pages)
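Funding
National Natural Science Foundation of China, General Program (62274036)
Natural Science Foundation of Fujian Province, General Program (2022J01079)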
Keywords
RRAM array
computing-in-memory
current-mode sense amplifier
low latency and low power