A low-power keyword spotting system with SRAM buffer and computing-in-memory
Abstract: Deploying keyword spotting (KWS) algorithms on edge computing hardware incurs high power consumption, which shortens the battery life of battery-powered devices. To address this, this paper proposes a low-power KWS system based on computing-in-memory (CIM) technology and software-hardware co-optimization. At the algorithm level, a ternary-quantized MFCC-CNN joint algorithm is proposed on the basis of the standard MFCC algorithm topology, and all general matrix multiplications (GEMM) in MFCC are mapped to the neural network accelerator. At the circuit level, an SRAM-based CIM core is proposed to overcome the power wall and memory wall of traditional von Neumann accelerators. In addition, by reusing the SRAM storage function of the CIM core, a look-up-table-based buffer circuit is proposed to replace the register delay chain. Both the SRAM CIM core and the SRAM buffer are implemented with custom cells. At the system level, a low-power KWS system is built on these two custom circuits. The system is designed with a combined ASIC and custom circuit design flow and synthesized with a 28 nm CMOS process library. At 250 kHz, the system runs a 10-class classification task with a latency of 64 ms and a total power consumption of 645.28 μW. The dynamic power of the MFCC pipeline accounts for 5.9% of the total dynamic power, and the total power of the MFCC pipeline accounts for only 1.3% of the system power.
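As a rough illustration of how a matrix multiplication inside MFCC can be reduced to ternary weights, the sketch below ternarizes a small DCT-II matrix and applies it using only additions and subtractions. The abstract does not give the paper's quantization rule or say which MFCC stages are quantized, so everything here is an assumption: the threshold rule (0.7 times the mean absolute weight, a heuristic borrowed from Ternary Weight Networks), the choice of the DCT stage, and the hypothetical helper names `ternarize`, `dct_matrix`, and `ternary_matvec`.

```python
import math


def ternarize(matrix, ratio=0.7):
    """Map each weight to -1, 0, or +1.

    Assumed rule (not taken from the paper): weights whose magnitude is
    at most ratio * mean(|w|) become 0; the rest keep only their sign.
    """
    magnitudes = [abs(w) for row in matrix for w in row]
    threshold = ratio * sum(magnitudes) / len(magnitudes)
    return [[(w > threshold) - (w < -threshold) for w in row] for row in matrix]


def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis: entry [k][n] = cos(pi * k * (2n + 1) / (2 * n_in))."""
    return [
        [math.cos(math.pi * k * (2 * n + 1) / (2 * n_in)) for n in range(n_in)]
        for k in range(n_out)
    ]


def ternary_matvec(ternary, vector):
    """Multiply a ternary matrix by a vector using only add and subtract."""
    result = []
    for row in ternary:
        acc = 0
        for w, v in zip(row, vector):
            if w == 1:
                acc += v
            elif w == -1:
                acc -= v
        result.append(acc)
    return result


if __name__ == "__main__":
    weights = ternarize(dct_matrix(2, 4))
    print(weights)
    print(ternary_matvec(weights, [3, 4, 5, 6]))
```

Because every weight is -1, 0, or +1, the matrix-vector product needs no multipliers, which is one reason ternary networks pair naturally with in-memory accumulation; the accuracy cost of such coarse quantization would have to be evaluated on the real task.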
Authors: HUANG Zhi-rui (黄至锐), JIA Xin-ru (贾心茹), ZHU Hao-zhe (朱浩哲), CHEN Chi-xiao (陈迟晓) (State Key Laboratory of Integrated Chips and Systems, Fudan University, Shanghai 200433; Frontier Institute of Chip and System, Fudan University, Shanghai 200438, China)
Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 8, pp. 1331-1339 (9 pages)
Funding: National Key Research and Development Program of China (2022YFB4500101).
Keywords: keyword spotting; ternary quantized neural network; computing-in-memory; serial fast Fourier transform (FFT); software-hardware co-design
