Design and implementation of dual-mode configurable memory architecture for CNN accelerator
1
Authors: 山蕊, LI Xiaoshuo, GAO Xu, HUO Ziqing. High Technology Letters (EI, CAS), 2024, No. 2, pp. 211-220 (10 pages)
With the rapid development of deep learning algorithms, computational complexity and functional diversity are increasing rapidly. However, the gap between high computational density and insufficient memory bandwidth under the traditional von Neumann architecture is widening. An analysis of the algorithmic characteristics of convolutional neural networks (CNNs) shows that the memory access patterns of convolution (CONV) and fully connected (FC) operations differ considerably. Based on this observation, a dual-mode reconfigurable distributed memory architecture for a CNN accelerator is designed. It can be configured in Bank mode or first-input first-output (FIFO) mode to accommodate the access needs of different operations. In addition, a programmable memory control unit is designed, which controls the dual-mode configurable distributed memory architecture through customized access instructions and reduces data access delay. The proposed architecture is verified and tested through parallel implementation of several CNN algorithms. The experimental results show that the peak bandwidth reaches 13.44 GB·s^(-1) at an operating frequency of 120 MHz, which is 1.40, 1.12, 2.80 and 4.70 times the peak bandwidth of the existing works used for comparison.
Keywords: distributed memory structure; neural network accelerator; reconfigurable array processor; configurable memory structure
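As a rough reading of the bandwidth figure in the abstract above, the stated peak of 13.44 GB·s^(-1) at 120 MHz can be converted into bytes moved per clock cycle. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 MHz = 10^6 Hz), which the abstract does not state; the function name `bytes_per_cycle` is illustrative only and does not come from the paper.

```python
def bytes_per_cycle(peak_gb_per_s: float, freq_mhz: float) -> float:
    """Bytes transferred per clock cycle at the stated peak bandwidth.

    Assumption (not stated in the abstract): decimal units,
    i.e. 1 GB = 1e9 bytes and 1 MHz = 1e6 Hz.
    """
    return (peak_gb_per_s * 1e9) / (freq_mhz * 1e6)


if __name__ == "__main__":
    per_cycle = bytes_per_cycle(13.44, 120.0)
    # Report the per-cycle width in both bytes and bits.
    print(f"{per_cycle:.2f} bytes/cycle, {per_cycle * 8:.0f} bits/cycle")
```

Under that assumption the figure works out to about 112 bytes (896 bits) per cycle. How that width is split across memory banks or ports is not given in the abstract.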
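Design of a real-time detection array structure for handwritten digit recognition based on near-memory computing (基于近存储计算的手写数字识别实时检测阵列结构设计)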
2
Authors: 霍紫晴, 山蕊, 冯雅妮, 高旭, 冯煜. 《光电子.激光》 (CAS, CSCD, PKU Core), 2022, No. 12, pp. 1315-1322 (8 pages)
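Convolutional neural networks (CNNs), an improvement on traditional neural networks, are now widely used. However, as CNN performance improves, model sizes keep growing and the demands on storage and computing power keep rising, so processors based on the von Neumann architecture struggle to deliver satisfactory processing performance. To improve system performance, near-memory computing (NMC) has become a promising research direction. This paper implements handwritten digit recognition on a reconfigurable array processor that supports NMC, carrying out the convolution operations in parallel. It also uses a shared cache array structure to reduce frequent accesses to off-chip memory. Experimental results show that, at an operating frequency of 110 MHz, the computation speed of a single 5×5 convolution improves by 75.00%, and one handwritten digit can be recognized within 9 960 μs.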
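Keywords: convolutional neural network (CNN); handwritten digit recognition; reconfigurable array processor; near-memory computing (NMC); shared cache array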
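The per-digit latency reported in the abstract above (one handwritten digit recognized within 9 960 μs) can be turned into a rough throughput estimate. This is a minimal sketch that assumes digits are processed strictly one after another with no overlap between them, which the abstract does not state; the function name `digits_per_second` is illustrative only.

```python
def digits_per_second(latency_us: float) -> float:
    """Throughput implied by a per-digit latency given in microseconds.

    Assumption (not stated in the abstract): digits are recognized
    back to back, with no pipelining or overlap between digits.
    """
    return 1e6 / latency_us


if __name__ == "__main__":
    rate = digits_per_second(9960.0)
    print(f"{rate:.1f} digits per second")
```

Because the abstract gives 9 960 μs as an upper bound on the recognition time, the result (roughly 100 digits per second) is best read as a lower bound on sequential throughput.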