Journal Articles: 2 articles found
1. Design and Tool Flow of a Reconfigurable Asynchronous Neural Network Accelerator (Cited by: 3)
Authors: Jilin Zhang, Hui Wu, Weijia Chen, Shaojun Wei, Hong Chen. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2021, No. 5, pp. 565-573 (9 pages).
Abstract: Convolutional Neural Networks (CNNs) are widely used in computer vision, natural language processing, and other applications that generally require low power and high efficiency. Thus, energy efficiency has become a critical indicator of CNN accelerators. Considering that asynchronous circuits have the advantages of low power consumption, high speed, and no clock-distribution problems, we design and implement an energy-efficient asynchronous CNN accelerator in a 65 nm Complementary Metal Oxide Semiconductor (CMOS) process. Given the absence of a commercial design tool flow for asynchronous circuits, we develop a novel design flow that efficiently implements Click-based asynchronous bundled-data circuits down to mask layout with conventional Electronic Design Automation (EDA) tools. We also introduce an adaptive delay matching method and perform accurate static timing analysis to ensure correct timing. An accelerator for a handwriting-recognition network (the LeNet-5 model) is implemented. Silicon test results show that the asynchronous accelerator consumes 30% less power in its computing array than the synchronous one, and that its energy efficiency reaches 1.538 TOPS/W, 12% higher than that of the synchronous chip.
Keywords: convolutional neural network (CNN) accelerator; asynchronous circuit; energy efficiency; adaptive delay matching; asynchronous design flow
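As background for the Click-based bundled-data handshaking named in this abstract and its keywords, the sketch below models the textbook two-phase Click pipeline stage in Python: a single phase flip-flop drives both the input acknowledge and the output request, and the stage fires when a new input token is pending while the output channel is idle. The ClickStage class, its field names, and the tiny test harness are illustrative assumptions, not the paper's circuits or design flow.

```python
# Behavioral sketch of a generic two-phase Click pipeline stage
# (bundled-data style); illustrative only, not the paper's circuit.

from dataclasses import dataclass

@dataclass
class ClickStage:
    """One pipeline stage: a Click controller plus its bundled data register."""
    phase: bool = False   # single state flip-flop; drives both in_ack and out_req
    data: int = 0         # bundled data captured when the stage fires

    @property
    def in_ack(self) -> bool:
        return self.phase

    @property
    def out_req(self) -> bool:
        return self.phase

    def fire(self, in_req: bool, out_ack: bool) -> bool:
        # Fire when the input channel holds a new token (in_req != in_ack)
        # and the output channel is free (out_req == out_ack).
        return (in_req != self.in_ack) and (self.out_req == out_ack)

    def step(self, in_req: bool, in_data: int, out_ack: bool) -> None:
        if self.fire(in_req, out_ack):
            self.data = in_data           # latch the bundled data
            self.phase = not self.phase   # toggle: acknowledges input, requests output

# Push two tokens through a single stage whose output is always acknowledged.
stage = ClickStage()
req = False
for value in (7, 42):
    req = not req                                   # two-phase request: each toggle is one token
    stage.step(req, value, out_ack=stage.out_req)   # environment acks immediately
    print(stage.data)                               # prints 7, then 42
```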
2. Design of high parallel CNN accelerator based on FPGA for AIoT
Authors: Lin Zhijian, Gao Xuewei, Chen Xiaopei, Zhu Zhipeng, Du Xiaoyong, Chen Pingping. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2022, No. 5, pp. 1-9, 61 (10 pages).
Abstract: To tackle the challenge of applying convolutional neural networks (CNNs) to field-programmable gate arrays (FPGAs) despite their computational complexity, a high-performance CNN hardware accelerator was designed in Verilog hardware description language, using a pipelined architecture with three parallel dimensions: input channels, output channels, and convolution kernels. Firstly, two multiply-and-accumulate (MAC) operations were packed into one digital signal processing (DSP) block of the FPGA to double the computation rate of the CNN accelerator. Secondly, feature-map block partitioning and a special memory arrangement were proposed to optimize the total amount of off-chip memory access and reduce the pressure on FPGA bandwidth. Finally, an efficient computational array combining a multiply-add tree with the Winograd fast convolution algorithm was designed to balance hardware resource consumption and computational performance. The highly parallel CNN accelerator was deployed on an Alinx ZU3EG board, using the YOLOv3-tiny algorithm as the test object, and achieves an average computing performance of 127.5 giga operations per second (GOPS). The experimental results show that the hardware architecture effectively improves the computational power of the CNN and outperforms other existing schemes in terms of power consumption and the efficiency of DSPs and block random access memories (BRAMs).
Keywords: artificial intelligence of things (AIoT); convolutional neural network (CNN) accelerator; Winograd convolution; field-programmable gate array (FPGA)
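The abstract's first optimization, packing two MAC operations into one DSP block, is commonly realized by placing two narrow operands in one wide multiplier input so a single multiplication yields both products. The sketch below shows only that arithmetic for unsigned 8-bit operands; the shift amount and function name are assumptions, not the paper's exact DSP configuration.

```python
# Illustrative arithmetic behind sharing one wide multiplier between two
# multiplications (the DSP-packing idea described in the abstract).

SHIFT = 18                 # gap wide enough that b*w (max 255*255 < 2**18)
MASK = (1 << SHIFT) - 1    # cannot spill into the bits holding a*w

def packed_dual_multiply(a: int, b: int, w: int) -> tuple[int, int]:
    """Compute a*w and b*w with a single multiplication of packed operands."""
    assert 0 <= a < 256 and 0 <= b < 256 and 0 <= w < 256
    packed = (a << SHIFT) + b          # one wide operand holds both a and b
    product = packed * w               # single multiply: (a*w << SHIFT) + b*w
    return product >> SHIFT, product & MASK

# Sanity check against two separate multiplications.
for a, b, w in [(7, 13, 5), (255, 255, 255), (0, 200, 31)]:
    assert packed_dual_multiply(a, b, w) == (a * w, b * w)
print("packed dual-multiply matches separate multiplies")
```

With signed operands the low product needs a correction term, which is why practical DSP-packing schemes add extra logic; the unsigned case above only illustrates the core idea.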
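For the "Winograd fast convolution" mentioned in the same abstract, the following is a minimal sketch of the 1-D F(2,3) transform: two outputs of a 3-tap filter computed with 4 multiplications instead of 6. The matrices follow standard Winograd notation; this is illustrative and not the paper's 2-D kernel or hardware mapping.

```python
# Winograd F(2,3): y = A^T [(G g) * (B^T d)], verified against direct convolution.
import numpy as np

BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)    # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                 # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)     # output transform

def winograd_f23(d: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Two outputs of a 3-tap filter over a 4-sample tile d."""
    return AT @ ((G @ g) * (BT @ d))             # element-wise product: 4 multiplies

# Compare against direct sliding-window dot products.
rng = np.random.default_rng(0)
d = rng.standard_normal(4)
g = rng.standard_normal(3)
direct = np.array([d[0:3] @ g, d[1:4] @ g])
assert np.allclose(winograd_f23(d, g), direct)
print("Winograd F(2,3) matches direct convolution")
```

Each output pair thus needs 4 multiplications instead of 6, roughly the multiplier saving that such a computational array trades against extra additions and transform logic.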