Funding: Supported in part by the National Science and Technology Major Project of the Ministry of Science and Technology of China (Grant No. 2018ZX01028201), in part by the National Natural Science Foundation of China (Grant Nos. 61672317 and 61834002), and in part by the National Key R&D Program of China (Grant No. 2018YFB2202101).
Abstract: As a computing paradigm that combines temporal and spatial computation, dynamic reconfigurable computing offers advantages in flexibility, energy efficiency, and area efficiency, and has attracted interest from both academia and industry. However, it is not yet mature because several problems remain unsolved. This work introduces the concept, architecture, and compilation techniques of dynamic reconfigurable computing. It also discusses the major existing challenges and points out potential applications.
Abstract: Today, integrated circuit technology is approaching its physical limits. In terms of performance and energy consumption, reconfigurable computing is regarded as the most promising technology for future computing systems because of its high computing and energy efficiency. From the perspective of computing performance, while the single-thread performance of general-purpose processors (GPPs) has stagnated, reconfigurable computing can customize hardware to application requirements and thereby achieve higher performance and lower energy consumption. From the perspective of economics, a microchip based on reconfigurable computing technology has post-silicon reconfigurability, so it can be applied in different fields and its non-recurring engineering (NRE) cost can be shared across them. High computing and energy efficiency, together with this unique reconfigurability, make reconfigurable computing one of the most important technologies for artificial intelligence microchips.
Funding: Supported by the National Science and Technology Major Project of the Ministry of Science and Technology, China (No. 2018AAA0103100) and the National Natural Science Foundation of China (No. 61674090); partly supported by the Beijing National Research Center for Information Science and Technology (No. 042003266) and the Beijing Engineering Research Center (No. BG0149).
Abstract: Convolutional Neural Networks (CNNs) are widely used in computer vision, natural language processing, and other fields, where real applications generally require low power and high efficiency. Energy efficiency has therefore become a critical indicator for CNN accelerators. Because asynchronous circuits offer low power consumption, high speed, and freedom from clock distribution problems, we design and implement an energy-efficient asynchronous CNN accelerator in a 65 nm Complementary Metal Oxide Semiconductor (CMOS) process. Given the absence of a commercial design tool flow for asynchronous circuits, we develop a novel design flow that takes Click-based asynchronous bundled-data circuits efficiently through to mask layout with conventional Electronic Design Automation (EDA) tools. We also introduce an adaptive delay matching method and perform accurate static timing analysis on the circuits to ensure correct timing. The accelerator is implemented for a handwriting recognition network (the LeNet-5 model). Silicon test results show that the asynchronous accelerator consumes 30% less power in the computing array than its synchronous counterpart and achieves an energy efficiency of 1.538 TOPS/W, which is 12% higher than that of the synchronous chip.
Funding: Supported in part by the China Major Science and Technology (S&T) Project (Grant No. 2013ZX01033-001-001-003), the National High-Tech R&D Program of China (863) (Grant Nos. 2012AA012701, 2012AA0109-04), the National Natural Science Foundation of China (Grant No. 61274131), the International S&T Cooperation Project of China (Grant No. 2012DFA11170), and the Importation and Development of the High-Caliber Talents Project of Beijing Municipal Institutions (Grant No. YETP0163).
Abstract: Auto-focus is essential for capturing sharp, face-centered images in digital and smartphone cameras. As image sensor technology develops, these cameras must process images of ever higher resolution, and it is currently difficult to support fast auto-focus at low power consumption on such images. This work proposes an efficient architecture for AdaBoost-based face-priority auto-focus. The architecture supports block-based integral image computation to improve the processing speed on high-resolution images. It is also reconfigurable, which enables sub-window adaptive cascade classification and so greatly improves the processing speed and reduces power consumption. Experimental results show that an average detection rate of 96% and a detection speed of 58 fps (frames per second) are achieved for 1080p (1920×1080) images. Compared with state-of-the-art work, the detection speed is greatly improved and the power consumption is substantially reduced.
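For readers unfamiliar with the integral image mentioned in the abstract above, the following is a minimal software sketch of the standard full-image form (a summed-area table), which lets the sum over any rectangular sub-window be read with four lookups. It is not the paper's block-based hardware scheme, whose details the abstract does not give; the function names `integral_image` and `box_sum` and the sample data are illustrative assumptions.

```python
def integral_image(img):
    """Build a summed-area table for a 2-D list of numbers.

    The table has one extra zero row and zero column so that
    ii[r][c] holds the sum of img[0:r][0:c].
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row up to column c
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the height-by-width window whose top-left pixel is (top, left)."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical 3x4 "image" used only for illustration.
img = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11]]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 4)  # whole image
patch = box_sum(ii, 1, 1, 2, 2)  # pixels 5, 6, 9, 10
```

Because every window sum costs the same four lookups regardless of window size, the rectangle features evaluated by an AdaBoost cascade classifier can be computed in constant time per feature, which is why integral image computation matters for high-resolution frames.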