Deep learning accelerators(DLAs)have been proved to be efficient computational devices for processing deep learning algorithms.Various DLA architectures are proposed and applied to different applications and tasks.How...Deep learning accelerators(DLAs)have been proved to be efficient computational devices for processing deep learning algorithms.Various DLA architectures are proposed and applied to different applications and tasks.However,for most DLAs,their programming interfaces are either difficult to use or not efficient enough.Most DLAs require programmers to directly write instructions,which is time-consuming and error-prone.Another prevailing programming interface for DLAs is high-performance libraries and deep learning frameworks,which are easy to be used and very friendly to users,but their high abstraction level limits their control capacity over the hardware resources thus compromises the efficiency of the accelerator.A design of the programming interface is for DLAs.First various existing DLAs and their programming methods are analyzed and a methodology for designing programming interface for DLAs is proposed,which is a high-level assembly language(called DLA-AL),assembler and runtime for DLAs.DLA-AL is composed of a low-level assembly language and a set of high-level blocks.It allows experienced experts to fully exploit the potential of DLAs and achieve near-optimal performance.Meanwhile,by using DLA-AL,end-users who have little knowledge of the hardware are able to develop deep learning algorithms on DLAs spending minimal programming efforts.展开更多
基金Supported by the National Key Research and Development Program of China(No.2017YFA0700902,2017YFB1003101)the 973 Program of China(No.2015CB358800)National Science and Technology Major Project(No.2018ZX01031102)
文摘Deep learning accelerators(DLAs)have been proved to be efficient computational devices for processing deep learning algorithms.Various DLA architectures are proposed and applied to different applications and tasks.However,for most DLAs,their programming interfaces are either difficult to use or not efficient enough.Most DLAs require programmers to directly write instructions,which is time-consuming and error-prone.Another prevailing programming interface for DLAs is high-performance libraries and deep learning frameworks,which are easy to be used and very friendly to users,but their high abstraction level limits their control capacity over the hardware resources thus compromises the efficiency of the accelerator.A design of the programming interface is for DLAs.First various existing DLAs and their programming methods are analyzed and a methodology for designing programming interface for DLAs is proposed,which is a high-level assembly language(called DLA-AL),assembler and runtime for DLAs.DLA-AL is composed of a low-level assembly language and a set of high-level blocks.It allows experienced experts to fully exploit the potential of DLAs and achieve near-optimal performance.Meanwhile,by using DLA-AL,end-users who have little knowledge of the hardware are able to develop deep learning algorithms on DLAs spending minimal programming efforts.