Abstract
Deep learning is one of the key technologies in artificial intelligence today. It has achieved breakthrough results in image recognition, speech recognition, natural language processing, and other fields, and has greatly advanced the development of artificial intelligence. As deep learning has developed, however, its core problems have become increasingly prominent, among them high computational cost, high data bandwidth demand, and application fragmentation; in recent years these problems have become key factors limiting the development of related technologies. This paper combines the respective strengths of CPUs and dedicated processors and proposes a CPU-like deep learning coprocessor architecture that offers flexible programmability and high compute density. Because the processor also adopts a tightly coupled memory-compute architecture, it can effectively reuse weights and other data, which reduces bandwidth demand. The paper describes the CPU-like deep learning coprocessor architecture from the perspectives of hardware architecture, software architecture, software programming model, and software execution model. A processor chip based on this architecture has been successfully taped out in a 28 nm process, further verifying the feasibility of the architecture.
Source
《中国集成电路》
2020, No. 7, pp. 41-52 (12 pages)
China Integrated Circuit
Funding
Supported by the Key-Area Research and Development Program of Guangdong Province (2019B010140002)
Keywords
Deep learning
Processor
Tightly coupled memory and compute