
A tensor intermediate representation for machine learning systems
Abstract: With the wide deployment of various machine learning algorithms, highly energy-efficient customized machine learning systems have gained popularity. The key to deploying such systems efficiently lies in their programming and compilation environments. The intermediate representation is the core of these environments, connecting high-level programming languages with low-level instruction set architectures. Current state-of-the-art intermediate representations are oriented either to high-level algorithms or to classical processors based on scalar processing, and therefore cannot be applied effectively to tensor-based machine learning systems. To address this problem, we propose a tensor intermediate representation for machine learning systems to improve programming productivity and performance. Concretely, we define a series of tensor types, tensor operations, and tensor memories, and optimize tensor processing based on these definitions. To validate our proposal, we extend the low-level scalar intermediate representation of TVM with the proposed tensor intermediate representation and perform experiments with Tensor Core on a typical machine learning system. Experimental results show that we explore optimizations not discovered by the original intermediate representation, achieving a 1.62× to 2.85× performance improvement; the tensor intermediate representation also improves programming efficiency by 5.46× on average.
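The page carries no code, but as a rough illustration of the scalar baseline the paper extends, the sketch below (assuming TVM's Python tensor expression API, tvm.te; the operator, names, and tile sizes are illustrative, not taken from the paper) declares a small half-precision matrix multiply and prints the per-element scalar loop nest that TVM's default lowering produces. The proposed tensor intermediate representation would instead expose such a region through tensor types and tensor operations that can be mapped directly onto units such as Tensor Core, rather than through scalar loops.

    # A minimal sketch, not from the paper: assumes TVM's Python tensor
    # expression API (tvm.te) and only shows the scalar loop-nest IR that
    # a default TVM schedule lowers a small matrix multiply to.
    import tvm
    from tvm import te

    N, M, K = 16, 16, 16  # tile sizes comparable to a Tensor Core fragment

    # Declare the computation as a tensor expression.
    A = te.placeholder((N, K), name="A", dtype="float16")
    B = te.placeholder((K, M), name="B", dtype="float16")
    k = te.reduce_axis((0, K), name="k")
    C = te.compute(
        (N, M),
        lambda i, j: te.sum(
            A[i, k].astype("float32") * B[k, j].astype("float32"), axis=k
        ),
        name="C",
    )

    # Default schedule: lowering yields nested scalar loops over i, j, k,
    # with no first-class notion of a tensor operation.
    s = te.create_schedule(C.op)
    print(tvm.lower(s, [A, B, C], simple_mode=True))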
Authors: Yimin ZHUANG, Yuanbo WEN, Wei LI, Qi GUO (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China; Cambricon Technologies Corporation Limited, Shanghai 201308, China)
Source: Scientia Sinica Informationis (《中国科学:信息科学》), 2022, No. 6, pp. 1040-1052 (13 pages). Indexed in CSCD and the Peking University core journals list.
Funding: National Natural Science Foundation of China (Grant No. 61925208), Beijing Natural Science Foundation (Grant No. JQ18013), Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB32050200), CAS Project for Young Scientists in Basic Research (Grant No. YSBR-029), and the Youth Innovation Promotion Association of the Chinese Academy of Sciences.
Keywords: machine learning systems, programming and compiling, tensor processing, intermediate representation, programming efficiency