Abstract
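Traditional high-performance linear algebra libraries such as BLAS require developers to have extensive performance-optimization experience and are difficult to use. AI frameworks such as TensorFlow and PyTorch provide simple development interfaces, which has promoted the growth of machine learning applications. These frameworks perform linear algebra computations heavily, but it is unclear whether they have been optimized for such computations. We designed a set of linear algebra test programs to evaluate how well AI frameworks optimize linear algebra computations. The analysis shows that, under the computational graph model, the frameworks effectively eliminate redundant subexpressions, but they still lack optimizations that automatically identify the optimal parenthesization of a matrix chain. In the future, AI frameworks could further improve performance by adopting optimization techniques from existing high-performance linear algebra libraries.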
Conventional high-performance linear algebra libraries such as BLAS require application developers to have professional performance-optimization skills, which has greatly impeded the growth of the AI application ecosystem. AI frameworks such as TensorFlow and PyTorch instead provide easy-to-use development interfaces, which has contributed to the prosperity of that ecosystem. These frameworks perform linear algebra computations intensively, but it is unclear whether they have sufficiently optimized such computations. The analysis shows that the frameworks lack certain optimizations, such as optimal parenthesization of matrix multiplication chains. Our work provides performance optimization guidance for AI framework and application developers.
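To illustrate why the parenthesization of a matrix chain matters, the following is a minimal sketch of the textbook dynamic-programming approach to choosing it. It is not the paper's test program: the function names `left_to_right_cost` and `optimal_chain` and the example shapes are assumptions made for illustration, and cost is counted as scalar multiplications only.

```python
import math


def left_to_right_cost(dims):
    """Scalar multiplications when the chain is evaluated as ((A1 A2) A3) ...

    dims has length n + 1; matrix Ai has shape dims[i-1] x dims[i].
    """
    cost = 0
    for k in range(1, len(dims) - 1):
        # The running product always has dims[0] rows and dims[k] columns.
        cost += dims[0] * dims[k] * dims[k + 1]
    return cost


def optimal_chain(dims):
    """Return (minimum scalar multiplications, parenthesization string)."""
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]

    # Solve sub-chains of increasing length.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = math.inf
            for k in range(i, j):
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i] * dims[k + 1] * dims[j + 1])
                if c < cost[i][j]:
                    cost[i][j] = c
                    split[i][j] = k

    def render(i, j):
        # Rebuild the parenthesization from the recorded split points.
        if i == j:
            return f"A{i + 1}"
        k = split[i][j]
        return f"({render(i, k)} {render(k + 1, j)})"

    return cost[0][n - 1], render(0, n - 1)


# Hypothetical example: two 1000 x 1000 matrices times a 1000 x 1 vector.
dims = [1000, 1000, 1000, 1]
naive = left_to_right_cost(dims)
best, order = optimal_chain(dims)
print(naive, best, order)
```

For these shapes, left-to-right evaluation costs 1,001,000,000 scalar multiplications, while the order `(A1 (A2 A3))` costs 2,000,000. A framework that always evaluates a chain in the order written leaves this gap for the application developer to close by hand.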
Author
胥凌
XU Ling (Xi'an Aeronautics Computing Technique Research Institute, AVIC, Xi'an 710000, China)
Source
《航空计算技术》
2022, No. 3, pp. 5-9 (5 pages)
Aeronautical Computing Technique
Funding
Supported by the Aeronautical Science Foundation of China (2018ZC31003).