摘要
深度学习平台在新一代人工智能的发展中扮演着重要的角色。近年来,以昇腾平台为代表的国产人工智能软硬件系统快速发展,为国产深度学习平台的发展开辟出了新的道路。与此同时,为了发现并解决昇腾系统存在的潜在漏洞,昇腾平台积极开展常用深度学习模型的迁移工作。从自然语言处理算法角度切入,针对机器阅读理解、神经机器翻译、序列标注和文本分类四大自然语言处理任务,以昇腾平台的高性能硬件芯片为基础,探究迁移ALBERT,RNNSearch,BERT-CRF和TextING这4类典型的自然语言处理模型。基于以上迁移研究,发现和整理了昇腾平台架构设计在自然语言处理研究与业务上的主要不足,即计算图节点动态空间的分配特性、资源算子下沉设备侧、图算融合以及混合精度训练4个方面的问题,并为以上问题提出了相应的解决方案,并进行了实验验证。最后,为国产深度学习平台的发展提出未来优化的方向和相关建议。
Deep learning platformplays an essential role in the development of the new generation of artificial intelligence.In recent years,the domestic artificial intelligence high-performance software and hardware system of China represented by the Ascend platform has developed rapidly,which opens up a new way for the deep learning platform in China.At the same time,in order to explore and solve the potential loopholes in the Ascend system,the platform developers of Ascend actively carries out the migration of commonly used deep learning models with researchers.These efforts are further promoted from the perspective of natural language processingaiming at how to refine the domestic deep learning platform.Four natural language processing tasks arehighlighted,neural machine translation,machine reading comprehension,sequence labeling and text classification,along with four classical neural models,Albert,RNNSearch,BERT-CRF and TextING.They are migrated on the Ascend platform in details.Based on the above model migration research,this paper integrates the deficiencies of the architecture design of the Ascend platform in the research and business in natural language processing.In conclusion,these deficiencies are sorted out as four essential aspects:1)the lack of the dynamic space allocation characteristics of computing graph nodes;2)incompatibility for the sinking of resource operators on the acceleration-deviceside;3)the fusion of graphics and computing which is not flexible to handle unseen model structures,and 4)the defects of the mixed-precision training strategy.To overcome these problems,this paper puts forward the avoidance methods or solutions.Finally,constructive suggestions are provided for,including but not limited to,the deep-learning platforms in China.
作者
葛慧斌
王德鑫
郑涛
张婷
熊德意
GE Huibin;WANG Dexin;ZHENG Tao;ZHANG Ting;XIONG Deyi(College of Intelligence and Computing,Tianjin University,Tianjin 300350,China;Nanjing Research Institute,Huawei Technologies Co.Ltd.,Nanjing 210000,China;Global Tone Communication Technology Co.,Ltd.,Beijing 100131,China)
出处
《计算机科学》
CSCD
北大核心
2024年第1期50-59,共10页
Computer Science
基金
华为技术有限公司与天津大学NRE合作项目(20101300441922C)
国家重点研发计划(2020AAA0108000)
云南省重点研发计划(202203AA080004)。
关键词
自然语言处理
昇腾
深度学习
模型迁移
平台构架
Natural language processing
Ascend
Deep learning
Model migration
Platform architecture