
Pengcheng-PanGu: Large-Scale Autoregressive Pre-Trained Chinese Language Model with Auto-Parallel Computation and Its Application (Cited by: 4)
Abstract: Pengcheng-PanGu, the world's first fully open-source autoregressive pre-trained Chinese language model with up to 200 billion parameters, was trained on Pengcheng Cloud Brain II. The model is pre-trained on 1.1 TB of high-quality Chinese data collected from a wide range of domains. Its training parallelism strategy is implemented with the auto-parallel technology of MindSpore, an all-scenario artificial intelligence computing framework, which composes five parallelism dimensions to scale the training task efficiently to 4,096 processors. Comparative experiments demonstrate the superior performance of Pengcheng-PanGu on a variety of Chinese natural language understanding and generation tasks under few-shot and zero-shot settings. On this basis, the model has also achieved good application results in large-model compression, prompt fine-tuning, multi-task learning, and continual learning.
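The abstract's key engineering claim is that MindSpore's auto-parallel technology composes several parallelism dimensions (data, operator-level model, pipeline, and optimizer parallelism, among others) so that one training script can scale across thousands of Ascend processors. The following is a minimal sketch, not taken from the paper, of how such a distributed MindSpore training context is typically configured; the specific values (device_num, pipeline_stages) and the exact API version are assumptions for illustration only.

```python
# Illustrative sketch (assumed MindSpore 1.x-style API, not the paper's actual script):
# configure a distributed training context that combines several parallelism dimensions.
from mindspore import context
from mindspore.communication import init

# Graph mode on Ascend processors (Pengcheng Cloud Brain II is an Ascend-based cluster).
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
init()  # initialize the collective-communication backend across devices

# Enable (semi-)automatic parallelism; the numbers below are placeholders,
# chosen only to mirror the scale mentioned in the abstract.
context.set_auto_parallel_context(
    parallel_mode=context.ParallelMode.SEMI_AUTO_PARALLEL,
    device_num=4096,                 # total processors the job is sharded over
    pipeline_stages=16,              # pipeline parallelism (assumed value)
    enable_parallel_optimizer=True,  # shard optimizer states across devices
    full_batch=True,                 # each rank sees the full logical batch
)
```

With a context like this, per-operator sharding strategies and pipeline stage assignments are resolved by the framework, which is what allows the same model definition to be scaled out without manually rewriting the training loop for each cluster size.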
Authors: ZENG Wei; SU Teng; WANG Hui; TIAN Yonghong; GAO Wen (Pengcheng Laboratory, Shenzhen 518055, China; Peking University, Beijing 100871, China; Huawei Technologies Co., Ltd., Hangzhou 310052, China)
Source: ZTE Technology Journal (《中兴通讯技术》), 2022, No. 2, pp. 33-43 (11 pages)
Funding: Guangdong Provincial Key-Area Research and Development Program, "New Generation Artificial Intelligence" Major Project (2021B0101400002)
Keywords: large-scale pre-trained language models; Pengcheng Cloud Brain II; large-scale distributed training; Chinese language understanding and generation; prompt fine-tuning learning