Large Language Model ChatGPT: Evolution and Application

Cited by: 6
Abstract This paper comprehensively analyzes the technical origins and evolution of ChatGPT by reviewing the development of deep learning, language models, semantic representation, and pre-training techniques. Language models evolved gradually from early N-gram statistical methods to neural network language models; research on machine translation led to the Transformer, which in turn accelerated the development of neural network language models. Semantic representation and pre-training techniques progressed from early statistical methods such as TF-IDF, pLSA, and LDA, to neural network-based word vector representations such as Word2Vec, and then to pre-trained language models such as ELMo, BERT, and GPT-2. As pre-training frameworks matured, they supplied models with rich semantic knowledge. The emergence of GPT-3 revealed the potential of large language models, but hallucination problems such as uncontrollable generation, factual errors, and poor logical reasoning remained. To alleviate these problems, ChatGPT was further aligned with humans on top of GPT-3.5 through instruction learning, supervised fine-tuning, and reinforcement learning from human feedback, which steadily improved its performance. The emergence of large language models such as ChatGPT marks a new stage of development for the field and opens up new possibilities for human-computer interaction and artificial general intelligence.
Authors XIA Runze; LI Piji (College of Computer Science and Technology, Nanjing University of Aeronautics & Astronautics, Nanjing 211106, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence (Nanjing University of Aeronautics & Astronautics), Nanjing 211106, China)
Source Journal of Data Acquisition and Processing (indexed in CSCD and the Peking University Core Journals list), 2023, No. 5, pp. 1017-1034 (18 pages)
Keywords natural language processing; language model; pre-training technique; ChatGPT