
Governance of Large Language Models from the Perspective of Language Resources
(语言资源视角下的大规模语言模型治理) · Cited by: 4
Abstract  Over the past six months, the application of large generative language models such as ChatGPT has drawn widespread attention and sparked critical reflection across society. This paper argues that these large language models should be viewed as instrumental tools: the benefits of their technological development should be recognized, while their risks should be mitigated as far as possible. Consequently, their governance should focus less on intervening in the technology itself, and more on the language resources vital to their development and on their use after deployment. Regarding the governance of language resources in large language model development, efforts should be made to break down the data silos of Chinese-language resources: developing distributed model-construction technologies represented by federated learning, establishing an open national knowledge-data mechanism, and building an open and efficient language-data exchange market as soon as possible. Chinese expression of world knowledge should likewise be promoted to facilitate the development of Chinese large language models: opening core Chinese knowledge resources to the web, improving Chinese concept and terminology resources, and enlarging domain-specific Chinese resources. Since large language models are themselves an important language resource, their status as a fundamental resource should be emphasized in governing their use, which should proceed from the perspectives of standardization, evaluation, and ethical regulation.
Authors  Rao Gaoqi (饶高琦); Hu Xingyu (胡星雨); Yi Zilin (易子琳)
Source  Chinese Journal of Language Policy and Planning (《语言战略研究》, PKU Core Journal), 2023, No. 4, pp. 19–29 (11 pages)
Funding  Ministry of Education Humanities and Social Sciences Youth Project "A Quantitative Study of Vocabulary Use in Chinese Newspapers and Periodicals since the Late Qing" (20YJC740050); Beijing Language and Culture University Wutong Innovation Platform (21PT04)
Keywords  ChatGPT; language resources; large language models; language governance