Legal Risks and Meta-regulation of Generative AI Training Data
Abstract: Generative AI, represented by ChatGPT, relies on massive training data to iterate and upgrade its models, and the quality and quantity of that data directly determine the performance and generalization ability of generative AI. However, training data itself harbors risks concerning the legality of its sources, the credibility of its quality, and deviations in its scale. Neither self-regulation nor government regulation fits the market structure and pace of change of generative AI, so there is an urgent need to apply meta-regulation to generative AI training data under an inclusive and prudent approach. Under meta-regulation, the state uses norms to guide model developers to embed data protection by design and the ethics of science and technology in the training data of generative AI, extending data protection from the utilization stage to the research and development stage. Measures such as credible data sources, data classification and grading, and data impact assessment prompt model developers to engage in self-reflective introspection, and a regulatory sandbox for data protection achieves the regulation of self-regulation.
Author: Wang Haiyang (王海洋), Southwest University of Political Science & Law, Chongqing 401120
Source: Zhejiang Social Sciences (《浙江社会科学》), Peking University Core Journal, 2024, No. 9, pp. 50-63, 157, 158 (16 pages in total)
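Funding: An interim result of the 2023 National Program for Funding Postdoctoral Researchers project "Research on the Holistic Protection of Personal Information and Its Coordination Mechanisms" (GZC20232201).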
Keywords: generative AI; ChatGPT; training data; meta-regulation