FAIR Enough: Develop and Assess a FAIR-Compliant Dataset for Large Language Model Training?

Abstract: The rapid evolution of Large Language Models (LLMs) highlights the need for ethical considerations and data integrity in AI development, and in particular the role of the FAIR (Findable, Accessible, Interoperable, Reusable) data principles. While these principles are crucial for ethical data stewardship, their specific application to LLM training data remains under-explored. Our study addresses this research gap. It begins with an examination of the existing literature to underline the importance of FAIR principles in managing data for LLM training. Building on this, we propose a novel framework designed to integrate FAIR principles into the LLM development lifecycle. A contribution of our work is a comprehensive checklist intended to guide researchers and developers in applying FAIR data principles consistently across the model development process. The utility and effectiveness of the framework are validated through a case study on creating a FAIR-compliant dataset aimed at detecting and mitigating biases in LLMs. We present this framework to the community as a tool to foster the creation of technologically advanced, ethically grounded, and socially responsible AI models.
Source: Data Intelligence (EI-indexed), 2024, Issue 2, pp. 559-585 (27 pages)