
The Specimen Data Refinery: A Canonical Workflow Framework and FAIR Digital Object Approach to Speeding up Digital Mobilisation of Natural History Collections

Abstract: A key limiting factor in organising and using information from physical specimens curated in natural science collections is making that information computable, with institutional digitization tending to focus more on imaging the specimens themselves than on efficiently capturing computable data about them. Label data are still largely transcribed manually, at high cost and low throughput, which makes the task impractical for many collection-holding institutions at current funding levels. We show how computer vision, optical character recognition, handwriting recognition, named entity recognition and language translation technologies can be implemented as canonical workflow component libraries with findable, accessible, interoperable, and reusable (FAIR) characteristics. These libraries are being developed in a cloud-based workflow platform, the 'Specimen Data Refinery' (SDR), founded on the Galaxy workflow engine, Common Workflow Language, Research Object Crates (RO-Crate) and WorkflowHub technologies. The SDR can be applied to specimens' labels and other artefacts, offering the prospect of greatly accelerated and more accurate data capture in computable form. Two kinds of FAIR Digital Objects (FDO) are created by packaging outputs of SDR workflows and workflow components as digital objects with metadata, a persistent identifier, and a specific type definition. The first kind of FDO are computable Digital Specimen (DS) objects that can be consumed and produced by workflows and other applications. A single DS is the input data structure submitted to a workflow; each workflow component modifies it in turn, producing a refined DS at the end. The Specimen Data Refinery provides a library of such components that can be used individually or in series. To work together, each library component describes the fields it requires from the DS and the fields it will in turn populate or enrich. The second kind of FDO, RO-Crates, gather and archive the diverse set of digital and real-world resources, configurations, and actions (the provenance) contributing to a unit of research work, allowing that work to be faithfully recorded and reproduced. Here we describe the Specimen Data Refinery with its motivating requirements, focusing on what is essential in the creation of canonical workflow component libraries and its conformance with the requirements of an emerging FDO Core Specification being developed by the FDO Forum.
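The component contract described in the abstract (each component declares the DS fields it requires and the fields it populates, and a single DS passes through the components in series) can be illustrated with a minimal Python sketch. This is not the SDR implementation: the `Component` class, the `run_workflow` function, the field names (`image_uri`, `label_text`, `collector`, `year`) and the two stub steps are all hypothetical, and the stubs return canned values instead of calling real OCR or named entity recognition tools.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

# A Digital Specimen is modelled here as a plain dict (an assumption for illustration).
DigitalSpecimen = Dict[str, object]


@dataclass(frozen=True)
class Component:
    """Hypothetical workflow component with declared input and output fields."""
    name: str
    requires: FrozenSet[str]   # DS fields the component needs
    populates: FrozenSet[str]  # DS fields the component may add or enrich
    run: Callable[[DigitalSpecimen], DigitalSpecimen]


def run_workflow(ds: DigitalSpecimen, components: List[Component]) -> DigitalSpecimen:
    """Pass one DS through the components in series and return the refined DS."""
    current = dict(ds)  # copy, so the caller's input DS is left untouched
    for comp in components:
        missing = comp.requires - set(current)
        if missing:
            raise ValueError(f"{comp.name}: missing required fields {sorted(missing)}")
        produced = comp.run(current)
        undeclared = set(produced) - comp.populates
        if undeclared:
            raise ValueError(f"{comp.name}: undeclared output fields {sorted(undeclared)}")
        current.update(produced)
    return current


def stub_ocr(ds: DigitalSpecimen) -> DigitalSpecimen:
    # Stand-in for an OCR step: returns canned text, does not read the image.
    return {"label_text": "Leg. J. Smith 1902"}


def stub_ner(ds: DigitalSpecimen) -> DigitalSpecimen:
    # Stand-in for entity recognition: naive split of the canned label text.
    parts = str(ds["label_text"]).split()
    return {"collector": " ".join(parts[1:-1]), "year": parts[-1]}


ocr = Component("ocr", frozenset({"image_uri"}), frozenset({"label_text"}), stub_ocr)
ner = Component("ner", frozenset({"label_text"}), frozenset({"collector", "year"}), stub_ner)

refined = run_workflow({"id": "ds-1", "image_uri": "label.jpg"}, [ocr, ner])
```

Because each component states its inputs and outputs, the runner can reject an ordering in which a step would run before the fields it depends on exist, for example placing `ner` before `ocr` in the list above.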
Source: Data Intelligence (EI-indexed), 2022, Issue 2, pp. 320-341 (22 pages)
Funding: European Union's Horizon 2020 research and innovation programme under grant agreement numbers 823827 (SYNTHESYS Plus), 871043 (DiSSCo Prepare), 823830 (BioExcel-2), and 824087 (EOSC-Life).