Funding: Grants from the Alfred P. Sloan Foundation (G-2016-7065), the W. M. Keck Foundation (grant entitled "Co-Evolution of the Geosphere and Biosphere"), the John Templeton Foundation (60645), the NASA Astrobiology Institute (1-NAI8_2-0007), a private foundation, and the Carnegie Institution for Science. Sergey V. Krivovichev acknowledges support from the Russian Science Foundation (19-17-00038).
Abstract: Large and growing data resources on the diversity, distribution, and properties of minerals are ushering in a new era of data-driven discovery in mineralogy. The most comprehensive international mineral databases are the IMA database, which includes information on more than 5400 approved mineral species and their properties, and the mindat.org data source, which contains more than 1 million species/locality data on minerals found at more than 300 000 localities. Analysis and visualization of these data with diverse techniques (including chord diagrams, cluster diagrams, Klee diagrams, skyline diagrams, and varied methods of network analysis) are leading to a greater understanding of the co-evolving geosphere and biosphere. New data-driven approaches include mineral evolution, mineral ecology, and mineral network analysis, methods that collectively consider the distribution and diversity of minerals through space and time. These strategies are fostering a deeper understanding of mineral co-occurrences and, for the first time, facilitating predictions of mineral species that occur on Earth but have yet to be discovered and described.
Funding: Supported by the W. M. Keck Foundation and the A. P. Sloan Foundation.
Abstract: Over four years ago, a group of investigators came together to determine if Big Data approaches (specifically data mining, machine learning, and analytics in general) might provide insight into some of the grand challenges in Earth’s history: evolution of minerals, rise of oxygen, life, influence of supercontinental cycles, quantifying the magnitude of extinction events, and more. As a result, the team of mineralogists, petrologists, geochemists.
Funding: Supported by the W. M. Keck Foundation’s Deep-Time Data Infrastructure project, the Deep Carbon Observatory, the Alfred P. Sloan Foundation, a private foundation, and the Carnegie Institution for Science.
Abstract: The key to answering many compelling and complex questions in Earth, planetary, and life science lies in breaking down the barriers between scientific fields and harnessing the integrated, multi-disciplinary power of Earth, planetary, and bioscience data resources. We have a unique opportunity to integrate large and rapidly expanding "big data" resources, to enlist powerful analytical and visualization methods, and to answer multi-disciplinary questions that cannot be addressed by one field alone.
Abstract: Ontologies are increasingly deployed as a computer-accessible representation of key semantics in various parts of a data life cycle and, thus, ontology dynamics may pose challenges to data management and re-use. Using examples from the geosciences, we analyze challenges raised by ontology dynamics, such as heavy reworking of data, semantic heterogeneity among data providers and users, and error propagation in cross-discipline data discovery and re-use. We also make recommendations to address these challenges: (1) establish communities of practice on ontologies to reduce inconsistency and duplicated efforts; (2) use ontologies in the procedure of data collection and make them accessible to data users; and (3) seek methods to speed up the reworking of data in a Semantic Web context.