Funding: Supported by grants from the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant Nos. XDA19090116 and XDA19050302), the National Natural Science Foundation of China (Grant Nos. 31871328 and 32030021), the Professional Association of the Alliance of International Science Organizations (Grant No. ANSO-PA-2020-07), the Youth Innovation Promotion Association of Chinese Academy of Sciences (Grant No. 2019104), and the International Partnership Program of the Chinese Academy of Sciences (Grant No. 153F11KYSB20160008).
Abstract: Biological databases serve as a fundamental global infrastructure for the worldwide scientific community, dramatically aiding the transformation of big data into knowledge discovery and driving significant innovations in a wide range of research fields. Given the rapid data production, biological databases continue to increase in size and importance. To build a catalog of worldwide biological databases, we curate a total of 5825 biological databases from 8931 publications, which are geographically distributed in 72 countries/regions and developed by 1975 institutions (as of September 20, 2022). We further devise the z-index, a novel index to characterize the scientific impact of a database, and rank all these biological databases, as well as their hosting institutions and countries, in terms of citation and z-index. Consequently, we present a series of statistics and trends of worldwide biological databases, yielding a global perspective to better understand their status and impact for life and health sciences. An up-to-date catalog of worldwide biological databases, as well as their curated meta-information and derived statistics, is publicly available at Database Commons (https://ngdc.cncb.ac.cn/databasecommons/).
Funding: Supported by the National High Technology Research and Development Program of China (Grant Nos. 2006AA02Z334, 2006AA02Z314, 2006AA02A312 and 2007AA02Z165), the National Basic Research Program of China (Grant Nos. 2006CB910404 and 2007CB946904), and the K. C. Wong Education Foundation, Hong Kong.
Abstract: TSdb (http://tsdb.cbi.pku.edu.cn) is the first manually curated central repository that stores formatted information on the substrates of transporters. In total, 37608 transporters with 15075 substrates from 884 organisms were curated from UniProt functional annotation. A unique feature of TSdb is that all the substrates are mapped to identifiers from the KEGG Ligand compound database. Thus, TSdb links current metabolic pathway schema with compound transporter systems via the shared compounds in the pathways. Furthermore, all the transporter substrates in TSdb are classified according to their biochemical properties, biological roles and subcellular localizations. In addition to the functional annotation of transporters, extensive compound annotation that includes inhibitor information from the KEGG Ligand and BRENDA databases has been integrated, making TSdb a useful source for the discovery of potential inhibitory mechanisms linking transporter substrates and metabolic enzymes. User-friendly web interfaces are designed for easy access, query and download of the data. Text and BLAST searches against all transporters in the database are provided. We will regularly update the substrate data with evidence from new publications.
Funding: Financial support of the Taicang government, Suzhou, China.
Abstract: Biological data, represented by the data from omics platforms, are accumulating exponentially. Like other data-intensive scientific disciplines such as high-energy physics, climatology, meteorology, geology, geography and environmental sciences, modern life sciences have entered the information-rich era, the era of the 4th paradigm. The creation of a Chinese information engineering infrastructure for pan-omics studies (CIEIPOS) as part of the national scientific infrastructure is long overdue; it is needed to accelerate the further development of Chinese life sciences and to translate rich data into knowledge and medical applications. By gathering facts on the current status of the international and Chinese bioinformatics communities in collecting, managing and utilizing biological data, the essay stresses the significance and urgency of creating a 'data hub' in CIEIPOS, and discusses challenges and possible solutions for integrating, querying and visualizing these data. Another important component of CIEIPOS, which is not part of traditional biological data centers such as NCBI and EBI, is omics informatics. The mass spectrometry platform is taken as an example to illustrate the complexity of omics informatics, and its heavy dependency on computational power is highlighted. Meeting the demand for such power in omics studies is argued to be the fundamental function of CIEIPOS. The implementation outlook for CIEIPOS in terms of hardware and network is discussed.