Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences, Stem Cell and Regenerative Medicine Research (Grant No. XDA01040405); the National High-tech R&D Program of China (863 Program, 2012AA022502); the National "Twelfth Five-Year" Plan for Science & Technology Support of China (2013BAI01B09) awarded to XF; and the National Natural Science Foundation of China (Grant No. 31471236) awarded to YL.
Abstract: Publicly accessible resources have advanced scientific discovery. The era of genomics and big data has created a need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe web resources for cancer genomics research and rate them on the basis of diversity of cancer types, sample size, comprehensiveness of omics data, and user experience. The resources reviewed include data repositories and analysis tools; we hope this introduction will raise awareness of these resources and facilitate their use in the cancer research community.
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences [XDB38030201, XDB38030400, XDB38050300] and the Youth Innovation Promotion Association of the Chinese Academy of Sciences [2019104].
Abstract: Genome data of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are essential for virus diagnosis, vaccine development, and variant surveillance. To archive and integrate worldwide SARS-CoV-2 genome data, a series of resources have been constructed, serving as fundamental infrastructure for SARS-CoV-2 research, pandemic prevention and control, and coronavirus disease 2019 (COVID-19) therapy. Here we present an overview of extant SARS-CoV-2 resources devoted to genome data deposition and integration. We review deposition resources in terms of data accessibility, metadata standardization, and data curation and annotation, and integrative resources in terms of data source, de-redundancy processing, data curation and quality assessment, and variant annotation. Moreover, we address issues that impede SARS-CoV-2 genome data integration, including low complexity, inconsistency, and absence of isolate names, sequence inconsistency, asynchronous updates of genome data, and mismatched metadata. Finally, we provide insights into data standardization consensus and data submission guidelines to promote SARS-CoV-2 genome data sharing and integration.
Abstract: In the past decade, the remarkable development of high-throughput sequencing technology has accelerated the generation of large amounts of multidimensional data, such as genomic, epigenomic, transcriptomic, and proteomic data. These comprehensive data make it possible to understand systematically the underlying mechanisms of biology and of diseases such as cancer. They also pose great challenges for computational cancer genomics because of the complexity, scale, and noise of the data. In this article, we review recent developments and progress in computational models, algorithms, and analysis of complex data in cancer genomics. Topics covered include the identification of driver mutations, genetic heterogeneity analysis, discovery of genomic markers of drug response, and pan-cancer scale analysis.
Abstract: Background: Multi-view omics datasets offer rich opportunities for integrative analysis across genomic, transcriptomic, and epigenetic data platforms. Statistical methods are needed to rigorously implement current research on functional biology, matching the complex dynamics of systems genomic datasets. Methods: We apply imputation for missing data and a structural, graph-theoretic pathway model to a dataset of 22 cancers across 173 signaling pathways. Our pathway model integrates multiple data platforms, and we test for differential activation between cancerous tumor and healthy tissue populations. Results: Our pathway analysis reveals significant disturbance in signaling pathways that are known to relate to oncogenesis. We identify several pathways that suggest new research directions, including the Trk signaling and focal adhesion kinase activation pathways in sarcoma. Conclusions: Our integrative analysis confirms contemporary research findings, which supports the validity of our results. We implement an interactive data visualization for exploring the pathway analyses, which is publicly available online.