Heterotopic ossification (HO) refers to the abnormal formation of bone in soft tissue. Although some of the underlying processes of HO have been described, there are currently no clinical tests using validated biomarkers for predicting HO formation. As such, the diagnosis is made radiographically after HO has formed. To identify potential and novel biomarkers for HO, we used isobaric tags for relative and absolute quantitation (iTRAQ) and high-throughput antibody arrays to produce a semi-quantitative proteomics survey of serum and tissue from subjects with (HO+) and without (HO-) heterotopic ossification. The resulting data were then analyzed using a systems biology approach. We found that serum samples from subjects experiencing traumatic injuries with resulting HO have a different proteomic expression profile compared to those from the matched controls. Subsequent quantitative ELISA identified five blood serum proteins that were differentially regulated between the HO+ and HO- groups. Compared to HO- samples, the amount of insulin-like growth factor I (IGF1) was higher in HO+ samples, whereas lower amounts of osteopontin (OPN), myeloperoxidase (MPO), runt-related transcription factor 2 (RUNX2), and growth differentiation factor 2, also known as bone morphogenetic protein 9 (BMP-9), were found in HO+ samples (Welch two-sample t-test; P < 0.05). These proteins, in combination with potential serum biomarkers previously reported, are key candidates for a serum diagnostic panel that may enable early detection of HO prior to radiographic and clinical manifestations.
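The Welch two-sample t-test named in the abstract above compares two group means without assuming equal variances. The following is a minimal sketch of the statistic and its Welch-Satterthwaite degrees of freedom, not the authors' analysis code; the function name `welch_t` and the sample values are hypothetical and carry no relation to the study's data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test.

    `a` and `b` are sequences of measurements from two independent groups,
    each with at least two values. Hypothetical helper for illustration only.
    """
    na, nb = len(a), len(b)
    # Sample variances (n - 1 denominator), scaled by group size.
    va_n = variance(a) / na
    vb_n = variance(b) / nb
    se2 = va_n + vb_n  # squared standard error of the difference in means
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / (va_n ** 2 / (na - 1) + vb_n ** 2 / (nb - 1))
    return t, df


# Made-up concentrations for two groups (illustrative values only).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
```

Turning the statistic into a P value requires the Student t distribution with `dof` degrees of freedom; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test.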
Antimicrobial resistance (AMR) poses a critical threat to global health and development, with environmental factors, particularly in urban areas, contributing significantly to the spread of antibiotic resistance genes (ARGs). However, most research to date has been conducted at a local level, leaving significant gaps in our understanding of the global status of antibiotic resistance in urban environments. To address this issue, we used a deep-learning-based methodology to analyze a total of 86,213 ARGs detected within 4,728 metagenome samples, which were collected by the MetaSUB International Consortium from diverse urban environments in 60 cities of 27 countries. Our findings demonstrated the strong geographical specificity of the urban environmental resistome and its correlation with various local socioeconomic and medical conditions. We also identified distinctive evolutionary patterns of ARG-related biosynthetic gene clusters (BGCs) across different countries, and discovered that the urban environment represents a rich source of novel antibiotics. Our study provides a comprehensive overview of the global urban environmental resistome and fills a significant gap in our knowledge of large-scale urban antibiotic resistome analysis.
The revolution in biotechnology has pushed the life sciences into the Big Data era. In particular, high-throughput bio-techniques have greatly accelerated the integration of biology, computing and informatics, and hence substantially advanced the development of bioinformatics and computational biology. According to the latest report of data deposition within the National Centre for Biotechnology Information (NCBI), the genome sequencing projects have increased 49.94%, the sequence reads generated from next generation sequencing have increased 44.37% and the protein sequences have increased 39.85%, compared with ...
Transcriptome reconstruction is an important application of RNA-Seq, providing critical information for further analysis of the transcriptome. Although RNA-Seq offers the potential to capture the whole picture of the transcriptome, it still presents special challenges. To handle these difficulties and reconstruct the transcriptome as completely as possible, current computational approaches mainly employ two strategies: de novo assembly and genome-guided assembly. To find the similarities and differences between them, we first chose five representative assemblers belonging to the two classes, and then investigated and compared their algorithmic features in theory and their real performance in practice. We found that all the methods can be reduced to graph reduction problems, yet they have different conceptual and practical implementations. Each assembly method therefore has its specific advantages and disadvantages, performing worse than others in certain aspects while outperforming them in other aspects. Finally, we merged the assemblies of the five assemblers and obtained a much better assembly. Additionally, we evaluated an assembler using a genome-guided de novo assembly approach, and it achieved good performance. Based on these results, we suggest that to obtain a comprehensive set of recovered transcripts, it is better to use a combination of de novo assembly and genome-guided assembly.
Bioinformatics methods for various RNA-seq data analyses are evolving rapidly with the improvement of sequencing technologies. However, many challenges still exist in how to efficiently process RNA-seq data to obtain accurate and comprehensive results. Here we review the strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and mapping them to the transcriptome are two distinct methods for quantifying the expression of genes/transcripts. Besides the known genes annotated in current databases, many novel genes/transcripts (especially long noncoding RNAs) can still be identified on the reference genome using RNA-seq. Moreover, owing to the incompleteness of current reference genomes, some novel genes are missing from them. Genome-guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying those novel genes/transcripts on or beyond the reference genome. In addition, integrating the genes of distinct databases to conduct transcriptomics and genetics studies can improve the results of the corresponding analyses.
RNA-Seq technology is becoming widely used in various transcriptomics studies; however, analyzing and interpreting RNA-Seq data faces serious challenges. With the development of high-throughput sequencing technologies, the sequencing cost is dropping dramatically while the sequencing output is increasing sharply. However, the sequencing reads are still short and contain various sequencing errors. Moreover, the transcriptome is always more complicated than we expect. These challenges create an urgent need for efficient bioinformatics algorithms to handle the large amount of transcriptome sequencing data and carry out diverse related studies. This review summarizes a number of frequently used applications of transcriptome sequencing and their related analysis strategies, including short read mapping, exon-exon splice junction detection, gene or isoform expression quantification, differential expression analysis and transcriptome reconstruction.
In the past several years, next-generation sequencing (NGS) technologies have greatly revolutionized our approaches to explore and depict the characteristics and functions of the genomes for various species. The NGS technologies have been broadly used in diverse fields including genomics (genome sequencing and exome sequencing) [1,2], transcriptomics (RNA-Seq) [3,4] and epigenomics (ChIP-Seq, ...
Characterized by their low prevalence, rare diseases are often chronically debilitating or life threatening. Despite their low prevalence, the aggregate number of individuals suffering from a rare disease is estimated to be nearly 400 million worldwide. Over the past decades, efforts from researchers, clinicians, and pharmaceutical industries have been focused on both the diagnosis and therapy of rare diseases. However, because of the lack of data and medical records for individual rare diseases and the high cost of orphan drug development, only limited progress has been achieved. In recent years, the rapid development of next-generation sequencing (NGS)-based technologies, as well as the popularity of precision medicine, has facilitated a better understanding of rare diseases and their molecular etiology. As a result, molecular subclassification can be identified within each disease more clearly, significantly improving diagnostic accuracy. However, providing appropriate care for patients with rare diseases is still an enormous challenge. In this review, we provide a brief introduction to the challenges of rare disease research and make suggestions on where and how our efforts should be focused.
Rare diseases are chronic and serious, featuring early onset at birth or in childhood, rapid deterioration and a high mortality rate, which creates a burden on society and public health systems. Of the known rare diseases, 80 percent are genetic in origin, and half of those affected worldwide are children. In China, there are over 10 million rare disease patients, and 70 percent of them are children (Song et al., 2012; Liu et al., 2010).
Besides upper tract urothelial cell carcinomas (UTUCs), a recent study published in Science Translational Medicine has indicated that liver cancer may be associated with exposure to aristolochic acids and similar derivatives (collectively, AA). However, according to our research, this finding needs further verification in a larger number of samples drawn from a wider range of people.
Different psychiatric disorders share genetic relationships and pleiotropic loci to a certain extent. We integrated and analyzed datasets related to major depressive disorder (MDD), bipolar disorder (BIP), and schizophrenia (SCZ) from the Psychiatric Genomics Consortium using multitrait analysis of genome-wide association analysis (MTAG). MTAG significantly increased the effective sample size from 99,773 to 119,754 for MDD, from 909,061 to 1,450,972 for BIP, and from 856,677 to 940,613 for SCZ. After fine-mapping, we discovered 7, 32, and 43 novel lead single nucleotide polymorphisms (SNPs) and 1, 6, and 3 novel causal SNPs for MDD, BIP, and SCZ, respectively. We identified rs8039305 in the FURIN gene as a novel pleiotropic locus across the three disorders. We performed marker analysis of genomic annotation (MAGMA) and Hi-C-coupled MAGMA (H-MAGMA) based gene-set analysis and identified 101 genes associated with the three disorders, which were enriched in the regulation of postsynaptic membranes, postsynaptic membrane dopaminergic synapses, and the Notch signaling pathway. Next, we performed Mendelian randomization analysis using different tools and detected a causal effect of BIP on SCZ. Overall, we demonstrated the use of combined genome-wide association study summary statistics for exploring potential novel mechanisms of the three psychiatric disorders, providing an alternative approach to integrating publicly available summary data.
The most popular CRISPR-SpCas9 system recognizes canonical NGG protospacer adjacent motifs (PAMs). Previously engineered SpCas9 variants, such as Cas9-NG, favor G-rich PAMs in genome editing. In this manuscript, we describe a new plant genome-editing system based on a hybrid iSpyMacCas9 platform that allows for targeted mutagenesis, C to T base editing, and A to G base editing at A-rich PAMs. This study fills a major technology gap in the CRISPR-Cas9 system for editing NAAR PAMs in plants, which greatly expands the targeting scope of CRISPR-Cas9. Finally, our vector systems are fully compatible with Gateway cloning and will work with all existing single-guide RNA expression systems, facilitating easy adoption of the systems by others. We anticipate that more tools, such as prime editing, homology-directed repair, CRISPR interference, and CRISPR activation, will be further developed based on our promising iSpyMacCas9 platform.
Human and mouse orthologs are expected to have similar biological functions; however, many discrepancies have also been reported. We systematically compared human and mouse orthologs in terms of alternative splicing patterns and expression profiles. Human-mouse orthologs are divergent in alternative splicing, as human orthologs generally encode more isoforms than their mouse counterparts. In early embryos, exon skipping is far more common in human orthologs, whereas constitutive exons are more prevalent in mouse orthologs. This may correlate with divergence in the expression of splicing regulators. Orthologous expression similarity differs between embryonic stages and is highest in the morula. Expression differences for orthologous transcription factor genes could play an important role in orthologous expression discordance. We further detected substantial orthologous divergence in differential expression between distinct embryonic stages. Collectively, our study uncovers significant orthologous divergence in multiple aspects, which may result in functional differences and dynamics between human-mouse orthologs during embryonic development.
While precision medicine driven by genome sequencing has revolutionized care for cancers such as lung cancer, its impact on gastric cancer (GC) has been minimal. GC patients are routinely treated with chemotherapy, but only a fraction of them receive clinical benefit. There is an urgent need to develop biomarkers or algorithms to select chemo-sensitive patients or apply targeted therapy. Here, we carried out retrospective analyses of 1,020 formalin-fixed, paraffin-embedded GC surgical resection samples from 5 hospitals and developed a mass spectrometry-based workflow for proteomic subtyping of GC. We identified two proteomic subtypes in the discovery set: the chemo-sensitive group (CSG) and the chemo-insensitive group (CIG). The 5-year overall survival of CSG was significantly improved in patients who had received adjuvant chemotherapy after surgery compared with those who received surgery only (64.2% vs. 49.6%; Cox P-value = 0.002), whereas no such improvement was observed in CIG (50.0% vs. 58.6%; Cox P-value = 0.495). We validated these results in an independent validation set. Further, differential proteome analysis uncovered 9 FDA-approved drugs that may be applicable for targeted therapy of GC. A prospective study is warranted to test these findings for future GC patient care.
Bifunctional RNAs, which possess both protein-coding and noncoding functional properties, remain little explored and poorly understood. Here we systematically explored the characteristics and functions of such human bifunctional RNAs by integrating tandem mass spectrometry and RNA-seq data. We first constructed a pipeline to identify and annotate bifunctional RNAs, leading to the characterization of 132 high-confidence bifunctional RNAs. Our analyses indicate that bifunctional RNAs may be involved in human embryonic development and can be functional in diverse tissues. Moreover, bifunctional RNAs could interact with multiple miRNAs and RNA-binding proteins to exert their corresponding roles. Bifunctional RNAs may also function as competing endogenous RNAs to regulate the expression of many genes by competing for common targeting miRNAs. Finally, somatic mutations in diverse carcinomas may have harmful effects on the corresponding bifunctional RNAs. Collectively, our study not only provides a pipeline for identifying and annotating bifunctional RNAs but also reveals their important gene-regulatory functions.
High-throughput next generation sequencing (NGS) is a shotgun approach applied in a parallel fashion, by which the genome is fragmented and sequenced in small pieces and then analyzed either by aligning to a known reference genome or by de novo assembly without a reference genome. This technology has led researchers to conduct an explosion of sequencing-related projects in multidisciplinary fields of science. However, due to the limitations of sequencing chemistry, the length of sequencing reads and the complexity of genes, it is difficult to determine the sequences of some portions of the human genome, leaving gaps in genomic data that frustrate further analysis. In particular, some complex genes are difficult to sequence or map accurately because they contain high-GC-content and/or low-complexity regions and complicated pseudogenes; examples include the genes encoding xenobiotic metabolizing enzymes and transporters (XMETs). The genetic variants in XMET genes are critical for predicting interindividual variability in drug efficacy, drug safety and susceptibility to environmental toxicity. We summarize and discuss challenges, wet-lab methods, and bioinformatics algorithms in sequencing "complex" XMET genes, which may provide insightful information for the application of NGS technology in toxicogenomics and pharmacogenomics.
RNA sequencing (RNA-seq) has greatly facilitated the exploration of the transcriptome landscape for diverse organisms. However, transcriptome reconstruction is still challenging due to various limitations of current tools and sequencing technologies. Here, we introduce an efficient tool, QuaPra (Quadratic Programming combined with Apriori), for accurate transcriptome assembly and quantification. QuaPra could detect at least 26.5% more low abundance (0.1–1 FPKM) transcripts with over 2.7% increase of sensitivity and precision on simulated data compared to other currently popular tools. Moreover, around one-quarter more known transcripts were correctly assembled by QuaPra than by other assemblers on real sequencing data. QuaPra is freely available at http://www.megabionet.org/QuaPra/.
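The abstract above defines low-abundance transcripts by their FPKM (fragments per kilobase of transcript per million mapped fragments). The sketch below shows the standard FPKM formula and a check against the 0.1–1 FPKM range quoted in the abstract. It is an illustration only and is not taken from QuaPra; the function names `fpkm` and `is_low_abundance` and the example counts are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_low_abundance(value, low=0.1, high=1.0):
    """True if an FPKM value lies in the low-abundance range (default 0.1 to 1)."""
    return low <= value <= high


# Made-up example: 10 fragments on a 2,000 bp transcript in a 10-million-fragment library.
example = fpkm(10, 2000, 10_000_000)  # 0.5 FPKM
low = is_low_abundance(example)
```

Dividing by transcript length and library size makes values comparable across transcripts of different lengths and across samples of different sequencing depth, which is why an abundance threshold can be stated in FPKM at all.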
Dear Editor, An increasing number of single-cell RNA-seq (scRNA-seq) studies have gained deep insights into the gene expression heterogeneity among individual cells through cell type/state identification (Chen et al., 2019). Alternative splicing (AS) enables genes to generate multiple isoforms through the combination of disparate exons, increasing the diversity of the transcriptome (Lee and Rio, 2015).
The past decades have witnessed rapid development in the pediatric field along with the development of medical sciences in China. However, the increasing demand for pediatric healthcare services still cannot be met owing to various reasons. The shortage of pediatric medical resources and the limited access to medical care for pediatric patients have long been priorities of healthcare reform in China.
Funding: Supported by the Department of Defense (Grant No. W81-WXH-10-20139 to LEE as Co-PI); the National Institute of General Medical Sciences of the National Institutes of Health [Grant No. U54-GM104941 (DE-CTR) to ELC]; and the Nemours Alfred I. du Pont Hospital for Children's Biomedical Research Department, United States (to ELC).
Funding: Supported by the National Key Research and Development Program of China (2023YFC2706503); the National Natural Science Foundation of China (32370720); Beihang University & Capital Medical University Plan (BHME-201904); the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, ECNU; Key Laboratory of MEA, Ministry of Education, ECNU; Key Laboratory of Ecology and Energy Saving Study of Dense Habitat (Tongji University), Ministry of Education-Shanghai Tongji Urban Planning & Design Institute Co., Ltd Joint Research Project (KY-2022-LH-A03); Shanghai Tongji Urban Planning & Design Institute Co., Ltd-China Intelligent Urbanization Co-creation Center for High Density Region Research Project (KY-2022-PT-A02); the Irma T. Hirschl and Monique Weill-Caulier Charitable Trusts; Bert L and N Kuggie Vallee Foundation; the WorldQuant Foundation; The Pershing Square Sohn Cancer Research Alliance; the National Institutes of Health (R01AI151059); the National Science Foundation (1840275); and the Alfred P. Sloan Foundation (G-2015-13964).
Funding: Supported by the National Basic Research Program of China (2010CB945401); the National Natural Science Foundation of China (31240038, 31171264, 31071162, 31000590); and the Science and Technology Commission of Shanghai Municipality (11DZ2260300).
Funding: Supplementary information is linked to the online version of the paper on the Cell Research website. Acknowledgments: We are very thankful to Dr Dusan M Jeftinija (Department of Neuroscience & Anatomy, University of Louisville, Kentucky, USA) for his help during the manuscript preparation. This work was supported by grants from the State Key Program of Basic Research of China (2007CB108800, 2009CB918400, 2010CB912102), the Hi-Tech Research and Development Program of China (2006AA02Z313), the National Natural Science Foundation of China (30870575) and the Science and Technology Commission of Shanghai Municipality (06DZ22923).
Funding: Supported by the National High Technology Research and Development Program of China (2015AA020104); the China Human Proteome Project (2014DFB30010); the National Science Foundation of China (31471239, to Leming Shi); and the 111 Project (B13016).
Funding: Supported by the National Basic Research Program of China (Grant Nos. 2010CB945401, 2007CB108800); the National Natural Science Foundation of China (Grant Nos. 30870575, 31071162, 31000590); and the Science and Technology Commission of Shanghai Municipality (Grant No. 11DZ2260300).
Funding: supported by the National Basic Research Program of China (2010CB945401), the National Natural Science Foundation of China (31240038), and the Graduate School of East China Normal University.
Abstract: In the past several years, next-generation sequencing (NGS) technologies have revolutionized our approaches to exploring and depicting the characteristics and functions of the genomes of various species. NGS technologies have been broadly used in diverse fields including genomics (genome sequencing and exome sequencing) [1,2], transcriptomics (RNA-Seq) [3,4] and epigenomics (ChIP-Seq,
Funding: supported by the National High Technology Research and Development Program of China (2015AA020108, 2015AA020104), the National Science Foundation of China (31671377), and the Shanghai 111 Project (B14019).
Abstract: Characterized by their low prevalence, rare diseases are often chronically debilitating or life threatening. Despite their low prevalence, the aggregate number of individuals suffering from a rare disease is estimated to be nearly 400 million worldwide. Over the past decades, efforts from researchers, clinicians, and the pharmaceutical industry have focused on both the diagnosis and therapy of rare diseases. However, because of the lack of data and medical records for individual rare diseases and the high cost of orphan drug development, only limited progress has been achieved. In recent years, the rapid development of next-generation sequencing (NGS)-based technologies, together with the popularity of precision medicine, has facilitated a better understanding of rare diseases and their molecular etiology. As a result, molecular subclassifications can be identified more clearly within each disease, significantly improving diagnostic accuracy. However, providing appropriate care for patients with rare diseases is still an enormous challenge. In this review, we provide a brief introduction to the challenges of rare disease research and make suggestions on where and how our efforts should be focused.
Abstract: Rare diseases are chronic and serious, featuring early onset at birth or in childhood, rapid deterioration, and a high mortality rate, which places a burden on society and public health systems. Of the known rare diseases, 80 percent are genetic in origin, and half of those affected worldwide are children. In China, there are over 10 million rare disease patients, and 70 percent of them are children (Song et al., 2012; Liu et al., 2010).
Funding: supported by the National Key Research and Development Program of China (2015AA020108), the National Natural Science Foundation of China (31671377, 31771460), and the Shanghai 111 Project (B14019).
Abstract: Besides upper tract urothelial cell carcinomas (UTUCs), a recent study published in Science Translational Medicine has indicated that liver cancer may be associated with exposure to aristolochic acids and similar derivatives (collectively, AA). However, according to our research, this study requires further verification with a larger number of samples drawn from a wider range of people.
Funding: supported by the National Key Research and Development Program of China (2015AA020108), the National Natural Science Foundation of China (31671377, 81671326), the Shanghai Municipal Science and Technology Major Project (2017SHZDZX01), the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University) of Ministry of Education, the Fundamental Research Funds for the Central Universities, Beihang University & Capital Medical University Advanced Innovation Center for Big Data-Based Precision Medicine Plan (BHME-201804, BHME-201904), and the Special Fund of the Pediatric Medical Coordinated Development Center of Beijing Hospitals.
Abstract: Different psychiatric disorders share genetic relationships and pleiotropic loci to a certain extent. We integrated and analyzed datasets related to major depressive disorder (MDD), bipolar disorder (BIP), and schizophrenia (SCZ) from the Psychiatric Genomics Consortium using multi-trait analysis of genome-wide association studies (MTAG). MTAG significantly increased the effective sample size from 99,773 to 119,754 for MDD, from 909,061 to 1,450,972 for BIP, and from 856,677 to 940,613 for SCZ. After fine-mapping, we discovered 7, 32, and 43 novel lead single nucleotide polymorphisms (SNPs) and 1, 6, and 3 novel causal SNPs for MDD, BIP, and SCZ, respectively. We identified rs8039305 in the FURIN gene as a novel pleiotropic locus across the three disorders. We performed gene-set analysis based on multi-marker analysis of genomic annotation (MAGMA) and Hi-C-coupled MAGMA (H-MAGMA) and identified 101 genes associated with the three disorders, which were enriched in the regulation of postsynaptic membranes, postsynaptic membrane dopaminergic synapses, and the Notch signaling pathway. Next, we performed Mendelian randomization analysis using different tools and detected a causal effect of BIP on SCZ. Overall, we demonstrated the use of combined genome-wide association study summary statistics for exploring potential novel mechanisms of the three psychiatric disorders, providing an alternative approach to integrating publicly available summary data.
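The effective sample sizes quoted in the MTAG abstract above can be turned into relative gains with simple arithmetic. The following is a minimal sketch only: the function name `relative_gain` and the dictionary layout are illustrative assumptions, not part of the study's code, and the numbers are copied from the abstract as given.

```python
# Illustrative arithmetic only; names are hypothetical, numbers are from the abstract.

def relative_gain(before: int, after: int) -> float:
    """Return the fractional increase from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before

# Effective sample size (before MTAG, after MTAG) per disorder, as quoted in the abstract.
effective_n = {
    "MDD": (99_773, 119_754),
    "BIP": (909_061, 1_450_972),
    "SCZ": (856_677, 940_613),
}

# Fractional gain per disorder.
gains = {trait: relative_gain(b, a) for trait, (b, a) in effective_n.items()}

if __name__ == "__main__":
    for trait, gain in gains.items():
        print(f"{trait}: {gain:.1%} increase in effective sample size")
```

Under these figures the gain is largest for BIP and smallest for SCZ, which is only a restatement of the quoted numbers and not an additional result.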
Funding: supported by startup funds from the University of Maryland, the National Science Foundation Plant Genome Research Program grant (award no. IOS-1758745), and the Biotechnology Risk Assessment Grant Program competitive grant (award no. 2018-33522-28789) from the U.S. Department of Agriculture.
Abstract: The most popular CRISPR-SpCas9 system recognizes canonical NGG protospacer adjacent motifs (PAMs). Previously engineered SpCas9 variants, such as Cas9-NG, favor G-rich PAMs in genome editing. In this manuscript, we describe a new plant genome-editing system based on a hybrid iSpyMacCas9 platform that allows for targeted mutagenesis, C to T base editing, and A to G base editing at A-rich PAMs. This study fills a major technology gap in the CRISPR-Cas9 system for editing NAAR PAMs in plants, which greatly expands the targeting scope of CRISPR-Cas9. Finally, our vector systems are fully compatible with Gateway cloning and will work with all existing single-guide RNA expression systems, facilitating easy adoption of the systems by others. We anticipate that more tools, such as prime editing, homology-directed repair, CRISPR interference, and CRISPR activation, will be further developed based on our promising iSpyMacCas9 platform.
Funding: supported by the China Human Proteomics Project (2014DFB30010), the National High Technology Research and Development Program of China (2015AA020104), the National Natural Science Foundation of China (31071162), and the Graduate School of East China Normal University.
Abstract: Human and mouse orthologs are expected to have similar biological functions; however, many discrepancies have also been reported. We systematically compared human and mouse orthologs in terms of alternative splicing patterns and expression profiles. Human-mouse orthologs are divergent in alternative splicing, as human orthologs generally encode more isoforms than their mouse counterparts. In early embryos, exon skipping is far more common in human orthologs, whereas constitutive exons are more prevalent in mouse orthologs. This may correlate with divergence in the expression of splicing regulators. Orthologous expression similarity differs between embryonic stages and is highest in the morula. Expression differences in orthologous transcription factor genes could play an important role in orthologous expression discordance. We further detected large orthologous divergence in differential expression between distinct embryonic stages. Collectively, our study uncovers significant orthologous divergence in multiple aspects, which may result in functional differences and dynamics between human-mouse orthologs during embryonic development.
Funding: supported by the National Key Research and Development Program of China (2017YFC1308900, 2017YFC0908404, 2018YFA0507503, 2017YFA0505103), the Beijing Municipal Government Key Research and Development Program (Z181100001918020, Z161100002616036), the National Natural Science Foundation of China (31870828, 81972790, 81672319), the Guangdong Provincial Key R&D Programmes (2019B020229002), the Science and Technology Program of Guangzhou (201902020009), the National Key Basic Research Program of China (2014CBA02002), and the National Key Technology Support Program (2015BAI13B07).
Abstract: While precision medicine driven by genome sequencing has revolutionized care for cancers such as lung cancer, its impact on gastric cancer (GC) has been minimal. GC patients are routinely treated with chemotherapy, but only a fraction of them receive a clinical benefit. There is an urgent need to develop biomarkers or algorithms to select chemo-sensitive patients or apply targeted therapy. Here, we carried out retrospective analyses of 1,020 formalin-fixed, paraffin-embedded GC surgical resection samples from 5 hospitals and developed a mass spectrometry-based workflow for proteomic subtyping of GC. We identified two proteomic subtypes in the discovery set: the chemo-sensitive group (CSG) and the chemo-insensitive group (CIG). The 5-year overall survival of CSG was significantly improved in patients who had received adjuvant chemotherapy after surgery compared with those who received surgery only (64.2% vs. 49.6%; Cox P-value = 0.002), whereas no such improvement was observed in CIG (50.0% vs. 58.6%; Cox P-value = 0.495). We validated these results in an independent validation set. Further, differential proteome analysis uncovered 9 FDA-approved drugs that may be applicable for targeted therapy of GC. A prospective study is warranted to test these findings for future GC patient care.
Funding: supported in part by the National High Technology Research and Development Program of China (2015AA020104, 2015AA020108), the China Human Proteomics Project (2014DF30030), and the National Science Foundation of China (31471239).
Abstract: Bifunctional RNAs, which possess both protein-coding and noncoding functional properties, have been little explored and remain poorly understood. Here we systematically explored the characteristics and functions of human bifunctional RNAs by integrating tandem mass spectrometry and RNA-seq data. We first constructed a pipeline to identify and annotate bifunctional RNAs, leading to the characterization of 132 high-confidence bifunctional RNAs. Our analyses indicate that bifunctional RNAs may be involved in human embryonic development and can be functional in diverse tissues. Moreover, bifunctional RNAs could interact with multiple miRNAs and RNA-binding proteins to exert their corresponding roles. Bifunctional RNAs may also function as competing endogenous RNAs that regulate the expression of many genes by competing for shared targeting miRNAs. Finally, somatic mutations in diverse carcinomas may have harmful effects on the corresponding bifunctional RNAs. Collectively, our study not only provides a pipeline for identifying and annotating bifunctional RNAs but also reveals their important gene-regulatory functions.
Funding: supported by the FDA Project (E0765001) and the National Key Research and Development Program of China (2016YFC0902100 to Geng Chen).
Abstract: High-throughput next-generation sequencing (NGS) is a shotgun approach applied in parallel: the genome is fragmented, sequenced in small pieces, and then analyzed either by alignment to a known reference genome or by de novo assembly without a reference genome. This technology has led to an explosion of sequencing-related projects across multidisciplinary fields of science. However, owing to the limitations of sequencing chemistry, the length of sequencing reads, and the complexity of genes, it is difficult to determine the sequences of some portions of the human genome, leaving gaps in genomic data that frustrate further analysis. In particular, some complex genes are difficult to sequence or map accurately because they contain high-GC-content and/or low-complexity regions and have complicated pseudogenes; examples include the genes encoding xenobiotic metabolizing enzymes and transporters (XMETs). Genetic variants in XMET genes are critical for predicting interindividual variability in drug efficacy, drug safety, and susceptibility to environmental toxicity. We summarize and discuss the challenges, wet-lab methods, and bioinformatics algorithms involved in sequencing "complex" XMET genes, which may provide useful information for applying NGS technology in toxicogenomics and pharmacogenomics.
Funding: supported by the National High Technology Research and Development Program of China (2015AA020108), the National Key Research and Development Program of China (2016YFC0902100), the China Human Proteome Project (2014DFB30010, 2014DFB30030), the National Science Foundation of China (31671377, 31401133, 31771460, 91629103), and the Program of Introducing Talents of Discipline to Universities of China (B14019).
Abstract: RNA sequencing (RNA-seq) has greatly facilitated exploration of the transcriptome landscape of diverse organisms. However, transcriptome reconstruction is still challenging owing to various limitations of current tools and sequencing technologies. Here, we introduce an efficient tool, QuaPra (Quadratic Programming combined with Apriori), for accurate transcriptome assembly and quantification. On simulated data, QuaPra could detect at least 26.5% more low-abundance (0.1–1 FPKM) transcripts, with an increase of over 2.7% in sensitivity and precision, compared with other currently popular tools. Moreover, on real sequencing data, around one-quarter more known transcripts were correctly assembled by QuaPra than by other assemblers. QuaPra is freely available at http://www.megabionet.org/QuaPra/.
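The QuaPra abstract above defines low-abundance transcripts by an FPKM range (0.1–1). As a reminder of what that unit measures, here is a minimal sketch of the standard FPKM formula (fragments per kilobase of transcript per million mapped fragments). It is not QuaPra's quantification method, which the abstract describes only as quadratic programming combined with Apriori; the function names `fpkm` and `is_low_abundance` are illustrative assumptions.

```python
# Standard FPKM definition; NOT QuaPra's algorithm. Function names are hypothetical.

def fpkm(fragment_count: float, transcript_length_bp: int,
         total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_low_abundance(value: float, low: float = 0.1, high: float = 1.0) -> bool:
    """True if an FPKM value falls in the 0.1-1 range quoted in the abstract."""
    return low <= value <= high


if __name__ == "__main__":
    # Hypothetical transcript: 10 fragments, 2,000 bp, 20 million mapped fragments.
    value = fpkm(10, 2_000, 20_000_000)
    print(value, is_low_abundance(value))
```

With these made-up inputs, a 2 kb transcript supported by only 10 fragments in a 20-million-fragment library falls inside the low-abundance range, which illustrates why such transcripts are hard to assemble: very few reads cover them.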
Funding: supported by the National Natural Science Foundation of China (31771460, 31671377) and the National Key Research and Development Program of China (2016YFC0902100).
Abstract: Dear Editor, An increasing number of single-cell RNA-seq (scRNA-seq) studies have gained deep insights into gene expression heterogeneity among individual cells through cell type/state identification (Chen et al., 2019). Alternative splicing (AS) enables genes to generate multiple isoforms through the combination of disparate exons, increasing the diversity of the transcriptome (Lee and Rio, 2015).
Abstract: The past decades have witnessed rapid development in the pediatric field along with the development of medical sciences in China. However, the increasing demand for pediatric healthcare services still cannot be met, for various reasons. The shortage of pediatric medical resources and the limited access to medical care for pediatric patients have long been priorities of healthcare reform in China.