Abstract: Freight train automatic train operation (ATO) must jointly consider punctuality, energy saving, and safety. To address this problem, a model of the freight train operation process is established on the basis of safety constraints and freight train dynamics. An algorithm that combines an elite competition strategy with multi-objective particle swarm optimization is introduced: winning particles are obtained through competition between two elite particles and guide the update of the other particles, which balances the convergence and distribution of the multi-objective particle swarm optimization. Performance comparison experiments verify the superiority of the proposed algorithm, and simulation experiments on an actual line verify the feasibility of the model and the effectiveness of the algorithm.
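The elite competition step described above can be illustrated with a minimal sketch. The abstract does not state how the two elites compete, so the criterion used here (the elite whose objective vector makes the smaller angle with the particle's objective vector wins) and the velocity rule are assumptions, and the names `angle`, `pick_winner`, and `update_particle` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def angle(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in radians between two objective vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return math.pi / 2  # undefined direction: treat as orthogonal
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def pick_winner(particle_obj: Sequence[float],
                elite_a_obj: Sequence[float],
                elite_b_obj: Sequence[float]) -> int:
    """Return 0 if elite A wins the pairwise competition, 1 if elite B wins.

    Assumed criterion: the elite closer in angle to the particle wins.
    """
    if angle(particle_obj, elite_a_obj) <= angle(particle_obj, elite_b_obj):
        return 0
    return 1


def update_particle(position: Sequence[float],
                    velocity: Sequence[float],
                    winner_position: Sequence[float],
                    rng: random.Random) -> Tuple[List[float], List[float]]:
    """Move a particle toward the winning elite (assumed update rule)."""
    new_velocity = [
        rng.random() * v + rng.random() * (w - x)
        for x, v, w in zip(position, velocity, winner_position)
    ]
    new_position = [x + nv for x, nv in zip(position, new_velocity)]
    return new_position, new_velocity


# Usage: two elites compete for one particle, then the winner guides it.
rng = random.Random(0)
elite_positions = [[1.0, 1.0], [3.0, 0.5]]
elite_objectives = [[0.2, 0.9], [0.9, 0.1]]
winner = pick_winner([0.8, 0.3], elite_objectives[0], elite_objectives[1])
new_x, new_v = update_particle([2.0, 2.0], [0.0, 0.0], elite_positions[winner], rng)
```

Because each particle follows the winner of a fresh pairwise competition rather than a single global best, different particles are pulled toward different parts of the elite set, which is one way to trade convergence against spread.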
Abstract: Load identification is one of the major technical difficulties of non-intrusive composite monitoring. The binary V-I trajectory image reflects the characteristics of the original V-I trajectory to a large extent, so it is widely used in load identification. However, using the binary V-I trajectory feature alone has certain limitations. To improve identification accuracy, this paper adds a power feature to the binary V-I trajectory feature: the power feature is mapped to a third dimension, turning the initial binary V-I trajectory into a new 3D feature. To reduce the impact of imbalanced samples on load identification, the SVM SMOTE algorithm is used to balance the samples. A convolutional neural network is then used to extract the new 3D feature and perform load identification. The results on the public PLAID data set indicate that the new 3D feature has better observability and that the proposed model achieves higher identification performance than other classification models.
Funding: Supported by the National Natural Science Foundation of China (No. 92049201).
Abstract: OSTEOKINES IN INTER-ORGAN COMMUNICATIONS. The concept of “organic wholeness” permeates all fields of traditional Chinese medicine and is also widely accepted by modern medicine. The wealth of knowledge on inter-organ communication generated in recent decades has provided solid evidence that human physiology and pathophysiology involve systematic interactions among multiple organs or tissues. Advancing our understanding of the inter-organ communication process is therefore believed to provide a new perspective for treating various human diseases.
Funding: This work was supported by the 111 Project (Project No. B13003), the Innovation Promotion Association CAS (2016098), and the National Natural Science Foundation of China (81201700) to D.Z.
Abstract: Hepatitis B virus (HBV), one of the well-known oncogenic DNA viruses, is the leading cause of hepatocellular carcinoma (HCC). In infected hepatocytes, HBV DNA can be integrated into the host genome through an insertional mutagenesis process that induces tumorigenesis. Dissection of the genomic features surrounding integration sites will deepen our understanding of the mechanisms underlying integration. Moreover, the quantity and biological activity of integration sites may reflect the DNA damage within affected cells or the potential survival benefits they may confer. The well-known human genomic features include repeat elements, particular regions (such as telomeres), and frequently interrupted genes (e.g., telomerase reverse transcriptase [TERT], lysine methyltransferase 2B [KMT2B], cyclin E1 [CCNE1], and cyclin A2 [CCNA2]). Consequently, distinct genomic features within diverse integrations differentiate their biological functions. Meanwhile, accumulating evidence has shown that viral proteins produced by integrants may cause cell damage even after HBV replication has been suppressed. The integration-derived gene products can also serve as tumor markers, promoting the development of novel therapeutic strategies for HCC. Viral integrants can be a single copy or multiple copies of different fragments with complicated rearrangements, which warrants elucidation of the whole viral integrant arrangement in future studies. All of these considerations underlie an urgent need to develop novel methodology and technology for sequence characterization and functional evaluation of integration events in chronic hepatitis B-associated disease progression by monitoring both host genomic features and viral integrants. This endeavor may also serve as a promising solution for evaluating the risk of tumorigenesis and as a companion diagnostic for designing therapeutic strategies targeting integration-related disease complications.
Funding: Supported by the CAS Key Laboratory of Pathogenic Microbiology and Immunology of China (Grant No. 2009CASPMI-007) to DZ and the National Natural Science Foundation of China (Grant No. 81201700) to DZ.
Abstract: The Streptococcus suis serotype 2 (S. suis 2) isolates 05ZYH33 and 98HAH33 have caused severe human infections in China. Using strand-specific RNA-seq analysis, we compared the in vitro transcriptomes of these two Chinese isolates with that of a reference strain (P1/7). In the 89K genomic island that is specific to these Chinese isolates, a toxin–antitoxin system showed relatively high levels of transcription among the S. suis strains. The known virulence factors with high transcriptional activity in these two highly pathogenic strains are mainly involved in adhesion, biofilm formation, hemolysis, and the synthesis and transport of the outer membrane protein. Furthermore, our analysis of novel transcripts identified over 50 protein-coding genes, one of which encodes a toxin protein. We also predicted over 30 small RNAs (sRNAs) in each strain, most of which are involved in riboswitches. We found that six sRNA candidates related to bacterial virulence, including cspA and rli38, are specific to the Chinese isolates. These results provide insight into the factors responsible for the difference in virulence among S. suis 2 isolates.
Funding: Financially supported by the National Natural Science Foundation of China (51802208, 51920105005, 21902113, 51821002, and 91833303); the Natural Science Foundation of Jiangsu Province (BK20200101); the Collaborative Innovation Centre of Suzhou Nano Science & Technology; the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); and the Natural Sciences and Engineering Council of Canada.
Funding: Supported by the Key-Areas Research and Development Program of Guangdong Province (2020B020220004); the Youth Innovation Promotion Association, Chinese Academy of Sciences (2017399); the Science and Technology Program of Guangzhou (202002030097); the Hong Kong Research Grants Council Area of Excellence Scheme (AoE/M-403/16); the ECS (27204518); and the TRS of the HKSAR government (T21-705/20-N).
Abstract: The Human Genome Project opened an era of (epi)genomic research and also provided a platform for the development of new sequencing technologies. During and after the project, several sequencing technologies have continued to dominate the nucleic acid sequencing market. Currently, Illumina (short-read), PacBio (long-read), and Oxford Nanopore (long-read) are the most popular sequencing technologies. Unlike PacBio and the popular short-read sequencers before it, which, as examples of second-generation (so-called Next-Generation Sequencing) platforms, need to synthesize while sequencing, nanopore technology directly sequences native DNA and RNA molecules. Nanopore sequencing therefore avoids converting mRNA into cDNA, which not only allows extremely long native DNA and full-length RNA molecules to be sequenced but also documents the modifications made to those native DNA or RNA bases. In this review of direct DNA sequencing and direct RNA sequencing using Oxford Nanopore technology, we focus on their development and application achievements and discuss their challenges and future perspectives. We also address the problems researchers may encounter when applying these approaches to their own research topics, and how to resolve them.
Funding: Supported by grants from the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB13020500); the National Natural Science Foundation of China (NSFC) (Grant Nos. 91131905, 31471199, and 91631304); the Key Research Program of the Chinese Academy of Sciences (Grant No. KJZD-EW-L14 to CZ); the NSFC (Grant Nos. 31440057 and 31701081 to WC); the 111 Project (Grant No. B13003 to WC and DZ); and the Innovation Promotion Association of the Chinese Academy of Sciences (Grant Nos. 2016098 to DZ and 2019103 to AC).
Abstract: Postzygotic mutations are acquired in normal tissues throughout an individual’s lifetime and hold clues for identifying mutagenic factors. Here, we investigated the postzygotic mutation spectra of healthy individuals using optimized ultra-deep exome sequencing of time-series samples from the same volunteer as well as samples from different individuals. In blood, sperm, and muscle cells, we resolved three common types of mutational signatures. Signatures A and B represent clock-like mutational processes, and polymorphisms in epigenetic regulation genes influence the proportion of signature B in mutation profiles. Notably, signature C, characterized by C>T transitions at GpCpN sites, tends to be a feature of diverse normal tissues. Mutations of this type are likely to occur early during embryonic development, as supported by their relatively high allelic frequencies, their presence in multiple tissues, and their decrease in occurrence with age. Almost none of the public tumor datasets feature this signature, except for 19.6% of samples of clear cell renal cell carcinoma with increased activation of the hypoxia-inducible factor 1 (HIF-1) signaling pathway. Moreover, the accumulation of signature C in the mutation profile was accelerated in a human embryonic stem cell line with drug-induced activation of HIF-1α. Thus, embryonic hypoxia may explain this novel signature across multiple normal tissues. Our study suggests that a hypoxic condition at an early stage of embryonic development is a crucial factor inducing C>T transitions at GpCpN sites, and that individuals’ genetic background may also influence their postzygotic mutation profiles.
Funding: Supported by the National Natural Science Foundation of China (Nos. 30871348, 30700470, 30890030, and 30890031) and the Educational Department of Jiangxi Province (No. GJJ10303).
Abstract: The chromosome 17q21.31 inversion is a common 900-kb structural polymorphism found primarily in European populations. Although genetic flux within the inversion region was assumed to be considerably suppressed, the details of genetic exchange between the H1 (non-inverted sequence) and H2 (inverted sequence) haplotypes of this inversion remain unclear. Here we describe a refined map of genetic exchanges between pairs of gene arrangements within the 17q21.31 region. Using HapMap phase II data for 1,546 single nucleotide polymorphisms, we deduced 96 H1 and 24 H2 haplotypes in European samples by neighbor-joining tree reconstruction. Furthermore, we identified 15 and 26 candidate tracts with reciprocal and non-reciprocal genetic exchanges, respectively. In all 15 regions harboring reciprocal exchange, haplotypes reconstructed by clone sequencing did not support these exchange events, suggesting that such signals of exchange between two sister chromosomes in certain heterozygous individuals were caused by phasing errors. On the other hand, finished clone sequencing across 4 of the 26 tracts with non-reciprocal genetic flux confirmed that this kind of genetic exchange was caused by gene conversion. In summary, because crossover between pairs of gene arrangements has been considerably suppressed, gene conversion might be the most important mechanism for genetic exchange at 17q21.31.
Funding: Supported by the National Natural Science Foundation of China (Nos. 30871348 and 30700470), the Educational Department of Jiangxi Province (No. GJJ10303), and the National Key Laboratory Specific Fund (No. 2060204).
Abstract: Population genomic approaches, which take advantage of high-throughput genotyping, are powerful yet costly methods for scanning for selective sweeps. DNA-pooling strategies have been widely used in association studies because they are a cost-effective alternative to large-scale individual genotyping. Here, we performed an SNP-MaP (single nucleotide polymorphism microarrays and pooling) analysis using samples from Eurasia to evaluate the efficiency of the pooling strategy in genome-wide scans for selection. By simulating allelotype data, we first demonstrated that a boxplot of average heterozygosity (HET) is a promising method for detecting strong selective sweeps under a moderate level of pooling error. Based on this, we used a sliding-window analysis of HET to detect large contiguous regions (LCRs) putatively under selective sweeps in the Eurasian datasets. This survey identified 63 LCRs in a European population. These signals were further supported by the integrated haplotype score (iHS) test using HapMap II data. We also confirmed European-specific signatures of positive selection in several previously identified genes (KEL, TRPV5, TRPV6, EPHB6). In summary, our results not only show the high credibility of the SNP-MaP strategy in scanning for selective sweeps but also provide insight into population differentiation.
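A sliding-window scan of average heterozygosity with a boxplot-style outlier rule might look like the sketch below. The abstract gives no formulas, so the per-SNP heterozygosity 2p(1 − p) computed from a pooled allele frequency p, the window size and step, and the lower Tukey fence (Q1 − 1.5 × IQR) are all assumptions; the function names are hypothetical.

```python
from statistics import quantiles
from typing import List, Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def window_het(freqs: Sequence[float], size: int, step: int) -> List[float]:
    """Average heterozygosity in windows of `size` SNPs, advanced by `step`."""
    hets = [heterozygosity(p) for p in freqs]
    return [
        sum(hets[i:i + size]) / size
        for i in range(0, len(hets) - size + 1, step)
    ]


def low_outlier_windows(values: Sequence[float]) -> List[int]:
    """Indices of windows below the lower boxplot fence (Q1 - 1.5 * IQR)."""
    q1, _, q3 = quantiles(values, n=4)
    fence = q1 - 1.5 * (q3 - q1)
    return [i for i, v in enumerate(values) if v < fence]


# Usage: a run of fixed SNPs (p = 1.0) in the middle mimics a sweep region.
freqs = [0.5] * 20 + [1.0] * 4 + [0.5] * 20
windows = window_het(freqs, size=4, step=4)
candidates = low_outlier_windows(windows)
```

Only unusually low windows are flagged because a selective sweep is expected to reduce diversity around the selected site; merging adjacent flagged windows into contiguous regions would be a separate step.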
Funding: Supported by the National Natural Science Foundation of China (No. 30225017).
Abstract: Transmission distortion (TD) is a significant departure from Mendelian predictions in the transmission of genes or chromosomes to offspring. While many biological processes have been implicated, there is still much to be understood about TD in humans. Here we present our findings from a genome-wide scan for evidence of TD using haplotype data of 60 trio families from the International HapMap Project. Fisher's exact test was applied to assess the extent of TD at 629,958 SNPs across the autosomes. Based on the empirical distribution of P_Fisher and further permutation tests, we identified 1,205 outlier loci and 224 candidate genes with TD. Using the PANTHER gene ontology database, we found 19 categories of biological processes enriched for candidate genes. In particular, the “protein phosphorylation” category contained the largest number of candidates in both HapMap samples. Further analysis uncovered an intriguing non-synonymous change in PPP1R12B, a gene related to protein phosphorylation, which appears to influence allele transmission from male parents in the YRI (Yoruba from Ibadan, Nigeria) population. Our findings also indicate an ethnicity-related property of TD signatures in HapMap samples and provide new clues for our understanding of TD in humans.
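For readers unfamiliar with the test named above, here is a minimal pure-Python sketch of a two-sided Fisher's exact test on a 2×2 table. How the authors built the table for each SNP is not stated in the abstract; the layout in the usage example (transmitted versus untransmitted counts for two alleles) and the counts themselves are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one.
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Usage (hypothetical counts for one SNP):
# rows = allele A, allele B; columns = transmitted, untransmitted.
p_value = fisher_exact_two_sided(38, 22, 22, 38)
```

With hundreds of thousands of SNPs tested, raw p-values like this one need a genome-wide threshold, which is why the abstract refers to the empirical distribution of the statistic and to permutation tests.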
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 81171184, 31060139, and 30871384); the Natural Science Foundation of Jiangxi Province (Grant No. 20114BAB215019); the Department of Health of Jiangxi Province (Grant No. 20111209); and the Technology Pedestal and Society Development Project of Jiangxi Province (Grant Nos. 2010BSA09500 and 20111BBG70009-1).
Abstract: For transcriptome analysis, it is critical to precisely define all the transcripts across the whole genome. A growing number of digital gene expression (DGE) scans have indicated the presence of a large number of novel transcripts in addition to the known gene models. However, almost all of these studies still depend crucially on existing annotation. Here, we present Gene2DGE, a Perl software package for gene model renewal with DGE data. We applied Gene2DGE to the mouse blastomere transcriptome and defined 98,532 read-enriched regions (RERs) by read clustering, with each base pair supported by more than four reads. Taking advantage of this ab initio method, we refined 2,104 exonic regions (4% of a total of 48,501 annotated transcribed regions) with remarkable extension into un-annotated regions (>50 bp). For the 5% of uniquely mapped reads falling within intron regions, we identified 13,291 additional possible exons. As a result, we renewed 4,788 gene models, which account for 39% of a total of 12,277 transcribed genes. Furthermore, we identified 12,613 intergenic RERs, suggesting the possible presence of novel genes outside the existing gene models. In this study, therefore, we have developed a suitable tool for renewing known gene models by ab initio prediction in transcriptome dissection. The Gene2DGE package is freely available at http://bighapmap.big.ac.cn/.
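The read-enriched region criterion (every base supported by more than four reads) can be sketched as a scan over per-base coverage. This is an illustration in Python, not the Gene2DGE implementation, which is a Perl package; the function name and the toy coverage vector are hypothetical.

```python
from typing import List, Tuple


def read_enriched_regions(coverage: List[int],
                          min_reads: int = 4) -> List[Tuple[int, int]]:
    """Return half-open (start, end) runs where every base has depth > min_reads."""
    regions: List[Tuple[int, int]] = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth > min_reads:
            if start is None:
                start = pos  # open a new region
        elif start is not None:
            regions.append((start, pos))  # close the current region
            start = None
    if start is not None:
        regions.append((start, len(coverage)))  # region runs to the end
    return regions


# Usage: per-base read depth along a toy 10-bp segment.
coverage = [0, 2, 5, 6, 7, 4, 0, 9, 9, 1]
regions = read_enriched_regions(coverage)
```

Regions found this way do not depend on any annotation, so they can then be compared with annotated exons to find extensions, intronic candidates, and intergenic regions.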
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA26040304, XDB38050200); the National Natural Science Foundation of China (82102182, 31961133010, 31970805); Jinfeng Laboratory, Chongqing, China (jfkyjf202203001); and the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2017139).
Abstract: In recent years, more and more single-cell technologies have been developed. A vast amount of single-cell omics data has been generated by large projects such as the Human Cell Atlas, the Mouse Cell Atlas, the Mouse RNA Atlas, the Mouse ATAC Atlas, and the Plant Cell Atlas. Based on these single-cell big data, thousands of bioinformatics algorithms have been developed for quality control, clustering, cell-type annotation, developmental inference, cell-cell transition, cell-cell interaction, and spatial analysis. With powerful experimental single-cell technology and state-of-the-art big data analysis methods based on artificial intelligence, the molecular landscape at the single-cell level can be revealed.