Soybean seed isoflavones are secondary metabolites that provide health and nutritional benefits for humans. In our previous study, a stable quantitative trait locus (QTL), qIF05-1, controlling seed isoflavone content in soybean was detected on chromosome (Chr.) 05 in a recombinant inbred line (RIL) population from a cross of Huachun 2 × Wayao. In this study, the parental lines were re-sequenced with deep coverage using the Illumina Solexa System. A total of 63,099 polymorphic long insertions and deletions (InDels) (≥15 bp) were identified between the parents Huachun 2 and Wayao. The InDels were unevenly distributed across the 20 chromosomes of soybean, varying from 1,826 on Chr. 12 to 4,544 on Chr. 18. A total of 10,002 long InDels (15.85% of the total) were located in genic regions, including 1,139 large-effect long InDels that resulted in truncated or elongated protein sequences. In the qIF05-1 region, 68 long InDels were detected between the two parents. Using a progeny recombination experiment and genotype analysis, the qIF05-1 locus was mapped to a 102.2 kb genomic region containing 12 genes. Based on RNA-seq data analysis, genome sequence comparison and functional validation through ectopic expression in Arabidopsis thaliana, Glyma.05G208300 (described as GmEGL3), which encodes a basic helix-loop-helix (bHLH) transcription factor, emerged as the most likely candidate gene in qIF05-1. These long InDels can serve as a complementary type of genetic marker for QTL fine mapping, and they can facilitate genetic studies and molecular-assisted selection breeding in soybean.
Introduction: Omicron is a highly divergent variant of concern (VOC) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It carries a high number of mutations in its spike protein; hence, it is more transmissible in the community through immune evasion mechanisms. Because of mutations within the S gene, most Omicron variants show S gene target failure (SGTF) with some commercially available PCR kits. Such diagnostic features can be used as markers to screen for Omicron. However, whole genome sequencing (WGS) remains the gold standard for confirming novel microorganisms at the genetic level, as similar mutations can also be found in other variants circulating at low frequencies worldwide. This retrospective study aimed to assess the sensitivity of RT-PCR detection of SGTF in comparison with whole genome sequencing for detecting Omicron variants. Methods: We analysed retrospective data from SARS-CoV-2-positive RT-PCR samples for SGTF with the TaqPath COVID-19 RT-PCR Combo Kit (ThermoFisher), combined with sequencing technologies, to study the pattern of SARS-CoV-2 variants that emerged during the third wave at a tertiary care centre in Surat. Results: From 1 December 2021 to the end of February 2022, a total of 321,803 diagnostic RT-PCR tests for SARS-CoV-2 were performed, of which 20,566 were reported positive at our tertiary care centre, an average cumulative positivity of 6.39% over three months. In the month of December 21 samples characterized by SGTF (70/129) were suggestive of infection with the Omicron variant and were identified as Omicron (B.1.1.529 lineage) when sequenced. In the month of January, we analysed a subset of samples (n = 618) with SGTF (24%) and without SGTF (76%) with Ct values [...] Conclusions: During the COVID-19 pandemic, diagnosing infection and identifying the pathogen by sequencing technology took more than 15 days. In contrast, the molecular assay provided rapid identification through the SGTF phenomenon within 5 hours. This strategy helps scientists and health policymakers to isolate and identify clusters quickly, which ultimately decreases transmission of the pathogen in the community.
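The positivity figure in the Results follows directly from the reported counts. A minimal sketch of that arithmetic, assuming positivity is simply positive results divided by tests performed and reading "70/129" as 70 SGTF samples out of 129; the function name `percent` is illustrative and not taken from the study:

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be a positive count")
    return 100.0 * part / whole


# Counts taken from the abstract above.
positivity = percent(20_566, 321_803)  # positive results / RT-PCR tests performed
sgtf_share = percent(70, 129)          # assumed reading: SGTF samples / samples examined

print(f"Cumulative positivity: {positivity:.2f}%")
print(f"SGTF share: {sgtf_share:.2f}%")
```

Rounding the first value to two decimals reproduces the 6.39% stated in the abstract.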
Background: Breed identification is useful in a variety of biological contexts. It usually involves two stages: detection of breed-informative SNPs and breed assignment. Several methods have been proposed for each stage; however, the optimal combination of these methods remains unclear. In this study, using the whole genome sequence data available for 13 cattle breeds from Run 8 of the 1,000 Bull Genomes Project, we compared combinations of three methods (Delta, FST, and In) for breed-informative SNP detection and five machine learning methods (KNN, SVM, RF, NB, and ANN) for breed assignment with respect to different reference population sizes and different numbers of the most breed-informative SNPs. In addition, we evaluated the accuracy of breed identification using SNP chip data of different densities. Results: We found that all combinations performed quite well, with identification accuracies over 95% in all scenarios. However, no single combination performed best and robustly across all scenarios. We proposed integrating the three breed-informative SNP detection methods, named DFI, and integrating three of the machine learning methods (KNN, SVM, and RF), named KSR. The combination of these two integrated methods outperformed the other combinations, with accuracies over 99% in most cases, and was very robust in all scenarios. The accuracies from SNP chip data were only slightly lower than those from sequence data in most cases. Conclusions: The current study showed that the combination of DFI and KSR was the optimal strategy. Using sequence data resulted in higher accuracies than using chip data in most cases; however, the differences were generally small. In view of the cost of genotyping, using chip data is also a good option for breed identification.
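The abstract does not say how the KNN, SVM and RF predictions are integrated into KSR. One common way to combine classifiers is majority voting; the sketch below assumes that scheme purely for illustration. The function name and the breed labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(predictions: Sequence[str]) -> Optional[str]:
    """Return the label predicted by most classifiers, or None when there is no clear winner."""
    if not predictions:
        return None
    # most_common() with no argument sorts labels by descending count.
    ranked = Counter(predictions).most_common()
    # A tie for first place means no single label has the most votes.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical predictions for one animal from KNN, SVM and RF, in that order.
print(majority_vote(["Angus", "Angus", "Holstein"]))
```

Returning None on a tie is a design choice made here so that ambiguous animals can be flagged rather than silently assigned; a real pipeline might instead fall back on the most accurate single classifier.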
Objective: Histological grade, subtype and TNM stage of lung adenocarcinomas are useful predictors of prognosis and survival. The aim of this study was to investigate the relationship between chromosomal instability, morphological subtypes and the grading system used in lung non-mucinous adenocarcinoma (LNMA). Methods: We developed a whole genome copy number variation (WGCNV) scoring system and applied next-generation sequencing to evaluate the CNVs present in 91 LNMA tumor samples. Results: Higher histological grades, aggressive subtypes and more advanced TNM staging were associated with an increased WGCNV score, particularly in CNV regions enriched for tumor suppressor genes and oncogenes. In addition, we demonstrate that 24-chromosome CNV profiling can be performed reliably on specific cell types (<100 cells) isolated from samples by laser capture microdissection. Conclusions: Our findings suggest that the WGCNV scoring system we developed may have potential value as an adjunct test for predicting the prognosis of patients diagnosed with LNMA.
Background: Pork quality can directly affect customers' purchasing tendencies, and meat quality traits have become valuable in modern pork production. However, genetic improvement has been slow because of high phenotyping costs. In this study, whole genome sequence (WGS) data were used to evaluate the prediction accuracy of genomic best linear unbiased prediction (GBLUP) for meat quality in large-scale crossbred commercial pigs. Results: We produced WGS data (18,695,907 SNPs and 2,106,902 INDELs passing quality control) from 1,469 sequenced Duroc × (Landrace × Yorkshire) pigs and developed a reference panel for genomic prediction of meat quality, including meat color score, marbling score, L* (lightness), a* (redness), and b* (yellowness). Prediction accuracy was defined as the Pearson correlation coefficient between adjusted phenotypes and genomic estimated breeding values in the validation population. Using marker panels of different densities derived from WGS data, accuracy differed substantially among meat quality traits, varying from 0.08 to 0.47. MultiBLUP outperformed GBLUP, yielding accuracy increases ranging from 17.39% to 75%. We optimized the marker density and found that medium- and high-density marker panels are beneficial for estimating the heritability of meat quality. Moreover, we conducted genotype imputation from the 50K chip to the WGS level in the same population and found that the average concordance rate exceeded 95%, with r^2 = 0.81. Conclusions: Overall, estimation of heritability for meat quality traits can benefit from the use of WGS data. This study showed the superiority of using WGS data in genomic prediction to genetically improve pork quality.
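The accuracy measure defined in the abstract, the Pearson correlation between adjusted phenotypes and genomic estimated breeding values (GEBVs) in the validation population, can be sketched as follows. The data values are made-up toy numbers and the function and variable names are illustrative, not from the study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical validation animals: adjusted phenotype and predicted GEBV for each.
adjusted_phenotypes = [3.1, 2.8, 3.6, 3.0, 3.4]
gebv = [0.10, -0.05, 0.30, 0.00, 0.15]

accuracy = pearson(adjusted_phenotypes, gebv)
print(f"Prediction accuracy (Pearson r): {accuracy:.3f}")
```

Real validation sets are far larger and noisier than this toy example, which is why the accuracies reported in the abstract (0.08 to 0.47) are much lower than the value this sketch prints.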
BACKGROUND: Natural cerebrolysin (NC), a Chinese herbal drug for the treatment of Alzheimer's disease (AD), induces mesenchymal stem cell (MSC) differentiation into neuron-like cells, with low toxicity. However, the mechanisms underlying the effects of NC on MSCs remain poorly understood. OBJECTIVE: To use a whole genome microarray technique to further investigate the molecular, genetic, and pharmacodynamic mechanisms of NC on MSC gene expression profiles. DESIGN, TIME AND SETTING: A parallel, controlled, in vitro experiment was performed at the First Affiliated Hospital of Shenzhen University, Shenzhen Institute of Integrated Chinese and Western Medicine, China, between September 2006 and October 2008. MATERIALS: NC was provided by the Shenzhen Institute of Integrated Chinese and Western Medicine, China. It was predominantly composed of Renshen (Radix Ginseng), Tianma (Rhizoma Gastrodiae) and Yinxingye (Ginkgo Leaf) and was prepared by conventional water extraction technology. Twelve adult male New Zealand rabbits were included, six of which underwent intragastric administration of NC extract for 1 month to create NC-containing serum. METHODS: Bone marrow was collected from the tibia and femur of Sprague Dawley rats aged 6-8 months. Rat MSCs were isolated and purified by the whole bone marrow adherence method. After in vitro culture, MSCs from passage 4 were treated with NC-containing serum for 48 hours, and total RNA was extracted. Gene expression in MSCs was analyzed using Affymetrix whole genome microarray analysis. MAIN OUTCOME MEASURES: Differentially expressed genes in NC serum-treated MSCs. RESULTS: NC-treated MSCs displayed 46 differentially expressed genes, 22 with upregulated expression (fold change > 2) and 24 with downregulated expression (fold change < -2). The differentially expressed genes participated in neuronal growth, differentiation, and function; cell growth, differentiation, proliferation, and apoptosis; signal transduction; substance/energy metabolism; ion transport; and immune responses. NC treatment changed levels of the transforming growth factor β/bone morphogenetic protein, Hedgehog, Bmp, and Wnt signaling pathways, which regulate nerve cell differentiation, development and function, as well as learning and memory; the Ras and G protein-coupled receptor signaling pathways, which are related to cell growth, proliferation, and apoptosis; and mitogen-activated protein kinase kinase kinase signaling cascades. CONCLUSION: NC can regulate gene expression for many signal transduction pathways related to nerve cell differentiation, development and function, learning and memory, and the regulation of cell growth, differentiation, proliferation, or apoptosis, thereby mediating the genetic effects of NC treatment on AD.
With the development of sequencing technology, insertions-deletions (InDels) have been increasingly reported to be involved in the genetic determination of agronomic traits. However, most studies have focused on the identification and application of short-InDels (1–15 bp) for genetic analysis. The objective of this study was to make in-depth use of long-InDels (>15 bp) for the genetic analysis of important agronomic traits in soybean. A total of 13 573 polymorphic long-InDels were identified between the parents Zhongpin 03-5373 (ZP) and Zhonghuang 13 (ZH); they were unevenly distributed across the 20 chromosomes of soybean, varying from 321 on Chr11 to 1 246 on Chr18. Consistent with the distribution pattern of annotated genes, the average density of long-InDels in arm regions was significantly higher than that in pericentromeric regions at the P=0.01 level. A total of 2 704 (19.9% of the total) long-InDels were located in genic regions, including 319 large-effect long-InDels that resulted in truncated or elongated protein sequences. A previously identified QTL (qPH16) underlying plant height was further analyzed, and 26 out of 35 (74.3%) long-InDel markers located in the qPH16 region showed clear polymorphisms between the parents ZP and ZH. Seven markers, including three long-InDels and four previously reported SNP markers, were used to genotype 242 recombinant inbred lines derived from ZP × ZH. As a result, the qPH16 locus was narrowed from a 960-kb region to a 477.55-kb region containing 65 annotated genes. Therefore, these long-InDels are a genetic resource complementary to SNPs and short-InDels for plant height and can facilitate genetic studies and molecular-assisted selection breeding in soybean.
Objective: To evaluate a single-reaction genome amplification method, multisegment reverse transcription-PCR (M-RTPCR), for its sensitivity in full genome sequencing of influenza A virus and its ability to differentiate mixed-subtype viruses on a next-generation sequencing (NGS) platform. Methods: Virus genome copies were quantified and serially diluted to different titers, followed by amplification with the M-RTPCR method and sequencing on the NGS platform. Furthermore, we manually mixed viruses of two subtypes at different titer ratios and amplified the mixed virus with the M-RTPCR protocol, followed by whole genome sequencing on the NGS platform. We also used clinical samples to test the performance of the method. Results: The M-RTPCR method obtained the complete genome of the test virus at 125 copies/reaction and determined the virus subtype at a titer of 25 copies/reaction. Moreover, the two subtypes in the mixed virus could be discriminated even when their copy numbers differed by 200-fold. The sensitivity of this protocol, determined using viral RNA, was also confirmed with clinical samples containing low-titer virus. Conclusion: M-RTPCR is a robust and sensitive amplification method for whole genome sequencing of influenza A virus on an NGS platform.
[Objective] The aim of this paper was to study the genotype and pathogenicity of a Newcastle disease virus (NDV) isolate and its nucleotide differences from the traditional NDV vaccine strain (La Sota). [Method] A suspected NDV strain was isolated from a chicken farm. The isolate was preliminarily identified by HA and HI tests. A pair of primers was designed based on the partial sequence of the NDV F gene published in GenBank (accession No. JF950510.1). The F gene was amplified by RT-PCR, cloned and sequenced. The sequencing result was compared with the F gene sequences published in GenBank, and a phylogenetic tree was constructed to analyze the genotype. The pathogenicity of the virus was determined by the mean death time (MDT) of chicken embryos, the intracerebral pathogenicity index (ICPI) of one-day-old chicks and the intravenous pathogenicity index (IVPI) of six-week-old chickens. Based on the NDV genome sequence published in GenBank (accession No. JF950510.1), nine pairs of primers were designed to amplify the genome sequence of the isolate, and its structure was analyzed. [Result] The amplified F gene product was about 500 bp in length, and the isolate was an NDV strain of genotype VII. The MDT, ICPI and IVPI were 52.8 h, 1.675 and 2.46, respectively, indicating that the isolate was a virulent strain. Whole genome sequence analysis showed that the full genome length of the isolate was 15 192 bp, 6 nucleotides longer than that of the La Sota strain, and the homology between the two strains was 82.8%. [Conclusion] A virulent NDV strain of genotype VII was isolated, with low nucleotide sequence homology to the La Sota strain.
Wild soybean resources, the progenitors of cultivated soybean with its selected agronomic characters, have rich genetic diversity. Here we used genome re-sequencing technology to analyze genetic variations between the wild soybean 'ED059' and the cultivar 'Tianlong 2'. At the genome level, 3,214,319 and 1,519,765 single nucleotide polymorphisms (SNPs), 553,141 and 314,430 insertion/deletion polymorphisms (InDels), and 471,063 and 334,412 structural variations (SVs) were identified in 'ED0595' and 'Tianlong 2', respectively, relative to the soybean (Glycine max L. Merr) reference genome. Based on the gene annotation of the reference genome, 68,830 (2.14%) and 34,570 (2.27%) non-synonymous SNPs and 8,478 and 4,826 frameshift substitutions were detected in the CDS regions of 'ED0595' and 'Tianlong 2', respectively. 'ED059' harbored many more specific genetic variations in jasmonic acid (JA), salicylic acid (SA) and ethylene (ET) biosynthesis and signaling pathway genes than 'Tianlong 2', suggesting its uniquely strong insect defense activity. This work provides important information for better understanding the soybean genome and for dissecting the genetic basis of important traits such as insect defense in soybean.
Gene duplications provide evolutionary potential for generating novel functions, while polyploidization, or whole genome duplication (WGD), initially doubles the chromosomes and results in hundreds to thousands of retained duplicates. WGDs are strongly supported by evidence commonly found in many species-rich lineages of eukaryotes and are thus considered a major driving force in species diversification. We performed comparative genomic and phylogenomic analyses of 59 public genomes/transcriptomes and 46 newly sequenced transcriptomes covering the major lineages of angiosperms to detect large-scale gene duplication events by surveying tens of thousands of gene family trees. These analyses confirmed most of the previously reported WGDs and provided strong evidence for novel ones in many lineages. The detected WGDs supported a model of exponential gene loss during evolution with an estimated half-life of approximately 21.6 million years, and were correlated with both the emergence of lineages with high degrees of diversification and periods of global climate change. The new datasets and analyses detected many novel WGDs widely spread across angiosperm evolution, uncovered preferential retention of gene functions in essential cellular metabolism, and provided clues to the roles of WGD in promoting angiosperm radiation and enhancing adaptation to environmental changes.
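The exponential gene-loss model mentioned above can be written out as a short sketch. It assumes the standard half-life form, in which the fraction of duplicates retained after t million years is 0.5 ** (t / half_life); the abstract reports only the half-life estimate, so the function name and the example time points are illustrative.

```python
HALF_LIFE_MY = 21.6  # estimated half-life of retained duplicates, from the abstract


def fraction_retained(t_my: float, half_life_my: float = HALF_LIFE_MY) -> float:
    """Fraction of WGD-derived duplicates still retained after t_my million years."""
    if t_my < 0:
        raise ValueError("elapsed time must be non-negative")
    if half_life_my <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (t_my / half_life_my)


# Illustrative time points (million years since the duplication event).
for t in (0.0, 21.6, 43.2, 100.0):
    print(f"{t:6.1f} My: {fraction_retained(t):.3f} of duplicates retained")
```

Under this form, each additional 21.6 million years halves the expected number of retained duplicate pairs.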
Reliable and accurate pre-implantation genetic diagnosis (PGD) of a patient's embryos by next-generation sequencing (NGS) depends on efficient whole genome amplification (WGA) of a representative biopsy sample. However, the performance of current state-of-the-art WGA methods has not been evaluated for sequencing. Using low-template DNA (15 pg) and single cells, we showed that the two PCR-based WGA systems, SurePlex and MALBAC, are superior to the REPLI-g multiple displacement amplification (MDA) WGA system in terms of consistent and reproducible genome coverage and sequence bias across the 24 chromosomes, allowing better normalization of test to reference sequencing data. When copy number variation sequencing (CNV-Seq) was applied to single-cell WGA products derived by either SurePlex or MALBAC amplification, we showed that known disease CNVs in the range of 3-15 Mb could be reliably and accurately detected at the correct genomic positions. These findings indicate that our CNV-Seq pipeline, incorporating either SurePlex or MALBAC as the key initial WGA step, is a powerful methodology for clinical PGD to identify euploid embryos in a patient's cohort for uterine transplantation.
Gene loss following whole genome duplication (WGD) is often biased, with one subgenome retaining more ancestral genes and the other sustaining more gene deletions. While bias toward greater expression of gene copies on one subgenome can explain bias in gene loss, this raises the question of what drives differences in gene expression levels between subgenomes. Differences in chromatin modifications and epigenetic markers between subgenomes are now being identified in several model species, providing an explanation for bias in gene expression between subgenomes. WGDs can be classified into duplications with higher, biased gene loss and bias in gene expression between subgenomes versus those with lower, unbiased rates of gene loss and an absence of detectable bias between subgenomes; however, the originally proposed link between these two classes and whether a WGD results from an allo- or autopolyploid event is inconsistent with recent data from the allopolyploid Capsella bursa-pastoris. The gene balance hypothesis can explain bias in the functional categories of genes retained following WGD, the difference in gene loss rates between unbiased and biased WGDs, and how plant genomes have avoided being overrun with genes encoding dose-sensitive subunits of multiprotein complexes. Comparisons of gene expression patterns between retained transcription factor pairs in maize suggest that the high degree of retention of WGD-derived pairs of transcription factors may instead be explained by the older duplication-degeneration-complementation model.
Sacred lotus (Nelumbo nucifera, or lotus) is an important aquatic plant in horticulture and ecosystems. As a foundation for exploring genomic variation and evolution among different germplasms, we re-sequenced 19 individuals from three cultivated temperate lotus subgroups (rhizome, seed and flower lotus), one wild temperate lotus subgroup (wild lotus), one tropical lotus group (Thai lotus) and an outgroup (Nelumbo lutea). Through genetic diversity and polymorphism analysis based on non-missing SNP sites widely distributed across the whole genome, we confirmed that wild and Thai lotus exhibited greater differentiation and higher genomic diversity than cultivated lotus. Rhizome lotus had the lowest genomic diversity and a closer relationship to wild lotus, whereas the genomes of seed and flower lotus were admixed. Genes involved in energy metabolism and plant immunity evolved rapidly in lotus, reflecting local adaptation. We established that candidate genes in genomic regions with significant differentiation associated with the divergence of temperate and tropical lotus always exhibited highly divergent expression patterns. Together, this study provides a comprehensive and credible interpretation of important patterns of genetic diversity and relationships, gene evolution, and genomic signatures of ecotypic differentiation in sacred lotus.
Preimplantation genetic diagnosis (PGD) refers to a procedure for genetically analyzing embryos prior to implantation, improving the chance of conception for patients at high risk of transmitting specific inherited disorders. This method has been widely used for a large number of genetic disorders since its first successful application in the early 1990s. Polymerase chain reaction (PCR) and fluorescent in situ hybridization (FISH) are the two main methods used in PGD, but they have some inevitable shortcomings that limit the scope of genetic diagnosis. Fortunately, different whole genome amplification (WGA) techniques have been developed to overcome these problems. Sufficient DNA can be amplified, and multiple tasks that need abundant DNA can be performed. Moreover, WGA products can be used as a template for multi-locus and multi-gene analysis during subsequent DNA analysis. In this review, we focus on the currently available WGA techniques and their applications, as well as new technical trends based on WGA products.
An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China, in 2005. The outbreak was atypical in its apparently large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium had changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes, and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes were identical in the four strains, and MLST results showed that the four isolates studied and the reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.
Restriction endonuclease analysis (REA), or restriction fragment length polymorphism (RFLP), was useful in the past for identifying microbial strains and determining their relatedness and putative identities (Tang et al., 1997) and for characterizing and discriminating large numbers of samples inexpensively.
Lycophytes are an ancient clade of non-flowering vascular plants with chromosome numbers that vary from tens to hundreds. They are an excellent study system for examining whole-genome duplications (WGDs), or polyploidization, in spore-dispersed vascular plants. However, a lack of genome sequence data limits the reliable detection of very ancient WGDs, small-scale duplications (SSDs), and recent WGDs. Here, we integrated phylogenomic analysis and the distribution of synonymous substitutions per synonymous site (Ks) in the transcriptomes of 13 species of lycophytes to identify, locate, and date multiple WGDs in the lycophyte family Lycopodiaceae. Additionally, we examined the genus Phlegmariurus for signs of genetic discordance, which can provide valuable insight into the underlying causes of such conflict (e.g., hybridization, incomplete lineage sorting, or horizontal gene transfer). We found strong evidence that two WGD events occurred along the phylogenetic backbone of Lycopodiaceae, one in the common ancestor of extant Phlegmariurus (Lycopodiaceae) approximately 22-23 million years ago (Mya) and the other in the common ancestor of Lycopodiaceae around 206-214 Mya. Interestingly, we found significant genetic discordance in the genus Phlegmariurus, indicating that the genus has a complex evolutionary history. This study provides molecular evidence for multiple WGDs in Lycopodiaceae and offers phylogenetic clues to the evolutionary history of the family.
An efficient rule-based algorithm is presented for haplotype inference from general pedigree genotype data under the assumption of no recombination. This algorithm generalizes previous algorithms to handle cases where some pedigree founders are not genotyped, provided that for each nuclear family at least one parent is genotyped and each non-genotyped founder appears in exactly one nuclear family. This generalization is important because such cases occur frequently in real data: some founders may have passed away, and their genotype data can no longer be collected. The algorithm runs in O(m^3n^3) time, where m is the number of single nucleotide polymorphism (SNP) loci under consideration and n is the number of genotyped members in the pedigree. This zero-recombination haplotyping algorithm is extended to a maximum parsimony haplotyping algorithm that, in one whole genome scan, minimizes the total number of breakpoint sites or, equivalently, the number of maximal zero-recombination chromosomal regions. We show that such a whole genome scan haplotyping algorithm can be implemented in O(m^3n^3) time in a novel incremental fashion, where m denotes the total number of SNP loci along the chromosome.
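As a rough illustration of what a rule-based, zero-recombination inference step can look like, the sketch below phases a child at each SNP locus from a fully genotyped parent-child trio using plain Mendelian rules. This is not the algorithm described in the abstract, which handles general pedigrees and non-genotyped founders; the function names and the 0/1 allele coding are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Genotype = Tuple[int, int]  # unordered pair of alleles at one SNP locus, coded 0/1


def phase_child_locus(child: Genotype, father: Genotype,
                      mother: Genotype) -> Optional[Tuple[int, int]]:
    """Return (paternal_allele, maternal_allele) of the child, or None if ambiguous.

    Assumes the three genotypes are Mendelian-consistent; this is not checked.
    """
    a, b = child
    if a == b:
        # Homozygous child: both haplotypes carry the same allele.
        return (a, a)
    if father[0] == father[1]:
        # Homozygous father can only have transmitted his single allele.
        paternal = father[0]
        return (paternal, b if paternal == a else a)
    if mother[0] == mother[1]:
        # Homozygous mother can only have transmitted her single allele.
        maternal = mother[0]
        return (b if maternal == a else a, maternal)
    # All three heterozygous: the phase cannot be resolved from this locus alone.
    return None


def phase_child(child: List[Genotype], father: List[Genotype],
                mother: List[Genotype]) -> List[Optional[Tuple[int, int]]]:
    """Apply the per-locus rule along the chromosome (no recombination assumed)."""
    return [phase_child_locus(c, f, m) for c, f, m in zip(child, father, mother)]


# Hypothetical trio genotyped at three SNP loci.
print(phase_child([(0, 1), (1, 1), (0, 1)],
                  [(0, 0), (0, 1), (0, 1)],
                  [(0, 1), (1, 1), (0, 1)]))
```

Loci returned as None are the ones that need information from other relatives or from linked loci, which is where pedigree-wide algorithms such as the one summarized above do their real work.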
Funding: This work was supported by the China Agriculture Research System of MOF and MARA (CARS-04-PS12), the Research and Development Program in the Key-Areas of Guangdong Province, China (2020B020220008), and the Guangdong Agricultural Research System, China (2023KJ136-03).
Abstract: Soybean seed isoflavones are secondary metabolites that provide health and nutrition benefits for humans. In our previous study, a stable quantitative trait locus (QTL), qIF05-1, controlling seed isoflavone content in soybean was detected on chromosome (Chr.) 05 in a recombinant inbred line (RIL) population from a cross of Huachun 2 × Wayao. In this study, the parental lines were re-sequenced using the Illumina Solexa System with deep coverage. A total of 63,099 polymorphic long insertions and deletions (InDels) (≥15 bp) were identified between the parents Huachun 2 and Wayao. The InDels were unevenly distributed on the 20 chromosomes of soybean, varying from 1,826 on Chr. 12 to 4,544 on Chr. 18. A total of 10,002 long InDels (15.85% of the total) were located in genic regions, including 1,139 large-effect long InDels that resulted in truncated or elongated protein sequences. In the qIF05-1 region, 68 long InDels were detected between the two parents. Using a progeny recombination experiment and genotype analysis, the qIF05-1 locus was mapped to a 102.2 kb genomic region containing 12 genes. Through RNA-seq data analysis, genome sequence comparison, and functional validation by ectopic expression in Arabidopsis thaliana, Glyma.05G208300 (described as GmEGL3), which encodes a basic helix-loop-helix (bHLH) transcription factor, emerged as the most likely candidate gene in qIF05-1. These long InDels can be used as a complementary genetic method for QTL fine mapping, and they can facilitate genetic studies and molecular-assisted selection breeding in soybean.
Abstract: Introduction: Omicron is a highly divergent variant of concern (VOC) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It carries a high number of mutations in its spike protein; hence, it is more transmissible in the community through immune evasion mechanisms. Because of mutations within the S gene, most Omicron variants show S gene target failure (SGTF) with some commercially available PCR kits. Such diagnostic features can be used as markers to screen for Omicron. However, whole genome sequencing (WGS) is the only gold-standard approach to confirm novel microorganisms at the genetic level, as similar mutations can also be found in other variants that are circulating at low frequencies worldwide. This retrospective study aimed to assess the sensitivity of RT-PCR detection of S gene target failure, in comparison with whole genome sequencing, for detecting Omicron variants. Methods: We analysed retrospective data from SARS-CoV-2-positive RT-PCR samples for S gene target failure (SGTF) with the TaqPath COVID-19 RT-PCR Combo Kit (ThermoFisher), combined with sequencing technologies, to study the pattern of SARS-CoV-2 variants that emerged during the third wave at the tertiary care centre, Surat. Results: From the first day of December 2021 until the end of February 2022, a total of 321,803 diagnostic RT-PCR tests for SARS-CoV-2 were performed, of which 20,566 positive cases were reported at our tertiary care centre, an average cumulative positivity of 6.39% over the three months. In the month of December 2021, samples characterized by SGTF (70/129) were suggestive of infection with the Omicron variant and were identified as Omicron (B.1.1.529 lineage) when sequenced. In the month of January, we analysed a subset of samples (n = 618) with SGTF (24%) and without SGTF (76%) with Ct values … Conclusions: During the COVID-19 pandemic, it took more than 15 days to diagnose infection and identify the pathogen by sequencing technology. In contrast, the molecular assay provided quick identification, within 5 hours, with the help of the SGTF phenomenon. This strategy helps scientists and health policymakers quickly isolate and identify clusters, which ultimately decreases transmission of the pathogen in the community.
Funding: Funded by the National Key Research and Development Program of China (2021YFD1200404), the Yangzhou University Interdisciplinary Research Foundation for Animal Science Discipline of Targeted Support (yzuxk202016), and the Project of Genetic Improvement for Agricultural Species (Dairy Cattle) of Shandong Province (2019LZGC011).
Abstract: Background: Breed identification is useful in a variety of biological contexts. It usually involves two stages: detection of breed-informative SNPs and breed assignment. Several methods have been proposed for both stages, but which combination of these methods is optimal remains unclear. In this study, using the whole genome sequence data available for 13 cattle breeds from Run 8 of the 1,000 Bull Genomes Project, we compared combinations of three methods (Delta, FST, and In) for breed-informative SNP detection and five machine learning methods (KNN, SVM, RF, NB, and ANN) for breed assignment with respect to different reference population sizes and different numbers of the most breed-informative SNPs. In addition, we evaluated the accuracy of breed identification using SNP chip data of different densities. Results: All combinations performed quite well, with identification accuracies over 95% in all scenarios. However, no single combination performed best and robustly across all scenarios. We proposed integrating the three breed-informative SNP detection methods, named DFI, and integrating three of the machine learning methods (KNN, SVM, and RF), named KSR. The combination of these two integrated methods outperformed the other combinations, with accuracies over 99% in most cases, and was very robust in all scenarios. The accuracies from using SNP chip data were only slightly lower than those from using sequence data in most cases. Conclusions: The current study showed that the combination of DFI and KSR was the optimal strategy. Using sequence data resulted in higher accuracies than using chip data in most cases; however, the differences were generally small. In view of the cost of genotyping, using chip data is also a good option for breed identification.
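The two-stage idea in this abstract (rank SNPs by informativeness, then combine several classifiers) can be illustrated with a minimal sketch. It assumes the common two-population definition of Delta as the absolute allele-frequency difference, and it assumes the integrated classifier is a simple majority vote; the abstract does not state how DFI or KSR combine their components, so the function names, the tie rule, and the toy data below are all hypothetical.

```python
from collections import Counter


def delta_scores(freq_a, freq_b):
    """Delta per SNP: absolute allele-frequency difference between two breeds.

    Assumption: the two-population form of Delta; the study covers 13 breeds.
    """
    return [abs(pa - pb) for pa, pb in zip(freq_a, freq_b)]


def top_k_snps(scores, k):
    """Indices of the k SNPs with the highest informativeness score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def majority_vote(labels):
    """Combine breed labels predicted by several classifiers.

    Assumption: if no label is predicted more than once, fall back to the
    first classifier's label (a hypothetical tie rule).
    """
    label, count = Counter(labels).most_common(1)[0]
    return label if count > 1 else labels[0]


# Toy allele frequencies for three SNPs in two breeds (illustrative values only).
scores = delta_scores([0.9, 0.5, 0.1], [0.1, 0.5, 0.2])
selected = top_k_snps(scores, 2)
vote = majority_vote(["Angus", "Holstein", "Angus"])
```

In practice the three label inputs would come from trained KNN, SVM, and RF models restricted to the selected SNPs; the vote only shows how their outputs could be merged.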
Funding: Grants from the Beijing Hospital Key Research Program (121 Research Program, No. BJ2019-195).
Abstract: Objective: Histological grade, subtype and TNM stage of lung adenocarcinomas are useful predictors of prognosis and survival. The aim of the study was to investigate the relationship between chromosomal instability, morphological subtypes and the grading system used in lung non-mucinous adenocarcinoma (LNMA). Methods: We developed a whole genome copy number variation (WGCNV) scoring system and applied next generation sequencing to evaluate CNVs present in 91 LNMA tumor samples. Results: Higher histological grades, aggressive subtypes and more advanced TNM staging were associated with an increased WGCNV score, particularly in CNV regions enriched for tumor suppressor genes and oncogenes. In addition, we demonstrate that 24-chromosome CNV profiling can be performed reliably from specific cell types (<100 cells) isolated by sample laser capture microdissection. Conclusions: Our findings suggest that the WGCNV scoring system we developed may have potential value as an adjunct test for predicting the prognosis of patients diagnosed with LNMA.
Funding: Supported by a Technical Innovation of Crossbred in Swine and Breed High Fertility Lines Project (2022B0202090002), a Local Innovative and Research Teams Project of Guangdong Province (2019BT02N630), a Natural Science Foundation of Guangdong Province project (2018B030313011), and the Innovative Teams of Modern Agriculture and Industry Technology System of Guangdong Province (2022KJ26).
Abstract: Background: Pork quality can directly affect customers' purchase tendency, and meat quality traits have become valuable in modern pork production. However, genetic improvement has been slow because of high phenotyping costs. In this study, whole genome sequence (WGS) data were used to evaluate the prediction accuracy of genomic best linear unbiased prediction (GBLUP) for meat quality in large-scale crossbred commercial pigs. Results: We produced WGS data (18,695,907 SNPs and 2,106,902 INDELs passing quality control) from 1,469 sequenced Duroc × (Landrace × Yorkshire) pigs and developed a reference panel for genomic prediction of meat quality, including meat color score, marbling score, L* (lightness), a* (redness), and b* (yellowness). The prediction accuracy was defined as the Pearson correlation coefficient between adjusted phenotypes and genomic estimated breeding values in the validation population. Using different marker density panels derived from WGS data, accuracy differed substantially among meat quality traits, varying from 0.08 to 0.47. MultiBLUP outperformed GBLUP and yielded accuracy increases ranging from 17.39% to 75%. We optimized the marker density and found that medium- and high-density marker panels are beneficial for the estimation of heritability for meat quality. Moreover, we conducted genotype imputation from the 50K chip to the WGS level in the same population and found that the average concordance rate exceeded 95%, with r^2 = 0.81. Conclusions: Overall, estimation of heritability for meat quality traits can benefit from the use of WGS data. This study showed the superiority of using WGS data in genomic prediction to genetically improve pork quality.
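The accuracy definition used in this abstract (Pearson correlation between adjusted phenotypes and genomic estimated breeding values) and the relative accuracy gain reported for one model over another can be sketched as follows. This is a minimal illustration with hypothetical function names and toy numbers; none of the values below come from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def relative_gain_percent(baseline_accuracy, new_accuracy):
    """Relative accuracy increase of one model over a baseline, in percent."""
    return (new_accuracy - baseline_accuracy) / baseline_accuracy * 100.0


# Toy validation set: adjusted phenotypes and predicted breeding values.
adjusted_phenotypes = [1.0, 2.0, 3.0, 4.0]
gebv = [0.8, 2.1, 2.9, 4.3]
accuracy = pearson(adjusted_phenotypes, gebv)

# Toy comparison: a baseline accuracy of 0.20 against a new accuracy of 0.35.
gain = relative_gain_percent(0.20, 0.35)
```

A relative gain is computed against the baseline accuracy, so the same absolute improvement looks larger for traits whose baseline accuracy is low.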
Funding: Scientific and Technological Foundation of the National Administration of Traditional Chinese Medicine of China, No. 02-03LP41; the Scientific and Technological Key Project of Guangdong Province, No. 2006B35630007.
Abstract: BACKGROUND: Natural cerebrolysin (NC), a Chinese herbal drug for the treatment of Alzheimer's disease (AD), induces mesenchymal stem cell (MSC) differentiation into neuron-like cells, with low toxicity. However, the mechanisms involved in NC effects on MSCs remain poorly understood. OBJECTIVE: We used a whole genome microarray technique to further investigate the molecular, genetic, and pharmacodynamic mechanisms of NC on MSC gene expression profiles. DESIGN, TIME AND SETTING: A parallel, controlled, in vitro experiment was performed at the First Affiliated Hospital of Shenzhen University, Shenzhen Institute of Integrated Chinese and Western Medicine, China, between September 2006 and October 2008. MATERIALS: NC was provided by Shenzhen Institute of Integrated Chinese and Western Medicine, China. It was predominantly composed of Renshen (Radix Ginseng), Tianma (Rhizoma Gastrodiae) and Yinxingye (Ginkgo Leaf) and prepared by conventional water extraction technology. Twelve adult, male, New Zealand rabbits were included, six of which underwent intragastric administration of NC extract for 1 month to create NC-containing serum. METHODS: Bone marrow was collected from the tibia and femur of Sprague Dawley rats, aged 6-8 months. Rat MSCs were isolated and purified by the whole bone marrow adherence method. After in vitro culture, MSCs from passage 4 were treated with NC-containing serum for 48 hours, and total RNA was extracted. Gene expression in MSCs was analyzed using Affymetrix whole genome microarray analysis. MAIN OUTCOME MEASURES: Differentially expressed genes in NC serum-treated MSCs. RESULTS: NC-treated MSCs displayed 46 differentially expressed genes, 22 with upregulated expression (fold change > 2) and 24 with downregulated expression (fold change < -2). Differentially expressed genes participated in neuronal growth, differentiation, and function; cell growth, differentiation, proliferation, and apoptosis; signal transduction; substance/energy metabolism; ion transport; and immune responses. NC treatment changed levels of the transforming growth factor β/bone morphogenetic protein, Hedgehog, Bmp, and Wnt signaling pathways, which regulate nerve cell differentiation, development and function, as well as learning and memory; the Ras and G protein-coupled receptor signaling pathways, which are related to cell growth, proliferation, and apoptosis; and mitogen-activated protein kinase kinase kinase signaling cascades. CONCLUSION: NC can regulate gene expression for many signal transduction pathways related to nerve cell differentiation, development and function, learning and memory function, as well as regulation of cell growth, differentiation, proliferation, or apoptosis to mediate the genetic effects of NC treatment on AD.
Funding: Supported by the National Key R&D Program of China (2016YFD0100201 and 2020YFE0202300) and the Agricultural Science and Technology Innovation Program (ASTIP) of the Chinese Academy of Agricultural Sciences.
Abstract: With the development of sequencing technology, insertions-deletions (InDels) have been increasingly reported to be involved in the genetic determination of agronomic traits. However, most studies have focused on the identification and application of short InDels (1–15 bp) for genetic analysis. The objective of this study was to deploy long InDels (>15 bp) for the genetic analysis of important agronomic traits in soybean. A total of 13 573 polymorphic long InDels were identified between the parents Zhongpin 03-5373 (ZP) and Zhonghuang 13 (ZH); they were unevenly distributed on the 20 chromosomes of soybean, varying from 321 on Chr11 to 1 246 on Chr18. Consistent with the distribution pattern of annotated genes, the average density of long InDels in arm regions was significantly higher than that in pericentromeric regions at the P=0.01 level. A total of 2 704 (19.9% of the total) long InDels were located in genic regions, including 319 large-effect long InDels, which resulted in truncated or elongated protein sequences. A previously identified QTL (qPH16) underlying plant height was further analyzed, and 26 out of 35 (74.3%) long-InDel markers located in the qPH16 region showed clear polymorphisms between the parents ZP and ZH. Seven markers, including three long InDels and four previously reported SNP markers, were used to genotype 242 recombinant inbred lines derived from ZP × ZH. As a result, the qPH16 locus was narrowed from a 960-kb region to a 477.55-kb region containing 65 annotated genes. Therefore, these long InDels are a genetic resource complementary to SNPs and short InDels for plant height and can facilitate genetic studies and molecular-assisted selection breeding in soybean.
Funding: Funded by a project (2014ZX10004002) of the Chinese National Key Program of Mega Infectious Disease of the National 12th Five-Year Plan.
Abstract: Objective: To evaluate a single-reaction genome amplification method, multisegment reverse transcription-PCR (M-RTPCR), for its sensitivity in full genome sequencing of influenza A virus and its ability to differentiate mixed-subtype viruses, using the next generation sequencing (NGS) platform. Methods: Virus genome copies were quantified and serially diluted to different titers, followed by amplification with the M-RTPCR method and sequencing on the NGS platform. Furthermore, we manually mixed two subtype viruses at different titer ratios and amplified the mixed virus with the M-RTPCR protocol, followed by whole genome sequencing on the NGS platform. We also used clinical samples to test the performance of the method. Results: The M-RTPCR method obtained the complete genome of the test virus at 125 copies/reaction and determined the virus subtype at a titer of 25 copies/reaction. Moreover, with this amplification protocol the two subtypes in the mixed virus could be discriminated even when their copy numbers differed by 200-fold. The sensitivity of this protocol, determined using virus RNA, was also confirmed with clinical samples containing low-titer virus. Conclusion: M-RTPCR is a robust and sensitive amplification method for whole genome sequencing of influenza A virus using the NGS platform.
Funding: Supported by the Science and Technology Innovation Leading Talent of Qingdao City (16-8-3-14-zhc).
Abstract: [Objective] The aim of this paper was to study the genotype, pathogenicity and nucleotide differences between Newcastle disease virus (NDV) isolates and the traditional NDV vaccine strain (La Sota). [Method] A suspected NDV strain was isolated from a chicken farm. The isolate was preliminarily identified by HA and HI tests. A pair of primers was designed based on the partial sequence of the NDV F gene published in GenBank (accession No. JF950510.1). The F gene was amplified by RT-PCR, cloned and sequenced. The sequencing result was compared with the F gene sequences published in GenBank, and a phylogenetic tree was constructed to analyze the genotypes. The pathogenicity of the virus was determined by the mean death time (MDT) of chicken embryos, the intracerebral pathogenicity index (ICPI) of one-day-old chicks and the intravenous pathogenicity index (IVPI) of six-week-old chickens, respectively. Based on the NDV genome sequence published in GenBank (accession No. JF950510.1), nine pairs of primers were designed to amplify the genome sequence of the isolate, and its structure was analyzed. [Result] The length of the F gene was about 500 bp, and an NDV strain of genotype VII was isolated. The MDT, ICPI and IVPI were 52.8 h, 1.675 and 2.46, respectively, indicating that the isolate was a virulent strain. Whole genome sequence analysis showed that the full genome length of the isolate was 15 192 bp, 6 nucleotides longer than that of the La Sota strain, and that the homology between the two strains was 82.8%. [Conclusion] A virulent NDV strain of genotype VII was isolated, with low nucleotide sequence homology to the La Sota strain.
Funding: This work was supported by the National Natural Science Foundation of China (31371654, 31522042, 31501655 and 31501334), the National Transgenic Project (grants 2014ZX08004003, 2015ZX08004003 and 2016ZX08004003), the Agricultural Science and Technology Innovation Program, the Breeding Project (SQ2016ZY03002375) and the Wuhan Chenguang Plan (2015070404010193).
Abstract: Wild soybean resources, the progenitor of cultivated soybean with its selected agronomic characters, have rich genetic diversity. Here we used genome re-sequencing technology to analyze genetic variation between the wild soybean 'ED059' and the cultivar 'Tianlong 2'. At the genome level, 3,214,319 and 1,519,765 single nucleotide polymorphisms (SNPs), 553,141 and 314,430 insertion/deletion polymorphisms (InDels), and 471,063 and 334,412 structural variations (SVs) were identified in 'ED0595' and 'Tianlong 2', respectively, relative to the soybean (Glycine max (L.) Merr.) reference genome. Based on the gene annotation of the reference genome, 68,830 (2.14%) and 34,570 (2.27%) non-synonymous SNPs and 8,478 and 4,826 frameshift substitutions were detected in the CDS regions of 'ED0595' and 'Tianlong 2', respectively. 'ED059' harbored many more specific genetic variations in jasmonic acid (JA), salicylic acid (SA) and ethylene (ET) biosynthesis and signaling pathway genes than 'Tianlong 2', which may indicate its uniquely strong insect defense activity. This work provides important information for better understanding the soybean genome and for dissecting the genetic basis of important traits such as insect defense in soybean.
Abstract: Gene duplications provide evolutionary potential for generating novel functions, while polyploidization, or whole genome duplication (WGD), initially doubles the chromosomes and results in hundreds to thousands of retained duplicates. WGDs are strongly supported by evidence commonly found in many species-rich lineages of eukaryotes and thus are considered a major driving force in species diversification. We performed comparative genomic and phylogenomic analyses of 59 public genomes/transcriptomes and 46 newly sequenced transcriptomes covering major lineages of angiosperms to detect large-scale gene duplication events by surveying tens of thousands of gene family trees. These analyses confirmed most of the previously reported WGDs and provided strong evidence for novel ones in many lineages. The detected WGDs supported a model of exponential gene loss during evolution with an estimated half-life of approximately 21.6 million years, and were correlated with both the emergence of lineages with high degrees of diversification and periods of global climate change. The new datasets and analyses detected many novel WGDs widely spread during angiosperm evolution, uncovered preferential retention of gene functions in essential cellular metabolism, and provided clues to the roles of WGD in promoting angiosperm radiation and enhancing adaptation to environmental change.
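The exponential gene-loss model with a half-life of approximately 21.6 million years can be made concrete with a short calculation. This is a minimal sketch assuming simple first-order decay, N(t) = N0 · 2^(−t / T½), applied uniformly to all duplicates; the function name and the single shared rate are illustrative assumptions, not details given in the abstract.

```python
HALF_LIFE_MY = 21.6  # estimated half-life from the abstract, in million years


def retained_fraction(time_my, half_life_my=HALF_LIFE_MY):
    """Fraction of WGD-derived duplicates expected to remain after time_my.

    Assumption: first-order (exponential) decay with one shared half-life.
    """
    return 0.5 ** (time_my / half_life_my)


# Expected retention after one, two, and five half-lives.
after_one = retained_fraction(21.6)
after_two = retained_fraction(43.2)
after_five = retained_fraction(108.0)
```

Under this assumption, about 3% of duplicates would remain after five half-lives (108 million years), which shows why very ancient WGDs leave few retained pairs to detect.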
Funding: Supported by grants awarded to Yuanqing Yao by the Key Program of the "Twelfth Five-Year Plan" of the People's Liberation Army (No. BWS11J058) and the National High Technology Research and Development Program (SS2015AA020402).
Abstract: Reliable and accurate pre-implantation genetic diagnosis (PGD) of a patient's embryos by next-generation sequencing (NGS) depends on efficient whole genome amplification (WGA) of a representative biopsy sample. However, the performance of the current state-of-the-art WGA methods has not been evaluated for sequencing. Using low-template DNA (15 pg) and single cells, we showed that the two PCR-based WGA systems, SurePlex and MALBAC, are superior to the REPLI-g WGA multiple displacement amplification (MDA) system in terms of consistent and reproducible genome coverage and sequence bias across the 24 chromosomes, allowing better normalization of test to reference sequencing data. When copy number variation sequencing (CNV-Seq) was applied to single-cell WGA products derived by either SurePlex or MALBAC amplification, known disease CNVs in the range of 3-15 Mb could be reliably and accurately detected at the correct genomic positions. These findings indicate that our CNV-Seq pipeline, incorporating either SurePlex or MALBAC as the key initial WGA step, is a powerful methodology for clinical PGD to identify euploid embryos in a patient's cohort for uterine transplantation.
Abstract: Gene loss following whole genome duplication (WGD) is often biased, with one subgenome retaining more ancestral genes and the other sustaining more gene deletions. While bias toward greater expression of gene copies on one subgenome can explain bias in gene loss, this raises the question of what drives differences in gene expression levels between subgenomes. Differences in chromatin modifications and epigenetic markers between subgenomes are now being identified in several model species, providing an explanation for bias in gene expression between subgenomes. WGDs can be classified into duplications with higher, biased gene loss and bias in gene expression between subgenomes versus those with lower, unbiased rates of gene loss and an absence of detectable bias between subgenomes; however, the originally proposed link between these two classes and whether a WGD results from an allo- or autopolyploid event is inconsistent with recent data from the allopolyploid Capsella bursa-pastoris. The gene balance hypothesis can explain bias in the functional categories of genes retained following WGD, the difference in gene loss rates between unbiased and biased WGDs, and how plant genomes have avoided being overrun with genes encoding dose-sensitive subunits of multiprotein complexes. Comparisons of gene expression patterns between retained transcription factor pairs in maize suggest that the high degree of retention of WGD-derived pairs of transcription factors may instead be explained by the older duplication-degeneration-complementation model.
Funding: Financially supported by the National Natural Science Foundation of China (No. 31471899) and the Knowledge Innovation Project of the Chinese Academy of Sciences (No. Y455421Z02).
Abstract: Sacred lotus (Nelumbo nucifera, or lotus) is an important aquatic plant in horticulture and ecosystems. As a foundation for exploring genomic variation and evolution among different germplasms, we re-sequenced 19 individuals from three cultivated temperate lotus subgroups (rhizome, seed and flower lotus), one wild temperate lotus subgroup (wild lotus), one tropical lotus group (Thai lotus) and an outgroup (Nelumbo lutea). Through genetic diversity and polymorphism analysis using non-missing SNP sites widely distributed across the whole genome, we confirmed that wild and Thai lotus exhibited greater differentiation and higher genomic diversity than cultivated lotus. Rhizome lotus had the lowest genomic diversity and a closer relationship to wild lotus, whereas the genomes of seed and flower lotus were admixed. Genes involved in energy metabolism and plant immunity evolved rapidly in lotus, reflecting local adaptation. We established that candidate genes in genomic regions with significant differentiation associated with temperate and tropical lotus divergence always exhibited highly divergent expression patterns. Together, this study provides a comprehensive and credible interpretation of important patterns of genetic diversity and relationships, gene evolution, and genomic signatures of ecotypic differentiation in sacred lotus.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2007CB948104) and the Natural Science Foundation of Zhejiang Province, China (No. Z207021).
Abstract: Preimplantation genetic diagnosis (PGD) refers to a procedure for genetically analyzing embryos prior to implantation, improving the chance of conception for patients at high risk of transmitting specific inherited disorders. This method has been widely used for a large number of genetic disorders since the first successful application in the early 1990s. Polymerase chain reaction (PCR) and fluorescence in situ hybridization (FISH) are the two main methods in PGD, but both have inevitable shortcomings that limit the scope of genetic diagnosis. Fortunately, different whole genome amplification (WGA) techniques have been developed to overcome these problems. With WGA, sufficient DNA can be amplified and multiple tasks that need abundant DNA can be performed. Moreover, WGA products can serve as a template for multi-locus and multi-gene analysis during the subsequent DNA analysis. In this review, we focus on the currently available WGA techniques and their applications, as well as new technical trends arising from WGA products.
Funding: Supported by the National Key Technologies Research and Development Program (Grant No. 2005BA711A09) from the Ministry of Science and Technology of China.
Abstract: An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China, in 2005. The outbreak was atypical in its apparently large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes were identical in the four strains, and MLST results showed that the four isolates studied and the reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.
Funding: Supported by the National Natural Science Foundation of China (31570155 and 31370199); the "Young Top-notch Talents" of the Guangdong Province Special Support Program (2014); the Excellent Young Teacher Training Plan of Guangdong Province (Yq2013039); the Guangzhou Healthcare Collaborative Innovation Major Project (201400000002); funding from the China Scholarship Council (CSC No. 201508440056) as a Visiting Scholar (2015-2016); and a summer research grant to D.S. from the Office of the Vice President for Research at George Mason University.
Abstract: In the past, restriction endonuclease analysis (REA), or restriction fragment length polymorphism (RFLP), was useful for identifying and determining the relatedness and putative identities of microbial strains (Tang et al., 1997) and for characterizing and discriminating large numbers of samples inexpensively.
Funding: Funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA19050404) and the National Natural Science Foundation of China (No. 31800174).
Abstract: Lycophytes are an ancient clade of non-flowering vascular plants with chromosome numbers that vary from tens to hundreds. They are an excellent study system for examining whole-genome duplications (WGDs), or polyploidization, in spore-dispersed vascular plants. However, a lack of genome sequence data limits the reliable detection of very ancient WGDs, small-scale duplications (SSDs), and recent WGDs. Here, we integrated phylogenomic analysis and the distribution of synonymous substitutions per synonymous site (Ks) of the transcriptomes of 13 species of lycophytes to identify, locate, and date multiple WGDs in the lycophyte family Lycopodiaceae. Additionally, we examined the genus Phlegmariurus for signs of genetic discordance, which can provide valuable insight into the underlying causes of such conflict (e.g., hybridization, incomplete lineage sorting, or horizontal gene transfer). We found strong evidence that two WGD events occurred along the phylogenetic backbone of Lycopodiaceae, one in the common ancestor of extant Phlegmariurus (Lycopodiaceae) approximately 22-23 million years ago (Mya) and the other in the common ancestor of Lycopodiaceae around 206-214 Mya. Interestingly, we found significant genetic discordance in the genus Phlegmariurus, indicating that the genus has a complex evolutionary history. This study provides molecular evidence for multiple WGDs in Lycopodiaceae and offers phylogenetic clues to the evolutionary history of Lycopodiaceae.
Funding: Supported in part by AARI, AICML, ALIDF, iCORE, and NSERC.
Abstract: An efficient rule-based algorithm is presented for haplotype inference from general pedigree genotype data, under the assumption of no recombination. This algorithm generalizes previous algorithms to handle cases where some pedigree founders are not genotyped, provided that for each nuclear family at least one parent is genotyped and each non-genotyped founder appears in exactly one nuclear family. This generalization is important because such cases occur frequently in real data: some founders may have passed away, and their genotype data can no longer be collected. The algorithm runs in O(m^3n^3) time, where m is the number of single nucleotide polymorphism (SNP) loci under consideration and n is the number of genotyped members in the pedigree. This zero-recombination haplotyping algorithm is extended to a maximum parsimony haplotyping algorithm that, in one whole genome scan, minimizes the total number of breakpoint sites or, equivalently, the number of maximal zero-recombination chromosomal regions. We show that such a whole genome scan haplotyping algorithm can be implemented in O(m^3n^3) time in a novel incremental fashion; here m denotes the total number of SNP loci along the chromosome.
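To give a feel for what a rule-based haplotyping step looks like, here is a minimal sketch of the simplest Mendelian rule in a fully genotyped parent-child trio. It is not the algorithm from this abstract, which handles general pedigrees and non-genotyped founders; it only phases each locus independently, and the function name, the 0/1 allele coding, and the toy genotypes are hypothetical.

```python
def phase_child(father, mother, child):
    """Phase a child's genotypes locus by locus from both parents' genotypes.

    Each genotype is a list of (allele, allele) pairs, one pair per SNP locus.
    Returns (paternal_haplotype, maternal_haplotype); an entry is None where
    this single rule cannot resolve the phase (all three heterozygous).
    """
    paternal, maternal = [], []
    for f, m, c in zip(father, mother, child):
        a, b = c
        # Keep every assignment in which the father could have transmitted
        # the first allele and the mother the second.
        options = {(x, y) for x, y in ((a, b), (b, a)) if x in f and y in m}
        if not options:
            raise ValueError("Mendelian inconsistency at a locus")
        if len(options) == 1:
            x, y = next(iter(options))
            paternal.append(x)
            maternal.append(y)
        else:
            paternal.append(None)
            maternal.append(None)
    return paternal, maternal


# Toy trio with three SNP loci, alleles coded 0/1.
father = [(0, 0), (0, 1), (0, 1)]
mother = [(0, 1), (1, 1), (0, 1)]
child = [(0, 1), (0, 1), (0, 1)]
pat_hap, mat_hap = phase_child(father, mother, child)
```

The third locus stays unresolved because all three individuals are heterozygous; resolving such loci is where linkage across loci under a zero-recombination assumption would come into play.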