With the development and falling cost of sequencing techniques, scientists can now conduct deeper research in phylogenomics. The most important step in phylogenomic analysis is orthology prediction, because phylogenetic reconstruction requires that the genes being compared are orthologous. Here we briefly review the concept of orthology and the different methods for orthology prediction. We also provide recommendations for selecting an orthology prediction method.
Essential proteins are those necessary for the survival or reproduction of a species. Discovering them is fundamental to understanding the minimal requirements for cellular life and is also valuable for disease research and drug design. With the development of high-throughput techniques, a large number of protein-protein interactions (PPIs) can be used to identify essential proteins at the network level. Although a series of network-based computational methods have been proposed, improving prediction precision remains a challenge because of the many false positives in PPI networks. In this paper, we propose a new method, GOS, that identifies essential proteins by integrating Gene expression, Orthology, and Subcellular localization information. Gene expression and subcellular localization information are used to determine whether a neighbor in the PPI network is reliable, and only reliable neighbors are considered when we analyze the topological characteristics of a protein in the network. We also analyze the orthologous attributes of each protein to reflect its conservation, and use a random walk model to integrate a protein's topological characteristics with its orthology. Experimental results on the yeast PPI network show that GOS outperforms ten existing methods: DC, BC, CC, SC, EC, IC, NC, PeC, ION, and CSC.
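The two ideas in the GOS abstract (dropping unreliable neighbors, then letting a random walk combine topology with orthology) can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the reliability rule (co-expressed and sharing a compartment), the restart weight `alpha`, the use of a random walk with restart, and the toy proteins and scores.

```python
def filter_reliable_edges(edges, is_reliable):
    """Keep only the edges that pass a caller-supplied reliability test."""
    return [(u, v) for u, v in edges if is_reliable(u, v)]


def random_walk_with_restart(nodes, edges, prior, alpha=0.5, iters=200):
    """Iterate score = alpha * (mass from neighbors) + (1 - alpha) * prior.

    `prior` plays the role of the orthology score; it is normalized to sum to 1.
    """
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    total = sum(prior.get(n, 0.0) for n in nodes) or 1.0
    p0 = {n: prior.get(n, 0.0) / total for n in nodes}
    score = dict(p0)
    for _ in range(iters):
        new_score = {}
        for n in nodes:
            # Each neighbor m spreads its current score evenly over its own edges.
            inflow = sum(score[m] / len(neighbors[m]) for m in neighbors[n])
            new_score[n] = alpha * inflow + (1 - alpha) * p0[n]
        score = new_score
    return score


# Hypothetical toy data (not from the paper).
proteins = ["A", "B", "C", "D"]
raw_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
coexpressed = {frozenset(e) for e in raw_edges}            # all pairs co-expressed
compartment = {"A": "nucleus", "B": "nucleus", "C": "nucleus", "D": "membrane"}
orthology = {"A": 0.9, "B": 0.5, "C": 0.5, "D": 0.1}       # made-up conservation scores


def reliable(u, v):
    # Assumed rule: an interaction counts only if both proteins are
    # co-expressed and annotated to the same compartment.
    return frozenset((u, v)) in coexpressed and compartment[u] == compartment[v]


edges = filter_reliable_edges(raw_edges, reliable)
scores = random_walk_with_restart(proteins, edges, orthology)
ranking = sorted(proteins, key=scores.get, reverse=True)
```

In this toy network the A-D interaction is discarded because the two proteins sit in different compartments, so D ends up ranked last and A, which is well connected and has the highest orthology score, ranks first.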
Rice and wheat provide nearly 40% of human calorie and protein requirements. They share a common ancestor and belong to the Poaceae (grass) family. Characterizing their genetic homology is crucial for developing new cultivars with enhanced traits. Several wheat genes and gene families have been characterized based on their rice orthologs. Rice–wheat orthology can identify genetic regions that regulate similar traits in both crops. Rice–wheat comparative genomics can identify candidate wheat genes in a genomic region identified by association or QTL mapping, deduce their putative functions and biochemical pathways, and support the development of molecular markers for marker-assisted breeding. Knowledge of gene homology facilitates the transfer between crops of genes or genomic regions associated with desirable traits by genetic engineering, gene editing, or wide crossing.
Background: Photosystem II (PSII) is an intricate pigment-protein assembly featuring extrinsic and intrinsic polypeptides within the photosynthetic membrane. The low-molecular-weight transmembrane protein PsbX has been identified in PSII, where it is associated with the oxygen-evolving complex. Expression of the PsbX gene is regulated by light. PsbX's central role involves the regulation of PSII, facilitating the binding of quinone molecules to the Qb (PsbA) site, and it also plays a crucial role in optimizing the efficiency of photosynthesis. Despite these insights, a comprehensive understanding of the PsbX gene's functions has remained elusive. Results: In this study, we identified ten PsbX genes in Gossypium hirsutum L. Phylogenetic analysis showed that 40 genes from nine species were classified into one clade. Sequence logos exhibited substantial conservation at multiple sites across the N and C termini among all Gossypium species. Furthermore, Ka/Ks ratios of orthologous and paralogous gene pairs indicated that cotton PsbX genes have been subject to both positive and purifying selection pressure, which might have limited their divergence, and that the family arose through whole-genome and segmental duplication. The expression patterns of GhPsbX genes varied across specific tissues. Moreover, the expression of GhPsbX genes could potentially be regulated in response to salt, intense light, and drought stresses. Therefore, GhPsbX genes may play a significant role in the modulation of photosynthesis under adverse abiotic conditions. Conclusion: We examined the structure and function of the PsbX gene family in cotton for the first time, using comparative genomics and systems biology approaches. The PsbX gene family appears to play a vital role during the growth and development of cotton under stress conditions. Collectively, the results of this study provide basic information for unveiling the molecular and physiological functions of PsbX genes in cotton plants.
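The Ka/Ks reading used in the PsbX abstract follows a standard convention: a ratio above 1 suggests positive selection, a ratio below 1 suggests purifying selection, and a ratio near 1 is consistent with neutral evolution. A minimal sketch of that rule is given below. The tolerance band `tol` and the example values are assumptions chosen for illustration; they are not taken from the study.

```python
def classify_selection(ka, ks, tol=0.05):
    """Return (Ka/Ks ratio, label) for one gene pair.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    tol: assumed half-width of the band around 1 treated as neutral
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio > 1 + tol:
        label = "positive"
    elif ratio < 1 - tol:
        label = "purifying"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical gene pairs with made-up (Ka, Ks) values.
pairs = {
    "pair_1": (0.02, 0.10),
    "pair_2": (0.30, 0.10),
    "pair_3": (0.10, 0.10),
}
labels = {name: classify_selection(ka, ks)[1] for name, (ka, ks) in pairs.items()}
```

In practice Ka and Ks would come from a codon-aware alignment of each orthologous or paralogous pair; the sketch covers only the interpretation step.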
The multidrug and toxic compound extrusion (MATE) family plays pivotal roles in detoxification in plants, but no information has been available for this gene family in melon (Cucumis melo L.), limiting our understanding of its functions in melon acclimation to stressful environments. In this study, a total of 39 MATEs (CmMATE1–CmMATE39) were identified in the melon genome; these were unevenly distributed across all chromosomes, with the most on Chromosome 1. Based on their orthologous relationships with those from Arabidopsis, rice, and sorghum, melon MATEs were clustered into three subfamilies, Clades Ⅰ, Ⅱ, and Ⅲ, containing 23, 9, and 7 members, respectively. Exon number varied among CmMATEs, with CmMATE8 harboring the most. Gene ontology (GO) term and cis-regulatory element (CRE) analyses pointed to potential roles of CmMATEs in both the regulation of melon development and acclimation to various abiotic and biotic stressors. RNA-seq and quantitative real-time PCR (qRT-PCR) results demonstrated that under normal growth conditions, CmMATEs were expressed in a tissue- and development-specific manner, while their abundance varied in a stress-dependent manner when melon plants were exposed to unfavorable environmental conditions. Altogether, these observations could expand our knowledge of the plant MATE family and benefit future functional genomics analysis of CmMATEs.
AIM: To identify alkyl hydroperoxide reductase subunit C (AhpC) homologs in Bacillus subtilis (B. subtilis) and to characterize their structural and biochemical properties. AhpC is responsible for the detoxification of reactive oxygen species in bacteria. METHODS: Two AhpC homologs (AhpC_H1 and AhpC_H2) were identified by searching the B. subtilis database; these were then cloned and expressed in Escherichia coli. AhpC mutants carrying substitutions of catalytically important Cys residues (C37S, C47S, C166S, C37/47S, C37/166S, C47/166S, and C37/47/166S for AhpC_H1; C52S, C169S, and C52/169S for AhpC_H2) were obtained by site-directed mutagenesis and purified, and their structure-function relationships were analyzed. The B. subtilis ahpC genes were disrupted by the short flanking homology method, and the phenotypes of the resulting AhpC-deficient bacteria were examined. RESULTS: Comparative characterization of the AhpC homologs indicates that AhpC_H1 contains an extra C37, which forms a disulfide bond with the peroxidatic C47, and behaves like an atypical 2-Cys AhpC, while AhpC_H2 functions like a typical 2-Cys AhpC. Tryptic digestion analysis demonstrated the presence of an intramolecular Cys37-Cys47 linkage, which could be reduced by thioredoxin, resulting in the association of the dimer into higher-molecular-mass complexes. Peroxidase activity analysis of Cys→Ser mutants indicated that three Cys residues were involved in catalysis. AhpC_H1 was resistant to inactivation by peroxide substrates but had lower activity at physiological H2O2 concentrations than AhpC_H2, suggesting that in B. subtilis the two enzymes may be physiologically functional at different substrate concentrations. Exposure to organic peroxides induced AhpC_H1 expression, while AhpC_H1-deficient mutants exhibited growth retardation in the stationary phase, suggesting a role for AhpC_H1 as an antioxidant scavenger of lipid hydroperoxides and a stress-response factor in B. subtilis. CONCLUSION: AhpC_H1, a novel atypical 2-Cys AhpC, is functionally distinct from AhpC_H2, a typical 2-Cys AhpC.
Microorganisms play an important role in the growth of Pyropia haitanensis. To understand the structural and functional diversity of the microbial community of P. haitanensis (PH40), we used metagenomic analysis to explore the associated metabolic pathway network through Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation, together with carbohydrate-active enzymes (CAZymes). DNA was first extracted from gametophytes of P. haitanensis, followed by library construction, sequencing, preprocessing of sequencing data, taxonomy assignment, gene prediction, and functional annotation. The results show that the predominant microorganisms of P. haitanensis were bacteria (98.98%); the most abundant phylum was Proteobacteria (54.64%), followed by Bacteroidetes (37.92%). Erythrobacter (3.98%) and Hyunsoonleella jejuensis (1.56%) were the most abundant bacterial genus and species, respectively. COG annotation demonstrated that genes associated with microbial metabolism formed the predominant category. Metabolic pathway annotation showed that the ABC transport system and the two-component system were the main pathways in the microbial community. The plant growth hormone biosynthesis pathway and multi-vitamin biosynthesis functional units (modules) were the other important pathways. CAZyme annotation revealed that starch might be an important carbon source for the microorganisms. Glycosyl transferase family 2 (GT2) and glycosyl transferase family 3 (GT3) were the most abundant families in the glycosyl transferase superfamily. Six metagenome-assembled genomes containing enzymes involved in the biosynthesis of cobalamin (vitamin B12) and indole-3-acetic acid were obtained by binning. They were confirmed to belong to Rhodobacterales and Rhizobiales, respectively. Our findings provide comprehensive insights into the microbial community of Pyropia.
Understanding the genetic architecture of individual taxa of medical importance is the first step in designing disease prevention strategies. To understand the genetic details and evolutionary perspective of the model malaria vector Anopheles gambiae, and to use the information in other species of local importance, we scanned the published X-chromosome sequence to characterize it in detail and to determine the evolutionary status of its genes. The telocentric X chromosome contains 106 genes of known function and 982 novel genes. Most of both the known and the novel genes contain introns. The known genes are strongly biased towards fewer introns; about half of them have only one or two introns. Genes of extreme size (either long or short) were the most prevalent (58% short and 23% large). Statistically significant positive correlations were obtained between gene length and intron length, and between intron number and intron length, indicating that introns contribute to the overall size of the known X-chromosome genes in An. gambiae. We compared each individual gene of An. gambiae with 33 other taxa for which whole-genome sequence information is available. In general, the mosquito Aedes aegypti was found to be genetically closest to An. gambiae and the yeast Saccharomyces cerevisiae the most distant. Further, only about a quarter of the known X-chromosome genes were unique to An. gambiae, and most have orthologs in different taxa. A phylogenetic tree was constructed based on a single gene found to be highly orthologous across all 34 taxa. Evolutionary relationships among 13 different taxa were inferred, which corroborate previous and present findings on genetic relationships across various taxa.
Bacillus thuringiensis (Bt) parasporal crystal proteins are well known to be toxic to certain insects and to have cytocidal activity against various human cancer cells. Bt serovar coreanensis ST7, which is non-pathogenic to insects and non-hemolytic, has an important parasporin, PS4Aa1 (Cry45Aa1), with potential toxicity to human cancer cells. In this study, we report the features of the complete genome sequence of ST7 and the functional classification of its proteins by clusters of orthologous groups. The evolution of ST7 was also studied. The ST7 genome data will contribute to a better understanding of genomic diversity and evolution and will enrich the Bt genome database.
The advances accelerated by next-generation sequencing and long-read sequencing technologies continue to provide an impetus for plant phylogenetic study. In the past decade, a large number of phylogenetic studies adopting hundreds to thousands of genes across a wealth of clades have emerged and ushered plant phylogenetics and evolution into a new era. At the same time, researchers urgently need a roadmap for choosing among the different approaches when designing their phylogenomic research. This review focuses on the utility of genomic data (from organelle genomes to both reduced-representation sequencing and whole-genome sequencing) in phylogenetic and evolutionary investigations, describes the baseline methodology of experimental and analytical procedures, and summarizes recent progress in flowering plant phylogenomics at the ordinal, familial, tribal, and lower levels. We also discuss the challenges, such as the adverse impact on orthology inference and phylogenetic reconstruction arising from systematic errors, and underlying biological factors, such as whole-genome duplication, hybridization/introgression, and incomplete lineage sorting, which together suggest that a bifurcating tree may not be the best model for the tree of life. Finally, we discuss promising avenues for future plant phylogenomic studies.
Protein evolution proceeds by two distinct processes: 1) individual mutation and selection for adaptive mutations, and 2) rearrangement of entire domains within proteins into novel combinations, producing new protein families that combine functional properties in ways that previously did not exist. Domain rearrangement poses a challenge to sequence alignment-based search methods, such as BLAST, in predicting homology, since the methodology implicitly assumes that related proteins primarily differ from each other by individual mutations. Moreover, there is ample evidence that the evolutionary process has used (and continues to use) domains as building blocks; it therefore seems fitting to use computational, domain-based methods to reconstruct that process. A challenge and opportunity for computational biology is how to use knowledge of evolutionary domain recombination to characterize families of proteins whose evolutionary history includes such recombination, to discover novel proteins, and to infer protein-protein interactions. In this paper we review techniques and databases that exploit our growing knowledge of "horizontal" protein evolution and suggest possible areas of future development. We illustrate the power of domain-based methods and the possible directions of future development with a case history in progress that aims to facilitate a particular approach to understanding microbial pathogenicity.
Small RNAs (sRNAs) are non-coding transcripts that exert their functions in the cell directly. Identifying sRNAs is difficult because they lack clear sequence and structural biases. Most sRNAs are identified within genus-specific intergenic regions in related genomes. However, several of these regions remain unannotated because of a lack of sequence homology and/or of powerful statistical identification tools. A computational engine has been built to search within intergenic regions to identify and roughly annotate new putative sRNA regions in Enterobacteriaceae genomes. It uses experimentally known sRNA data and their flanking genes/KEGG Orthology (KO) numbers as templates to identify similar sRNA regions in related query genomes. The search engine can not only locate putative intergenic regions for specific sRNAs but also locate conserved, shuffled, or deleted gene clusters in query genomes. Because it uses KO terms to locate functionally important regions such as sRNAs, any further KO number assignment to additional genes will increase its sensitivity. The PsRNA server identifies putative sRNA regions using the information retrieved from the sRNA of interest. The computing engine is available online at http://bioserver1.physics.iisc.ernet.in/psrna/ and http://bicmku.in:8081/psrna/.
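The template idea in the sRNA abstract, using the KO numbers of the genes flanking a known sRNA to find the matching intergenic region in a query genome, can be sketched as follows. This is an illustration only, not the PsRNA server's implementation: the `Gene` record, the function name, the coordinates, and the placeholder KO labels (`KO_A`, `KO_B`, `KO_X`) are all hypothetical.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    start: int          # 1-based, inclusive
    end: int            # 1-based, inclusive
    ko: Optional[str]   # KO assignment, or None if the gene has none


def candidate_regions(genes, left_ko, right_ko):
    """Return intergenic gaps whose two flanking genes carry the template KOs.

    The flanking pair is matched in either order, so a region still matches
    if it is inverted in the query genome.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    hits = []
    for upstream, downstream in zip(ordered, ordered[1:]):
        same_flanks = {upstream.ko, downstream.ko} == {left_ko, right_ko}
        has_gap = downstream.start > upstream.end + 1
        if same_flanks and has_gap:
            hits.append((upstream.end + 1, downstream.start - 1,
                         upstream.name, downstream.name))
    return hits


# Hypothetical query genome with placeholder KO labels.
genome = [
    Gene("g1", 1, 100, "KO_X"),
    Gene("g2", 150, 400, "KO_A"),
    Gene("g3", 600, 900, "KO_B"),
    Gene("g4", 950, 1200, None),
]
regions = candidate_regions(genome, "KO_B", "KO_A")
```

Here the gap between `g2` and `g3` (positions 401 to 599) is reported as the candidate region. Genes without a KO assignment, such as `g4`, can never match, which is in line with the abstract's remark that assigning KO numbers to more genes increases sensitivity.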
Various active components have been extracted from the root of Polygonum cuspidatum. However, the genetic basis for their activity is virtually unknown. In this study, 25600002 short reads (2.3 Gb) of the P. cuspidatum root transcriptome were obtained via Illumina HiSeq 2000 sequencing. A total of 86418 unigenes were assembled de novo and annotated. Twelve, 18, 60, and 54 unigenes were mapped to the mevalonic acid (MVA), methyl-D-erythritol 4-phosphate (MEP), shikimate, and resveratrol biosynthesis pathways, respectively, suggesting that they are involved in the biosynthesis of the pharmaceutically important anthraquinone and resveratrol. Eighteen potential UDP-glycosyltransferase unigenes were identified as the candidates most likely to be involved in the biosynthesis of glycosides of secondary metabolites. Identification of relevant genes could be important in eventually increasing the yields of the medicinally useful constituents of the P. cuspidatum root. From the previously published transcriptome data of 19 non-model plant taxa, 1127 shared orthologs were identified and characterized. This information will be very useful for future functional, phylogenetic, and evolutionary studies of these plants.
Glycosyltransferases (GTs; EC 2.4.x.y) constitute a large group of enzymes that form glycosidic bonds through transfer of sugars from activated donor molecules to acceptor molecules. GTs are critical to the biosynthesis of plant cell walls, among other diverse functions. Based on the Carbohydrate-Active enZymes (CAZy) database and sequence similarity searches, we have identified 609 potential GT genes (loci) corresponding to 769 transcripts (gene models) in rice (Oryza sativa), the reference monocotyledonous species. Using domain composition and sequence similarity, these rice GTs were classified into 40 CAZy families plus an additional unknown class. We found that two Pfam domains of unknown function, PF04577 and PF04646, are associated with GT families GT61 and GT31, respectively. To facilitate functional analysis of this important and large gene family, we created a phylogenomic Rice GT Database (http://ricephylogenomics.ucdavis.edu/cellwalls/gt/). Through the database, several classes of functional genomic data, including mutant lines and gene expression data, can be displayed for each rice GT in the context of a phylogenetic tree, allowing for comparative analysis both within and between GT families. Comprehensive digital expression analysis of public gene expression data revealed that most (~80%) rice GTs are expressed. Based on analysis with Inparanoid, we identified 282 'rice-diverged' GTs that lack orthologs in sequenced dicots (Arabidopsis thaliana, Populus trichocarpa, Medicago truncatula, and Ricinus communis). Combining these analyses, we identified 33 rice-diverged GT genes (45 gene models) that are highly expressed in above-ground, vegetative tissues. From the literature and this analysis, 21 of these loci are excellent targets for functional examination toward understanding and manipulating grass cell wall qualities. Study of the remainder may reveal aspects of hormone and protein metabolism that are critical for rice biology. This list of 33 genes and the Rice GT Database will facilitate the study of GTs and cell wall synthesis in rice and other plants.
RNA secondary structure plays a critical role in gene regulation. Rice (Oryza sativa) is one of the most important food crops in the world. However, RNA structure in rice has scarcely been studied. Here, we have successfully generated in vivo Structure-seq libraries in rice. We found that the structural flexibility of mRNAs might be associated with the dynamics of biological function. Higher N6-methyladenosine (m6A) modification tends to be associated with less RNA structure in the 3' UTR, whereas GC content does not significantly affect in vivo mRNA structure, so that efficient biological processes such as translation are maintained. Comparative analysis of the RNA structurome between rice and Arabidopsis revealed that higher GC content does not lead to stronger structure or less RNA structural flexibility. Moreover, we found a weak correlation between sequence and structure conservation of the orthologs between rice and Arabidopsis. The conservation and divergence of both sequence and in vivo RNA structure correspond to diverse and specific biological processes. Our results indicate that RNA secondary structure might offer a layer of selection separate from sequence between monocots and dicots. Therefore, our study implies that RNA structure evolves differently in various biological processes to maintain robustness in development and adaptational flexibility during angiosperm evolution.
Wheat (Triticum aestivum L.) is a staple food crop consumed by more than 30% of the world population. Nitrogen (N) fertilizer has been applied broadly in agricultural practice to improve wheat yield to meet the growing demand for food production. However, excessive N fertilizer application and the low N use efficiency (NUE) of modern wheat varieties are aggravating environmental pollution and ecological deterioration. Under nitrogen-limiting conditions, the rice (Oryza sativa) abnormal cytokinin response1 repressor1 (are1) mutant exhibits increased NUE, delayed senescence, and consequently increased grain yield. However, the function of the ARE1 ortholog in wheat remains unknown. Here, we isolated and characterized three TaARE1 homoeologs from the elite Chinese winter wheat cultivar ZhengMai 7698. We then used CRISPR/Cas9-mediated targeted mutagenesis to generate a series of transgene-free mutant lines with either partial or triple-null taare1 alleles. All transgene-free mutant lines showed enhanced tolerance to N starvation, as well as delayed senescence and increased grain yield in field conditions. In particular, the AABBdd and aabbDD mutant lines exhibited delayed senescence and significantly increased grain yield without growth defects compared to the wild-type control. Together, our results underscore the potential to manipulate ARE1 orthologs through gene editing for breeding high-yield wheat as well as other cereal crops with improved NUE.
Magnetoreception is essential for magnetic orientation in animal migration. The molecular basis for magnetoreception has recently been elucidated in the fruit fly as complexes between the magnetic receptor (MagR) and its ligand cryptochrome (Cry). MagR and Cry are present throughout the animal kingdom. However, it is unknown whether they perform a conserved role in diverse animals. Here we report the identification and expression of zebrafish MagR and Cry homologs towards understanding their roles in lower vertebrates. A single magr gene and 7 cry genes are present in the zebrafish genome. Zebrafish has four cry1 genes (cry1aa, cry1ab, cry1ba, and cry1bb) homologous to human CRY1, a single ortholog of human CRY2, and 2 cry-like genes (cry4 and cry5). By RT-PCR, magr exhibited a high level of ubiquitous RNA expression in embryos and adult organs, whereas cry genes displayed differential embryonic and adult expression. Importantly, magr depletion did not produce apparent abnormalities in organogenesis. Taken together, magr and cry2 each exist as a single-copy gene, whereas cry1 exists as multiple gene duplicates in zebrafish. Our results suggest that magr may play a dispensable role in organogenesis and point to the possibility of generating magr mutants for analyzing its role in zebrafish.
Funding (orthology prediction review): Supported by the National Natural Science Foundation of China (J0930005, 30970350, 31071959).
Funding (GOS essential protein study): Supported by the National Natural Science Foundation for Excellent Young Scholars (No. 61622213) and the National Natural Science Foundation of China (Nos. 61232001, 61370024, and 61428209).
Funding (cotton PsbX study): Supported by the National Natural Science Foundation of China (32060466) and the Chinese Academy of Agricultural Sciences.
Abstract: Background: Photosystem II (PSII) is an intricate assembly of pigment-protein complexes, featuring extrinsic and intrinsic polypeptides within the photosynthetic membrane. The low-molecular-weight transmembrane protein PsbX has been identified in PSII, where it is associated with the oxygen-evolving complex. Expression of the PsbX gene is regulated by light. PsbX's central role involves the regulation of PSII by facilitating the binding of quinone molecules to the Qb (PsbA) site, and it additionally plays a crucial role in optimizing the efficiency of photosynthesis. Despite these insights, a comprehensive understanding of the functions of the PsbX gene has remained elusive. Results: In this study, we identified ten PsbX genes in Gossypium hirsutum L. Phylogenetic analysis showed that 40 genes from nine species were classified into one clade. The resulting sequence logos exhibited substantial conservation at multiple sites across the N and C termini among all Gossypium species. Furthermore, the orthologous/paralogous Ka/Ks ratios revealed that cotton PsbX genes were subjected to positive as well as purifying selection pressure, which might have led to limited divergence following whole-genome and segmental duplication. The expression patterns of GhPsbX genes varied across specific tissues. Moreover, the expression of GhPsbX genes could potentially be regulated in response to salt, intense light, and drought stresses. Therefore, GhPsbX genes may play a significant role in the modulation of photosynthesis under adverse abiotic conditions. Conclusion: We examined the structure and function of the PsbX gene family in cotton for the first time, using comparative genomics and systems biology approaches. The PsbX gene family appears to play a vital role during the growth and development of cotton under stress conditions. Collectively, the results of this study provide basic information for unveiling the molecular and physiological functions of PsbX genes in cotton plants.
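As a rough illustration of how a Ka/Ks ratio is usually read (this is not the authors' pipeline), the sketch below labels a gene pair as under positive, purifying, or approximately neutral selection. The function name `classify_selection` and the tolerance `tol` around 1 are assumptions introduced here for illustration only.

```python
def classify_selection(ka: float, ks: float, tol: float = 0.05) -> str:
    """Label the selection regime implied by a Ka/Ks ratio.

    ka  -- nonsynonymous substitution rate
    ks  -- synonymous substitution rate
    tol -- hypothetical tolerance band around 1 treated as neutral
    """
    if ks <= 0:
        # The ratio is undefined when there are no synonymous substitutions.
        return "undefined"
    ratio = ka / ks
    if ratio > 1 + tol:
        return "positive"   # Ka/Ks > 1: excess of amino-acid-changing substitutions
    if ratio < 1 - tol:
        return "purifying"  # Ka/Ks < 1: amino-acid changes are selected against
    return "neutral"        # Ka/Ks close to 1
```

In practice, Ka and Ks would come from a codon-aware alignment of each orthologous or paralogous pair; only the interpretation step is shown here.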
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2018YFD1000); the Shandong Vegetable Research System (Grant No. SDAIT-05–05); the Major Agricultural Application Technology Innovation Project of Shandong Province (2018); and the Key Research and Development Program of Shandong and Chongqing Cooperation (Grant No. 2020LYXZ001).
Abstract: The multidrug and toxic compound extrusion (MATE) family plays pivotal roles in detoxification in plants, but no information has been available for this gene family in melon (Cucumis melo L.) thus far, limiting our understanding of its functions in melon acclimation to stressful environments. In this study, a total of 39 MATEs (CmMATE1–CmMATE39) were identified in the melon genome; these were unevenly distributed across all chromosomes, with the most on Chromosome 1. Based on their orthologous relationships with those from Arabidopsis, rice, and sorghum, melon MATEs were clustered into three subfamilies, Clades Ⅰ, Ⅱ, and Ⅲ, which included 23, 9, and 7 members, respectively. Exon number varied among CmMATEs, with CmMATE8 harboring the most. Gene ontology (GO) term and cis-regulatory element (CRE) analyses pointed to potential roles of CmMATEs in both the regulation of melon development and acclimation to various abiotic and biotic stressors. RNA-seq and quantitative real-time PCR (qRT-PCR) results demonstrated that under normal growth conditions, CmMATEs were expressed in a tissue- and development-specific manner, while their abundance varied in a stress-dependent manner when melon plants were exposed to unfavorable environmental conditions. Altogether, these observations could expand our knowledge of the plant MATE family and benefit future functional genomics analysis of CmMATEs.
Funding: Supported by the Basic Science Research Program through the Korea Research Foundation Grant funded by the Ministry of Education, Science, and Technology (NRF-2011-0008913). Kim IH and Cha MK performed this work during their research sabbatical supported by Paichai University (2014-2015).
Abstract: AIM: To identify alkyl hydroperoxide reductase subunit C (AhpC) homologs in Bacillus subtilis (B. subtilis) and to characterize their structural and biochemical properties. AhpC is responsible for the detoxification of reactive oxygen species in bacteria. METHODS: Two AhpC homologs (AhpC_H1 and AhpC_H2) were identified by searching the B. subtilis database; these were then cloned and expressed in Escherichia coli. AhpC mutants carrying substitutions of catalytically important Cys residues (C37S, C47S, C166S, C37/47S, C37/166S, C47/166S, and C37/47/166S for AhpC_H1; C52S, C169S, and C52/169S for AhpC_H2) were obtained by site-directed mutagenesis and purified, and their structure-function relationships were analyzed. The B. subtilis ahpC genes were disrupted by the short flanking homology method, and the phenotypes of the resulting AhpC-deficient bacteria were examined. RESULTS: Comparative characterization of the AhpC homologs indicates that AhpC_H1 contains an extra C37, which forms a disulfide bond with the peroxidatic C47, and behaves like an atypical 2-Cys AhpC, while AhpC_H2 functions like a typical 2-Cys AhpC. Tryptic digestion analysis demonstrated the presence of an intramolecular Cys37-Cys47 linkage, which could be reduced by thioredoxin, resulting in the association of the dimer into higher-molecular-mass complexes. Peroxidase activity analysis of Cys→Ser mutants indicated that three Cys residues were involved in catalysis. AhpC_H1 was resistant to inactivation by peroxide substrates but had lower activity at physiological H2O2 concentrations than AhpC_H2, suggesting that in B. subtilis the two enzymes may be physiologically functional at different substrate concentrations. Exposure to organic peroxides induced AhpC_H1 expression, while AhpC_H1-deficient mutants exhibited growth retardation in the stationary phase, suggesting a role for AhpC_H1 as an antioxidant scavenger of lipid hydroperoxides and a stress-response factor in B. subtilis.
CONCLUSION: AhpC_H1, a novel atypical 2-Cys AhpC, is functionally distinct from AhpC_H2, a typical 2-Cys AhpC.
Funding: Supported by the National Key R&D Program of China (Nos. 2018YFC1406704, 2018YFD0900106, 2018YFC1406700); the Marine S&T Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) (No. 2018SDKJ0302-4); and the MOA Modern Agricultural Talents Support Project.
Abstract: Microorganisms play an important role in the growth of Pyropia haitanensis. To understand the structural and functional diversity of the microbial community of P. haitanensis (PH40), the associated metabolic pathway networks in Clusters of Orthologous Groups (COG) and the Kyoto Encyclopedia of Genes and Genomes (KEGG), as well as carbohydrate-active enzymes (CAZymes), were explored by metagenomic analysis. DNA was first extracted from gametophytes of P. haitanensis, followed by library construction, sequencing, preprocessing of sequencing data, taxonomy assignment, gene prediction, and functional annotation. The results show that the predominant microorganisms of P. haitanensis were bacteria (98.98%), and the phylum with the highest abundance was Proteobacteria (54.64%), followed by Bacteroidetes (37.92%). Erythrobacter (3.98%) and Hyunsoonleella jejuensis (1.56%) were the most abundant bacterial genus and species, respectively. The COG annotation demonstrated that genes associated with microbial metabolism were the predominant category. The metabolic pathway annotation shows that the ABC transport system and the two-component system were the main pathways in the microbial community. The plant growth hormone biosynthesis pathway and multi-vitamin biosynthesis functional units (modules) were the other important pathways. The CAZyme annotation revealed that starch might be an important carbon source for the microorganisms. Glycosyl transferase family 2 (GT2) and glycosyl transferase family 3 (GT3) were the most abundant families in the glycosyltransferase superfamily. Six metagenome-assembled genomes containing enzymes involved in the biosynthesis of cobalamin (vitamin B12) and indole-3-acetic acid were obtained by binning. They were confirmed to belong to Rhodobacterales and Rhizobiales, respectively. Our findings provide comprehensive insights into the microbial community of Pyropia.
Abstract: Understanding the genetic architecture of individual taxa of medical importance is the first step in designing disease-prevention strategies. To understand the genetic details and evolutionary perspective of the model malaria vector Anopheles gambiae, and to use the information in other species of local importance, we scanned the published X-chromosome sequence to characterize it in detail and to determine the evolutionary status of different genes. The telocentric X chromosome contains 106 genes of known function and 982 novel genes. The majority of both the known and novel genes contain introns. The known genes are strongly biased toward a small number of introns; about half of all known genes have only one or two introns. Genes of extreme size (either long or short) were found to be most prevalent (58% short and 23% long). Statistically significant positive correlations were obtained between gene length and intron length, as well as between intron number and intron length, signifying the role of introns in contributing to the overall size of the known X-chromosome genes in An. gambiae. We compared each individual gene of An. gambiae with 33 other taxa having whole-genome sequence information. In general, the mosquito Aedes aegypti was found to be genetically closest, and the yeast Saccharomyces cerevisiae the most distant taxon, to An. gambiae. Further, only about a quarter of the known X-chromosome genes were unique to An. gambiae, and the majority have orthologs in different taxa. A phylogenetic tree was constructed based on a single gene found to be highly orthologous across all 34 taxa. Evolutionary relationships among 13 different taxa were inferred, which corroborate previous and present findings on genetic relationships across various taxa.
Abstract: Bacillus thuringiensis (Bt) parasporal crystal proteins are well known to be toxic to certain insects and to have cytocidal activity against various human cancer cells. Bt serovar coreanensis ST7, which is non-pathogenic to insects and non-hemolytic, has an important parasporin, PS4Aa1 (Cry45Aa1), with potential toxicity to human cancer cells. In this study, we report the features of the complete genome sequence of ST7 and the functional classification of its proteins by clusters of orthologous groups. The evolution of ST7 was also studied. The genome data of ST7 will contribute to a better understanding of genomic diversity and evolution, and will enrich the Bt genome database.
Funding: Supported by the Priority Research Program of the Chinese Academy of Sciences (CAS) (Grant No. XDB31000000) and the Large-scale Scientific Facilities of the CAS (Grant No. 2017LSF-GBOWS-2).
Abstract: The advances accelerated by next-generation sequencing and long-read sequencing technologies continue to provide an impetus for plant phylogenetic study. In the past decade, a large number of phylogenetic studies adopting hundreds to thousands of genes across a wealth of clades have emerged and ushered plant phylogenetics and evolution into a new era. In the meantime, a roadmap to guide researchers in choosing among different approaches for their phylogenomic research design is urgently needed. This review focuses on the utility of genomic data (from organelle genomes to both reduced-representation sequencing and whole-genome sequencing) in phylogenetic and evolutionary investigations, describes the baseline methodology of experimental and analytical procedures, and summarizes recent progress in flowering plant phylogenomics at the ordinal, familial, tribal, and lower levels. We also discuss the challenges, such as the adverse impact on orthology inference and phylogenetic reconstruction arising from systematic errors, and underlying biological factors, such as whole-genome duplication, hybridization/introgression, and incomplete lineage sorting, which together suggest that a bifurcating tree may not be the best model for the tree of life. Finally, we discuss promising avenues for future plant phylogenomic studies.
Funding: Supported by the NSF of USA under Grant Nos. 0835718 and 0235792; the NIH under Grant Nos. 5PN2EY016570-06 and 5R01NS063405-02; the Beckman Institute for Advanced Science and Technology; the National Center for Supercomputing Applications; and the Renaissance Computing Institute.
Abstract: Protein evolution proceeds by two distinct processes: 1) individual mutation and selection for adaptive mutations, and 2) rearrangement of entire domains within proteins into novel combinations, producing new protein families that combine functional properties in ways that did not previously exist. Domain rearrangement poses a challenge to sequence alignment-based search methods, such as BLAST, in predicting homology, since the methodology implicitly assumes that related proteins differ from each other primarily by individual mutations. Moreover, there is ample evidence that the evolutionary process has used (and continues to use) domains as building blocks; it therefore seems fitting to use computational, domain-based methods to reconstruct that process. A challenge and opportunity for computational biology is how to use knowledge of evolutionary domain recombination to characterize families of proteins whose evolutionary history includes such recombination, to discover novel proteins, and to infer protein-protein interactions. In this paper we review techniques and databases that exploit our growing knowledge of “horizontal” protein evolution, and suggest possible areas of future development. We illustrate the power of domain-based methods and the possible directions of future development with a case history in progress that aims to facilitate a particular approach to understanding microbial pathogenicity.
Funding: Funded by the Department of Biotechnology (DBT), Government of India.
Abstract: Small RNAs (sRNAs) are non-coding transcripts that exert their functions directly in the cell. Identification of sRNAs is a difficult task due to the lack of clear sequence and structural biases. Most sRNAs are identified within genus-specific intergenic regions in related genomes. However, several of these regions remain unannotated due to a lack of sequence homology and/or potent statistical identification tools. A computational engine has been built to search within intergenic regions to identify and roughly annotate new putative sRNA regions in Enterobacteriaceae genomes. It uses experimentally known sRNA data and their flanking genes/KEGG Orthology (KO) numbers as templates to identify similar sRNA regions in related query genomes. The search engine can not only locate putative intergenic regions for specific sRNAs but also locate conserved, shuffled, or deleted gene clusters in query genomes. Because it uses KO terms to locate functionally important regions such as sRNAs, any further KO number assignment to additional genes will increase its sensitivity. The PsRNA server identifies putative sRNA regions using the information retrieved from the sRNA of interest. The computing engine is available online at http://bioserver1.physics.iisc.ernet.in/psrna/ and http://bicmku.in:8081/psrna/.
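The flanking-gene idea in the abstract above can be sketched as follows. This is a minimal illustration, not the PsRNA server's actual implementation: the `Gene` record, the function `find_candidate_regions`, and the example KO numbers are all hypothetical. Given the KO numbers of the two genes that flank a known sRNA, it scans a query genome for adjacent gene pairs with the same KO numbers (in either order) and reports the intergenic interval between them.

```python
from typing import List, NamedTuple, Optional, Tuple


class Gene(NamedTuple):
    gene_id: str
    ko: Optional[str]  # KEGG Orthology number, or None if unassigned
    start: int         # 1-based inclusive coordinates
    end: int


def find_candidate_regions(
    genes: List[Gene], left_ko: str, right_ko: str
) -> List[Tuple[int, int]]:
    """Return intergenic intervals flanked by genes matching the KO template."""
    ordered = sorted(genes, key=lambda g: g.start)
    hits = []
    # Walk over neighbouring gene pairs along the genome.
    for upstream, downstream in zip(ordered, ordered[1:]):
        pair = (upstream.ko, downstream.ko)
        # Accept the template in either orientation.
        if pair in ((left_ko, right_ko), (right_ko, left_ko)):
            # Keep only pairs with a non-empty gap between them.
            if downstream.start > upstream.end + 1:
                hits.append((upstream.end + 1, downstream.start - 1))
    return hits
```

Matching only strictly adjacent pairs is a simplification; detecting the shuffled or deleted gene clusters mentioned in the abstract would require a looser neighbourhood search.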
Funding: Supported by the National Science and Technology Major Program (Grant No. 2008ZX10005-004).
Abstract: Various active components have been extracted from the root of Polygonum cuspidatum. However, the genetic basis for their activity is virtually unknown. In this study, 25600002 short reads (2.3 Gb) of the P. cuspidatum root transcriptome were obtained via Illumina HiSeq 2000 sequencing. A total of 86418 unigenes were assembled de novo and annotated. Twelve, 18, 60, and 54 unigenes were mapped to the mevalonic acid (MVA), methyl-D-erythritol 4-phosphate (MEP), shikimate, and resveratrol biosynthesis pathways, respectively, suggesting that they are involved in the biosynthesis of the pharmaceutically important anthraquinone and resveratrol. Eighteen potential UDP-glycosyltransferase unigenes were identified as the candidates most likely to be involved in the biosynthesis of glycosides of secondary metabolites. Identification of relevant genes could be important for eventually increasing the yields of the medicinally useful constituents of the P. cuspidatum root. From previously published transcriptome data of 19 non-model plant taxa, 1127 shared orthologs were identified and characterized. This information will be very useful for future functional, phylogenetic, and evolutionary studies of these plants.
Abstract: Glycosyltransferases (GTs; EC 2.4.x.y) constitute a large group of enzymes that form glycosidic bonds through the transfer of sugars from activated donor molecules to acceptor molecules. GTs are critical to the biosynthesis of plant cell walls, among other diverse functions. Based on the Carbohydrate-Active enZymes (CAZy) database and sequence similarity searches, we have identified 609 potential GT genes (loci) corresponding to 769 transcripts (gene models) in rice (Oryza sativa), the reference monocotyledonous species. Using domain composition and sequence similarity, these rice GTs were classified into 40 CAZy families plus an additional unknown class. We found that two Pfam domains of unknown function, PF04577 and PF04646, are associated with GT families GT61 and GT31, respectively. To facilitate functional analysis of this important and large gene family, we created a phylogenomic Rice GT Database (http://ricephylogenomics.ucdavis.edu/cellwalls/gt/). Through the database, several classes of functional genomic data, including mutant lines and gene expression data, can be displayed for each rice GT in the context of a phylogenetic tree, allowing for comparative analysis both within and between GT families. Comprehensive digital expression analysis of public gene expression data revealed that most (~80%) rice GTs are expressed. Based on analysis with Inparanoid, we identified 282 'rice-diverged' GTs that lack orthologs in sequenced dicots (Arabidopsis thaliana, Populus trichocarpa, Medicago truncatula, and Ricinus communis). Combining these analyses, we identified 33 rice-diverged GT genes (45 gene models) that are highly expressed in above-ground, vegetative tissues. From the literature and this analysis, 21 of these loci are excellent targets for functional examination toward understanding and manipulating grass cell wall qualities. Study of the remainder may reveal aspects of hormone and protein metabolism that are critical for rice biology.
This list of 33 genes and the Rice GT Database will facilitate the study of GTs and cell wall synthesis in rice and other plants.
Abstract: RNA secondary structure plays a critical role in gene regulation. Rice (Oryza sativa) is one of the most important food crops in the world. However, RNA structure in rice has scarcely been studied. Here, we have successfully generated in vivo Structure-seq libraries in rice. We found that the structural flexibility of mRNAs might be associated with the dynamics of biological function. Higher N6-methyladenosine (m6A) modification tends to be associated with less RNA structure in the 3' UTR, whereas GC content does not significantly affect in vivo mRNA structure, which maintains efficient biological processes such as translation. Comparative analysis of the RNA structurome between rice and Arabidopsis revealed that higher GC content does not lead to stronger structure or less RNA structural flexibility. Moreover, we found a weak correlation between sequence and structure conservation of the orthologs between rice and Arabidopsis. The conservation and divergence of both sequence and in vivo RNA structure correspond to diverse and specific biological processes. Our results indicate that RNA secondary structure might offer a layer of selection separate from sequence between monocots and dicots. Therefore, our study implies that RNA structure evolves differently in various biological processes to maintain robustness in development and adaptational flexibility during angiosperm evolution.
Funding: Funded by the National Key Research and Development Program of China (2020YFE0202300); the Agricultural Science and Technology Innovation Program (CAAS-ZDRW202109); the Fundamental Research Funds for Central Non-Profit of Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (S2021ZD03); and the National Engineering Laboratory of Crop Molecular Breeding.
Abstract: Wheat (Triticum aestivum L.) is a staple food crop consumed by more than 30% of the world population. Nitrogen (N) fertilizer has been applied broadly in agricultural practice to improve wheat yield and meet the growing demand for food production. However, excessive N fertilizer application and the low N use efficiency (NUE) of modern wheat varieties are aggravating environmental pollution and ecological deterioration. Under nitrogen-limiting conditions, the rice (Oryza sativa) abnormal cytokinin response1 repressor1 (are1) mutant exhibits increased NUE, delayed senescence, and consequently increased grain yield. However, the function of the ARE1 ortholog in wheat remains unknown. Here, we isolated and characterized three TaARE1 homoeologs from the elite Chinese winter wheat cultivar ZhengMai 7698. We then used CRISPR/Cas9-mediated targeted mutagenesis to generate a series of transgene-free mutant lines with either partial or triple-null taare1 alleles. All transgene-free mutant lines showed enhanced tolerance to N starvation, as well as delayed senescence and increased grain yield under field conditions. In particular, the AABBdd and aabbDD mutant lines exhibited delayed senescence and significantly increased grain yield without growth defects compared with the wild-type control. Together, our results underscore the potential to manipulate ARE1 orthologs through gene editing for breeding high-yield wheat, as well as other cereal crops, with improved NUE.
Funding: Supported by the National Natural Science Foundation of China (31572349, 31272396) to Yuequn Wang; the China Scholarship Council (201406720012) to Xiyang Peng; the Cooperative Innovation Center of Engineering and New Products for Developmental Biology of Hunan Province (2013-448-6); and the National Research Foundation of Singapore (NRF-CRP7-2010-03) to Yunhan Hong.
Abstract: Magnetoreception is essential for magnetic orientation in animal migration. The molecular basis for magnetoreception has recently been elucidated in the fruit fly as complexes between the magnetic receptor magnetoreceptor (MagR) and its ligand cryptochrome (Cry). MagR and Cry are present throughout the animal kingdom. However, it is unknown whether they perform a conserved role in diverse animals. Here we report the identification and expression of zebrafish MagR and Cry homologs as a step toward understanding their roles in lower vertebrates. A single magr gene and 7 cry genes are present in the zebrafish genome. Zebrafish has four cry1 genes (cry1aa, cry1ab, cry1ba, and cry1bb) homologous to human CRY1 and a single ortholog of human CRY2, as well as 2 cry-like genes (cry4 and cry5). By RT-PCR, magr exhibited a high level of ubiquitous RNA expression in embryos and adult organs, whereas cry genes displayed differential embryonic and adult expression. Importantly, magr depletion did not produce apparent abnormalities in organogenesis. Taken together, magr and cry2 each exist as a single-copy gene, whereas cry1 exists as multiple gene duplicates in zebrafish. Our results suggest that magr may play a dispensable role in organogenesis and point to the possibility of generating magr mutants for analyzing its role in zebrafish.