Research on peach genetic resources and breeding has achieved remarkable progress in recent decades, especially in China. In this review, we first described the geographic distribution, ecology, phenotypes, and genetic diversity of peach landraces and wild relatives in China. We also discussed the almond. Subsequently, peach breeding programs in China are summarized, including breeding history, breeding targets, breeding institutes, elite breeding materials, breeding solutions, and representative domestically bred cultivars. Furthermore, we reviewed the genes or loci that have been mined using both linkage mapping and genome-wide association studies (GWAS), as well as the evolutionary genetics and domestication history of the peach. Finally, we gave our perspectives and suggestions for future breeding in terms of breeding material selection and breeding technology innovation.
Despite considerable advances in extracting crucial insights from bio-omics data to unravel the intricate mechanisms underlying complex traits, the absence of a universal multi-modal computational tool with robust interpretability for accurate phenotype prediction and identification of trait-associated genes remains a challenge. This study introduces the dual-extraction modeling (DEM) approach, a multi-modal deep-learning architecture designed to extract representative features from heterogeneous omics datasets, enabling the prediction of complex trait phenotypes. Through comprehensive benchmarking experiments, we demonstrate the efficacy of DEM in classification and regression prediction of complex traits. DEM consistently exhibits superior accuracy, robustness, generalizability, and flexibility. Notably, we establish its effectiveness in predicting pleiotropic genes that influence both flowering time and rosette leaf number, underscoring its interpretability. In addition, we have developed user-friendly software to facilitate use of DEM's functions. In summary, this study presents a state-of-the-art approach that can effectively predict qualitative and quantitative traits and identify functional genes, confirming its potential as a valuable tool for exploring the genetic basis of complex traits.
A software tool and algorithm were designed that are based on a random sequence model and use osmotic stress-responsive cis elements drawn from existing biological information sources. The tool can infer the downstream function of Arabidopsis thaliana genes by analyzing their promoter regions, and can provide effective assistance in mining osmotic stress-responsive genes in the Arabidopsis thaliana genome. Practical application shows that the software can help analyze large volumes of gene data and provide important supporting evidence.
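The abstract above does not specify its scoring scheme, so the following is only a minimal sketch of one common way to apply a random sequence model to promoter analysis: count occurrences of a cis element in a promoter and compare the count with the number expected if bases were drawn independently with equal frequency. The motif string, the example promoter, the Poisson approximation, and all function names are assumptions made for illustration, not details taken from the paper.

```python
from math import exp, factorial


def count_motif(seq: str, motif: str) -> int:
    """Count overlapping occurrences of motif in seq (forward strand only)."""
    seq, motif = seq.upper(), motif.upper()
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def expected_count(seq_len: int, motif: str, base_freq=None) -> float:
    """Expected number of occurrences under an independent-base random model."""
    base_freq = base_freq or {b: 0.25 for b in "ACGT"}  # assumed uniform background
    p = 1.0
    for base in motif.upper():
        p *= base_freq[base]
    return max(seq_len - len(motif) + 1, 0) * p


def poisson_tail(observed: int, expected: float) -> float:
    """Approximate enrichment p-value: P(X >= observed) for X ~ Poisson(expected)."""
    if observed == 0:
        return 1.0
    cdf = sum(exp(-expected) * expected ** k / factorial(k) for k in range(observed))
    return max(0.0, 1.0 - cdf)


# Hypothetical promoter fragment and motif, chosen only to exercise the functions.
promoter = "TTACCGACATGGCCGACTTAAACCGACGT"
motif = "CCGAC"

obs = count_motif(promoter, motif)
exp_n = expected_count(len(promoter), motif)
p_value = poisson_tail(obs, exp_n)
print(f"observed={obs}, expected={exp_n:.4f}, p~{p_value:.2e}")
```

A promoter whose observed count greatly exceeds the expected count would be flagged as a candidate; a real tool would likely also scan the reverse strand and use genome-specific base frequencies.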
Soluble sugar content in seeds is an important quality trait of soybean. In this study, 57 quantitative trait loci (QTLs) related to soluble sugar content in soybean seeds were collected from databases and published papers. After a meta-overview-collinearity integrated analysis to refine QTL intervals, eight consensus QTLs were identified. To further verify the consensus QTLs, a population of chromosome segment substitution lines (CSSLs) was analyzed. Two lines containing fragments covering the regions of consensus QTLs and the recurrent parent were selected: one line showed high soluble sugar content associated with a consensus QTL fragment, and the other line showed low soluble sugar content. Transcriptome sequencing was conducted for these two lines at the early, middle, and late stages of seed development, which identified 158, 109, and 329 differentially expressed genes, respectively. Based on analyses of re-sequencing data of the CSSLs and the consensus QTL region, three candidate genes (Glyma.19G146800, Glyma.19G122500, and Glyma.19G128500) were identified in the genetic fragments introduced from wild soybean. Sequence comparisons between the two CSSL parents, SN14 and ZYD00006, revealed a single nucleotide polymorphism (SNP) in the coding sequence of Glyma.19G122500, causing a nonsynonymous change in the amino acid sequence that affected the predicted protein structure. A Kompetitive allele-specific PCR (KASP) marker was developed based on this SNP and used to evaluate the CSSLs. These results lay the foundation for further research to identify genes related to soluble sugar content in soybean seeds and for future soybean breeding.
In the post-genome-wide association study era, multi-omics techniques have shown great power and potential for candidate gene mining and functional genomics research. However, owing to the lack of effective data integration and multi-omics analysis platforms, such techniques have still not been widely applied in rapeseed, an important oil crop worldwide. Here, we report a rapeseed multi-omics database (BnIR; http://yanglab.hzau.edu.cn/BnIR), which provides datasets for six omics, including genomics, transcriptomics, variomics, epigenetics, phenomics, and metabolomics, as well as numerous "variation-gene expression-phenotype" associations obtained using multiple statistical methods. In addition, a series of multi-omics search and analysis tools are integrated to facilitate the browsing and application of these datasets. BnIR is the most comprehensive multi-omics database for rapeseed so far, and two case studies demonstrated its power to mine candidate genes associated with specific traits and analyze their potential regulatory mechanisms.
Many biochemical reactions in biosynthetic pathways have no associated enzymes. Reactions predicted by retro-biosynthetic tools are not assigned gene sequences. In addition, non-natural reactions designed with novel functions also lack suitable enzymes. All of these reactions can be categorized as orphan reactions. The absence of protein-encoding genes for these orphan reactions limits their direct experimental implementation. Computational tools have been developed to find candidate enzymes for these orphan reactions. Herein, we discuss recent advances in these computational tools, including reaction similarity-based methods that calculate the substructural similarity between orphan reactions and known enzymatic reactions; sequence-based tools that combine metabolic knowledge bases and phenotypic information with genomic, transcriptomic, and metabolomic data to mine appropriate enzymes for orphan reactions; and approaches that create enzyme variants for orphan reactions through enzyme engineering and de novo enzyme design. We believe that our review will facilitate the design of microbial cell factories and contribute to the development of the biomanufacturing field.
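To make the reaction similarity-based idea concrete, here is a minimal sketch that scores an orphan reaction against known enzymatic reactions with the Tanimoto (Jaccard) coefficient and ranks the known reactions' enzymes as candidates. Representing a reaction as a set of substructure labels, the labels themselves, the enzyme names, and the function names are all hypothetical simplifications; the review does not prescribe this representation, and real tools typically use chemical fingerprints rather than hand-written labels.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(orphan: set, known: dict, top_k: int = 2) -> list:
    """Rank known reactions (and their enzymes) by similarity to the orphan reaction."""
    scored = [(tanimoto(orphan, features), name) for name, features in known.items()]
    # Highest similarity first; ties broken alphabetically for reproducibility.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, round(score, 3)) for score, name in scored[:top_k]]


# Toy data: hypothetical enzymes annotated with hypothetical substructure features.
known_reactions = {
    "enzyme_A": {"C-OH", "C=O", "NAD"},
    "enzyme_B": {"C-OH", "C-O-P", "ATP"},
    "enzyme_C": {"C=C", "C-C", "NADPH"},
}
orphan_reaction = {"C-OH", "C=O", "NADP"}

ranking = rank_candidates(orphan_reaction, known_reactions)
print(ranking)
```

The top-ranked enzymes would only be starting points for experimental testing, since substructural similarity does not guarantee catalytic activity on the new substrate.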
Funding (peach genetic resources and breeding review): This work was financially supported by the Agricultural Science and Technology Innovation Program (Grant No. CAAS-ASTIP-2019-ZFRI-01), the Crop Germplasm Resources Conservation Project (Grant No. 2016NWB041), and the China Agriculture Research System (Grant No. CARS-30-1-04).
Funding (dual-extraction modeling study): Supported by the National Natural Science Foundation of China (32370723, 32000410).
Funding (soybean soluble sugar QTL study): Financially supported by the National Natural Science Foundation of China (31701449, 31971968, 31971899, and 31501332); the Natural Science Foundation of Heilongjiang, China (QC2017013); the National Key R&D Program of China (2016YFD0100500, 2016YFD0100300, and 2016YFD0100201-21); the Special Financial Aid to Post-Doctor Research Fellow in Heilongjiang, China (LBHTZ1714); the International Postdoctoral Exchange Fellowship Program of the China Postdoctoral Council (20180004); the China Post Doctoral Project, China (2015M581419); the Post-Doctoral Project of Northeast Agricultural University, China (NEAUBH-19002); the Heilongjiang Funds for Distinguished Young Scientists, China (JC2016004 and JC2017006); the Dongnongxuezhe Project, China (to Chen Qingshan); and the Backbone of Young Talent Scholar Project (to Qi Zhaoming, 18XG01) of Northeast Agricultural University, China.
Funding (rapeseed multi-omics database study): Supported by the National Natural Science Foundation of China (32070559), the National Key Research and Development Plan of China (2021YFF1000100), the China Postdoctoral Science Foundation (2022M710875), the Hubei Hongshan Laboratory (2021HSZD004), and the Developing Bioinformatics Platform in Hainan Yazhou Bay Seed Lab (no. JBGS-B21HJ0001).
Funding (orphan reactions review): Supported by the National Natural Science Foundation of China (22138006).