Small proteins are proteins of fewer than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in earlier genome annotations. The significance of small proteins has been revealed in recent years, along with the discovery of their diverse functions. However, systematic annotation of small proteins is still insufficient. SmProt was developed to provide valuable information on small proteins for the scientific community. Here we present an update of SmProt, which emphasizes the reliability of translated sORFs, genetic variants in translated sORFs, disease-specific sORF translation events or sequences, and a markedly increased data volume. More components, such as non-ATG translation initiation, function, and new sources, are also included. SmProt incorporates 638,958 unique small proteins curated from 3,165,229 primary records, which were computationally predicted from 419 ribosome profiling (Ribo-seq) datasets or collected from the literature and other sources, covering 370 cell lines or tissues in 8 species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli). In addition, small protein families identified from human microbiomes were also collected. All datasets in SmProt are free to access and available for browsing, searching, and bulk download at http://bigdata.ibp.ac.cn/SmProt/.
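To make the size criterion above concrete, the following is a minimal sketch of a forward-strand scan for ORFs encoding fewer than 100 amino acids. It is not the SmProt pipeline, which relies on Ribo-seq evidence and curation; the function name `find_sorfs`, the `min_aa` lower bound, and the option to pass non-ATG start codons are illustrative assumptions.

```python
# Hypothetical helper, not part of SmProt: naive sORF scan on one strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_sorfs(seq, max_aa=100, min_aa=10, starts=("ATG",)):
    """Return (start, end, frame) tuples for ORFs with min_aa <= length < max_aa.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    min_aa is an assumed lower bound used only to suppress trivial hits.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] not in starts:
                i += 3
                continue
            # Walk codon by codon until an in-frame stop or the sequence end.
            j = i + 3
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop downstream; nothing left in this frame
            n_aa = (j - i) // 3  # residues, counting the start codon, excluding the stop
            if min_aa <= n_aa < max_aa:
                hits.append((i, j + 3, frame))
            i = j + 3  # resume scanning after the stop codon
    return hits
```

Passing `starts=("ATG", "CTG", "GTG", "TTG")` would be one way to explore the non-ATG initiation mentioned in the abstract, although the set of start codons SmProt actually considers is not stated here.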
Heat shock response is a classical stress-induced regulatory system in bacteria, characterized by extensive transcriptional reprogramming. To compare the impact of heat stress on the transcriptome and translatome in Escherichia coli, we conducted ribosome profiling in parallel with RNA-Seq to investigate the alterations in transcription and translation efficiency when E. coli cells were exposed to a mild heat stress (from 30 °C to 45 °C). While general changes in ribosome footprints correlate with the changes of mRNA transcripts upon heat stress, a number of genes show differential changes at the transcription and translation levels. Translation efficiency of a few genes related to environmental stimulus response is up-regulated; in contrast, some genes functioning in mRNA translation and amino acid biosynthesis are down-regulated at the translation level in response to heat stress. Moreover, our ribosome occupancy data suggest that, in general, ribosomes accumulate markedly in the starting regions of ORFs upon heat stress. This study provides additional insights into bacterial gene expression in response to heat stress, and suggests the presence of stress-induced but yet-to-be-characterized cellular regulatory mechanisms of gene expression at the translation level.
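Translation efficiency is commonly defined as the ratio of normalized ribosome footprint density to normalized mRNA abundance for the same gene. The abstract does not give the exact normalization used in this study, so the sketch below assumes an RPKM-style normalization; the function names and the example numbers are hypothetical.

```python
import math

def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase per million mapped reads (assumed normalization)."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)

def translation_efficiency(ribo_density, rna_density):
    """Ribosome footprint density divided by mRNA density for one gene."""
    if rna_density <= 0:
        raise ValueError("mRNA density must be positive")
    return ribo_density / rna_density

def log2_te_change(ribo_ctrl, rna_ctrl, ribo_heat, rna_heat):
    """log2 fold change in translation efficiency, heat stress versus control."""
    te_ctrl = translation_efficiency(ribo_ctrl, rna_ctrl)
    te_heat = translation_efficiency(ribo_heat, rna_heat)
    if te_ctrl <= 0 or te_heat <= 0:
        raise ValueError("translation efficiencies must be positive")
    return math.log2(te_heat / te_ctrl)
```

Under this definition, a gene whose footprint density rises fourfold while its mRNA level only doubles has a log2 change of 1, that is, it is up-regulated at the translation level beyond what the transcript change explains.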
Conventional peptides (CPs) and non-conventional peptides (NCPs) are generated from small open reading frames, but most CPs are derived from large precursors. NCPs, which are derived from sequences other than conventional open reading frames or annotated coding sequence regions, function in plant development and adaptation to stresses. Ribosome profiling, a technique for studying translational regulation, can be used to identify NCPs. Another new technique, peptidogenomics, which integrates mass spectrometry and genomics, is becoming more widely used for identifying plant NCPs. In recent years, numerous studies have investigated the roles of miRNA-derived peptides and upstream open reading frames in monocots and dicots; these have potential for improving agronomic traits. Investigating the biological functions of NCPs will advance molecular plant breeding by identifying regulators of plant growth and development. We present an overview of NCP identification methods and recent findings about NCP biological functions.
Background: A key step in gene expression is the recognition of the stop codon to terminate translation at the correct position. However, it has been observed that ribosomes can misinterpret the stop codon and continue translation into the 3′ UTR. This phenomenon is called stop codon read-through (SCR). It has been suggested that these events occur on a programmed basis, but the underlying mechanisms are still not well understood. Methods: Here, we present a strategy for the comprehensive identification of SCR events in the Drosophila melanogaster transcriptome by evaluating ribosomal density profiles. The associated ribosomal leak rate was estimated for every event identified. A statistical characterization of nucleotide usage frequency in the region proximal to the stop codon was performed for the sequences associated with SCR events. Results: The results show that the nucleotide usage pattern in transcripts ending in the UGA codon differs from the pattern in transcripts ending in the UAA codon, suggesting the existence of at least two mechanisms that could alter the translational termination process. Furthermore, a linear regression model was developed for each of the three stop codons, and we show that the models using the nucleotides at informative positions outperform those that consider the entire sequence context of the stop codon. Conclusions: We report that distal nucleotides can affect the SCR rate in a stop-codon-dependent manner.
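One simple way to turn a ribosomal density profile into a leak-rate estimate is to compare the mean footprint density downstream of the stop codon with the mean density within the coding sequence. The abstract does not specify the estimator the authors used, so the sketch below is only an assumed formulation; `leak_rate`, its arguments, and the toy profile are hypothetical.

```python
def mean_density(profile, start, end):
    """Mean per-nucleotide footprint count over profile[start:end]."""
    region = profile[start:end]
    if not region:
        raise ValueError("empty region")
    return sum(region) / len(region)

def leak_rate(profile, cds_start, stop_pos, ext_end):
    """Assumed estimator: downstream density divided by CDS density.

    profile   : per-nucleotide ribosome footprint counts along the transcript
    cds_start : 0-based index of the first CDS nucleotide
    stop_pos  : 0-based index of the first nucleotide of the stop codon
    ext_end   : end (exclusive) of the putative read-through extension
    """
    cds = mean_density(profile, cds_start, stop_pos)
    ext = mean_density(profile, stop_pos + 3, ext_end)
    if cds == 0:
        raise ValueError("no footprints in the CDS; leak rate undefined")
    return ext / cds

# Toy profile: 30 nt of CDS, a 3 nt stop codon, 12 nt of extension.
toy_profile = [10] * 30 + [0] * 3 + [1] * 12
toy_rate = leak_rate(toy_profile, 0, 30, 45)
```

In practice an estimator of this kind would also need to handle low-coverage transcripts and the elevated density often seen near start and stop codons, which this sketch ignores.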
Funding (SmProt update): This work was supported by the National Key R&D Program of China (Grant No. 2016YFC0901702); the National Natural Science Foundation of China (Grant Nos. 81902519, 91940306, 31871294, 31701117, and 31970647); the National Key R&D Program of China (Grant Nos. 2017YFC0907503, 2016YFC0901002, and 2018YFA0106901); the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB38040300); the 13th Five-year Informatization Plan of the Chinese Academy of Sciences (Grant No. XXH13505-05); the Special Investigation on Science and Technology Basic Resources, Ministry of Science and Technology, China (Grant No. 2019FY100102); and the National Genomics Data Center, China.
Funding (E. coli heat stress study): This work was supported by the National Natural Science Foundation of China (Grant Nos. 31630087, 31422016, and 31470722 to NG; Grant Nos. 31671381 and 91540109 to XY).
Funding (NCP overview): This work was supported by the National Natural Science Foundation of China (31861143004) and the Agricultural Science and Technology Innovation Program of CAAS to Wen-Xue Li.
Funding (stop codon read-through study): LIE is funded by a CONICET Ph.D. Fellowship. AMA and LD are researchers of CONICET (Argentina). JRR is Full Professor at the UNLP (Argentina). This work was supported by CONICET, Argentina (PIP2017-00059).