Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 11774158, 11974173, 11774157, and 11934008).
Abstract: RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although ML has a long history in protein-related fields, its use in predicting RNA tertiary structures is new and rare. Here, we review recent advances in applying ML methods to RNA structure prediction and discuss the advantages and limitations, as well as the difficulties and potential, of these approaches when applied in the field.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11074191, 11175132, and 11374234), the National Basic Research Program of China (Grant No. 2011CB933600), and the Program for New Century Excellent Talents of China (Grant No. NCET 08-0408).
Abstract: Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions, which are strongly coupled to RNA structures. To understand the functions of RNAs, a number of structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, with emphasis on three-dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability, and salt effect is also introduced briefly. Finally, we discuss the major challenges in RNA 3D structure modeling.
Abstract: A simple stepwise folding process has been developed to simulate RNA secondary structure formation. Modifications for the energy parameters of various loops were included in the program. Five possible types of pseudoknots, including the well-known H-type pseudoknot, were permitted to occur if reasonable. We have applied this approach to a number of RNA sequences. The prediction accuracies we obtained were higher than those in published papers.
Funding: Funded by the National Natural Science Foundation of China (Grant Nos. 11774158 to JZ, 11934008 to WW, and 11974173 to WFL).
Abstract: RNAs play crucial and versatile roles in cellular biochemical reactions. Since experimental approaches for determining their three-dimensional (3D) structures are costly and less efficient, it is greatly advantageous to develop computational methods to predict RNA 3D structures. For these methods, designing a model or scoring function for structure quality assessment is an essential but challenging step. In this study, we designed and trained a deep learning model to tackle this problem. The model was based on a graph convolutional network (GCN) and named RNAGCN. The model provided a natural way of representing RNA structures, avoided complex algorithms to preserve atomic rotational equivalence, and was capable of extracting features automatically out of structural patterns. Testing results on two datasets convincingly demonstrated that RNAGCN performs similarly to or better than four leading scoring functions. Our approach provides an alternative way of RNA tertiary structure assessment and may facilitate RNA structure predictions. RNAGCN can be downloaded from https://gitee.com/dcw-RNAGCN/rnagcn.
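To make the graph-convolution idea concrete, the following is a minimal sketch of a single generic GCN layer, computing ReLU(Â·H·W) with Â = D^(-1/2)(A + I)D^(-1/2). It is not the RNAGCN architecture, which the abstract does not describe. The three-node chain graph, the identity feature matrix, and the identity weight matrix are all hypothetical illustration values, and the function names are invented for this sketch.

```python
import math


def normalized_adjacency(edges, n):
    """Return D^-1/2 (A + I) D^-1/2 as a dense list of lists."""
    # Start from the identity so every node keeps its own features (self-loops).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0  # undirected edge
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: ReLU(A_hat @ H @ W)."""
    out = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy graph: three nodes in a chain, 0 - 1 - 2.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
a_hat = normalized_adjacency([(0, 1), (1, 2)], 3)
hidden = gcn_layer(a_hat, identity, identity)
```

In a structure-scoring setting, nodes would typically stand for atoms or nucleotides and edges for spatial proximity, with learned weights in place of the identity matrix; those choices are assumptions here and are not taken from the abstract.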
Funding: Supported by grants from the National Science Foundation of China (12075171, 11774272).
Abstract: Knowledge of RNA 3-dimensional (3D) structures is critical to understanding the important biological functions of RNAs, and various models have been developed to predict RNA 3D structures in silico. However, there is still a lack of a reliable and efficient statistical potential for RNA 3D structure evaluation. For this purpose, we developed a statistical potential based on a minimal coarse-grained representation and residue separation, where every nucleotide is represented by the C4' atom for the backbone and the N1 (or N9) atom for the base. In analogy to the newly developed all-atom rsRNASP, cgRNASP-CN is composed of short-ranged and long-ranged potentials, and the short-ranged one is treated more subtly. The examination indicates that the performance of cgRNASP-CN is close to that of the all-atom rsRNASP and is superior to other top all-atom traditional statistical potentials and scoring functions trained from neural networks, for two realistic test datasets including the RNA-Puzzles dataset. Importantly, cgRNASP-CN is about 100 times more efficient than existing all-atom statistical potentials/scoring functions, including rsRNASP. cgRNASP-CN is available at https://github.com/Tan-group/cgRNASP-CN.
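As a rough illustration of how a distance-dependent statistical potential can be derived, the sketch below applies generic Boltzmann inversion, E(r) = -kT ln(P_obs(r) / P_ref(r)), to binned distances. This is not the cgRNASP-CN formula: the abstract does not state its reference state or its short-ranged and long-ranged terms. The bin width, the pseudocount, the function names, and the toy distance lists are all assumptions made for illustration.

```python
import math
from collections import Counter

KT = 0.593  # kcal/mol, roughly room temperature (assumed value)


def bin_index(distance, width=1.0):
    """Map a distance in angstroms to an integer bin."""
    return int(distance // width)


def build_potential(observed, reference, pseudo=1.0):
    """Boltzmann-invert binned distance counts into {bin: energy}.

    observed  : distances seen in native structures
    reference : distances from a reference (background) ensemble
    pseudo    : pseudocount that keeps empty bins finite
    """
    obs = Counter(bin_index(d) for d in observed)
    ref = Counter(bin_index(d) for d in reference)
    bins = set(obs) | set(ref)
    n_obs = len(observed) + pseudo * len(bins)
    n_ref = len(reference) + pseudo * len(bins)
    potential = {}
    for b in bins:
        p_obs = (obs[b] + pseudo) / n_obs
        p_ref = (ref[b] + pseudo) / n_ref
        potential[b] = -KT * math.log(p_obs / p_ref)
    return potential


def score(distances, potential):
    """Sum bin energies over a structure's distances; unseen bins add 0."""
    return sum(potential.get(bin_index(d), 0.0) for d in distances)


# Hypothetical toy data: native-like pairs cluster near 5 A, background near 9 A.
native_distances = [5.2, 5.4, 5.7, 5.1, 9.3]
background_distances = [5.5, 9.1, 9.2, 9.8, 9.6]
toy_potential = build_potential(native_distances, background_distances)
```

With these toy inputs the 5 Å bin receives a negative (favourable) energy and the 9 Å bin a positive one, so a candidate structure with more native-like distances scores lower. A coarse-grained potential of the kind described above would build such tables per atom-type pair (for example C4' and N1/N9) and per residue-separation class.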
Abstract: Background: Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry, combined with the application of high-throughput sequencing, have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. Results: We review current practices in the analysis of structure profiling data, with emphasis on comparative and integrative analysis, and highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. Conclusions: To keep pace with experimental developments, methods to facilitate, enhance, and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Funding: NSERC, for this research, under Research Grant Number RG-PIN 238298. Both authors would like to acknowledge the support of the InfoNet Media Centre, funded by the Canadian Foundation for Innovation (CFI) under grant number CFI-3648.
Abstract: Purpose: The purpose of this paper is to present a study of the effect of different types of annealing schedules for a ribonucleic acid (RNA) secondary structure prediction algorithm based on simulated annealing (SA). Design/methodology/approach: An RNA folding algorithm was implemented that assembles the final structure from potential substructures (helices). Structures are encoded as a permutation of helices. An SA searches this space of permutations. Parameters and annealing schedules were studied and fine-tuned to optimize algorithm performance. Findings: Compared with mfold, the SA algorithm shows comparable results (in terms of F-measure) even with a less sophisticated thermodynamic model. In terms of average specificity, the SA algorithm gives superior results. Research limitations/implications: Most of the underlying thermodynamic models are too simplistic and incomplete to accurately model the free energy for larger structures. This is the largest limitation of free energy-based RNA folding algorithms in general. Practical implications: The algorithm offers a different approach that can be used in practice to fold RNA sequences quickly. Originality/value: The algorithm is one of only two SA-based RNA folding algorithms. The authors use a very different encoding, based on permutation of candidate helices. The in-depth study of annealing schedules and other parameters makes the algorithm a strong contender. Another benefit is that new thermodynamic models can be incorporated with relative ease (which is not the case for algorithms based on dynamic programming).
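The permutation encoding described above can be sketched as follows. This is an illustrative reading under stated assumptions, not the authors' implementation: the greedy decoding of a permutation (add each helix in order unless it conflicts with one already chosen), the swap move, the geometric cooling schedule, the per-helix energies, and all function names are hypothetical. Pseudoknots are simply disallowed here, and the energies are toy numbers rather than a thermodynamic model.

```python
import math
import random


def compatible(h1, h2):
    """Helices are compatible if they share no base and do not cross."""
    bases1 = {b for pair in h1 for b in pair}
    if any(i in bases1 or j in bases1 for i, j in h2):
        return False
    for i, j in h1:
        for k, l in h2:
            if i < k < j < l or k < i < l < j:  # crossing pairs (pseudoknot)
                return False
    return True


def decode(perm, helices):
    """Greedily add helices in permutation order, skipping conflicts."""
    chosen = []
    for idx in perm:
        if all(compatible(helices[idx], helices[c]) for c in chosen):
            chosen.append(idx)
    return chosen


def energy(perm, helices, energies):
    """Energy of the structure a permutation decodes to (lower is better)."""
    return sum(energies[i] for i in decode(perm, helices))


def anneal(helices, energies, t0=2.0, alpha=0.95, steps=500, seed=0):
    """Simulated annealing over helix permutations with geometric cooling."""
    rng = random.Random(seed)
    perm = list(range(len(helices)))
    rng.shuffle(perm)
    cur = energy(perm, helices, energies)
    best_perm, best = perm[:], cur
    t = t0
    for _ in range(steps):
        a, b = rng.sample(range(len(perm)), 2)
        perm[a], perm[b] = perm[b], perm[a]  # propose: swap two helices
        new = energy(perm, helices, energies)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new <= cur or rng.random() < math.exp((cur - new) / t):
            cur = new
            if cur < best:
                best_perm, best = perm[:], cur
        else:
            perm[a], perm[b] = perm[b], perm[a]  # reject: undo the swap
        t *= alpha  # geometric cooling, one of many possible schedules
    return decode(best_perm, helices), best


# Hypothetical candidates: helices 0 and 1 nest; helix 2 conflicts with both.
toy_helices = [[(0, 9), (1, 8)], [(2, 7), (3, 6)], [(1, 5), (2, 4)]]
toy_energies = [-2.0, -2.0, -3.0]
chosen, total = anneal(toy_helices, toy_energies)
```

Because the energy function is called only on decoded structures, a different thermodynamic model can be swapped in by replacing `energy` alone, which mirrors the flexibility the abstract claims over dynamic-programming approaches. Other cooling schedules can be tried by changing the single `t *= alpha` update.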