[Objective] To examine the grammar model based on lexical substring exac- tion for RNA secondary structure prediction. [Method] By introducing cloud model into stochastic grammar model, a machine learning algorithm su...[Objective] To examine the grammar model based on lexical substring exac- tion for RNA secondary structure prediction. [Method] By introducing cloud model into stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide RNA sequence to acquire lexical substring, and the cloud classifier was used to search the maximum probability of each lemma which was marked as a certain sec- ondary structure type. Then, the lemma information was introduced into the training stochastic grammar process as prior information, realizing the prediction on the sec- ondary structure of RNA, and the method was tested by experiment. [Result] The experimental results showed that the prediction accuracy and searching speed of stochastic grammar cloud model were significantly improved from the prediction with simple stochastic grammar. [Conclusion] This study laid the foundation for the wide application of stochastic grammar model for RNA secondary structure prediction.展开更多
Membrane proteins are an important kind of proteins embedded in the membranes of cells and play crucial roles in living organisms, such as ion channels,transporters, receptors. Because it is difficult to determinate t...Membrane proteins are an important kind of proteins embedded in the membranes of cells and play crucial roles in living organisms, such as ion channels,transporters, receptors. Because it is difficult to determinate the membrane protein's structure by wet-lab experiments,accurate and fast amino acid sequence-based computational methods are highly desired. In this paper, we report an online prediction tool called Mem Brain, whose input is the amino acid sequence. Mem Brain consists of specialized modules for predicting transmembrane helices, residue–residue contacts and relative accessible surface area of a-helical membrane proteins. Mem Brain achieves aprediction accuracy of 97.9% of ATMH, 87.1% of AP,3.2 ± 3.0 of N-score, 3.1 ± 2.8 of C-score. Mem BrainContact obtains 62%/64.1% prediction accuracy on training and independent dataset on top L/5 contact prediction,respectively. And Mem Brain-Rasa achieves Pearson correlation coefficient of 0.733 and its mean absolute error of13.593. These prediction results provide valuable hints for revealing the structure and function of membrane proteins.Mem Brain web server is free for academic use and available at www.csbio.sjtu.edu.cn/bioinf/Mem Brain/.展开更多
Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions which are strongly coupled to RNA structures. To understand the functions of RNAs, some st...Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions which are strongly coupled to RNA structures. To understand the functions of RNAs, some structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, emphasizing three- dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability and salt effect is also introduced briefly. Finally, we discuss the major challenges in the RNA 3D structure modeling.展开更多
The algorithm based on combination learning usually is superior to a singleclassification algorithm on the task of protein secondary structure prediction. However,the assignment of the weight of the base classifier us...The algorithm based on combination learning usually is superior to a singleclassification algorithm on the task of protein secondary structure prediction. However,the assignment of the weight of the base classifier usually lacks decision-makingevidence. In this paper, we propose a protein secondary structure prediction method withdynamic self-adaptation combination strategy based on entropy, where the weights areassigned according to the entropy of posterior probabilities outputted by base classifiers.The higher entropy value means a lower weight for the base classifier. The final structureprediction is decided by the weighted combination of posterior probabilities. Extensiveexperiments on CB513 dataset demonstrates that the proposed method outperforms theexisting methods, which can effectively improve the prediction performance.展开更多
The structure type for the crystal of 4,4'-bis-(2-hydroxy-ethoxyl)-biphenyl 1 has been predicted by using the previously developed interfacial model for small organic molecules. Based on the calculated hydrophobic...The structure type for the crystal of 4,4'-bis-(2-hydroxy-ethoxyl)-biphenyl 1 has been predicted by using the previously developed interfacial model for small organic molecules. Based on the calculated hydrophobic to hydrophilic volume of 1, this model predicts the crystal structure to be of lamellar or bicontinuous type, which has been confirmed by the X-ray single-crystal structure analysis (C20H26O6, monoclinic, P21/C, a = 16.084(1), b = 6.0103(4), c = 9.6410(7) A, β9 = 103.014(2)°, V= 908.1(1) A3, Z = 2, Dc= 1.325 g/cm3, F(000)=388,μ = 0.097 mm-1, MoKα radiation, λ = 0.71073 A, R = 0.0382 and wR = 0.0882 with I > 2σ(I) for 7121 reflections collected, 1852 unique reflections and 170 parameters). As predicted, the hydrophobic and hydrophilic portions of 1 form in the lamellae. The same interfacial model is applied to other amphilphilic small molecule organic systems for structural type prediction.展开更多
Protein structure prediction is one of the most essential objectives practiced by theoretical chemistry and bioinformatics as it is of a vital importance in medicine,biotechnology and more.Protein secondary structure ...Protein structure prediction is one of the most essential objectives practiced by theoretical chemistry and bioinformatics as it is of a vital importance in medicine,biotechnology and more.Protein secondary structure prediction(PSSP)has a significant role in the prediction of protein tertiary structure,as it bridges the gap between the protein primary sequences and tertiary structure prediction.Protein secondary structures are classified into two categories:3-state category and 8-state category.Predicting the 3 states and the 8 states of secondary structures from protein sequences are called the Q3 prediction and the Q8 prediction problems,respectively.The 8 classes of secondary structures reveal more precise structural information for a variety of applications than the 3 classes of secondary structures,however,Q8 prediction has been found to be very challenging,that is why all previous work done in PSSP have focused on Q3 prediction.In this paper,we develop an ensemble Machine Learning(ML)approach for Q8 PSSP to explore the performance of ensemble learning algorithms compared to that of individual ML algorithms in Q8 PSSP.The ensemble members considered for constructing the ensemble models are well known classifiers,namely SVM(Support Vector Machines),KNN(K-Nearest Neighbor),DT(Decision Tree),RF(Random Forest),and NB(Naïve Bayes),with two feature extraction techniques,namely LDA(Linear Discriminate Analysis)and PCA(Principal Component Analysis).Experiments have been conducted for evaluating the performance of single models and ensemble models,with PCA and LDA,in Q8 PSSP.The novelty of this paper lies in the introduction of ensemble learning in Q8 PSSP problem.The experimental results confirmed that ensemble ML models are more accurate than individual ML models.They also indicated that features extracted by LDA are more effective than those extracted by PCA.展开更多
Secondary structures of RNAs are the basis of understanding their tertiary structures and functions and so their predictions are widely needed due to increasing discovery of noncoding RNAs.In the last decades,a lot of...Secondary structures of RNAs are the basis of understanding their tertiary structures and functions and so their predictions are widely needed due to increasing discovery of noncoding RNAs.In the last decades,a lot of methods have been proposed to predict RNA secondary structures but their accuracies encountered bottleneck.Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods.We further show that the results can also be used to improve the prediction accuracy of the single-sequence methods.展开更多
In this paper, the applications of evolutionary algorithm in prediction of protein secondary structure and tertiary structures are introduced, and recent studies on solving protein structure prediction problems using ...In this paper, the applications of evolutionary algorithm in prediction of protein secondary structure and tertiary structures are introduced, and recent studies on solving protein structure prediction problems using evolutionary algorithms are reviewed, and the challenges and prospects of EAs applied to protein structure modeling are analyzed and discussed.展开更多
A simple stepwise folding process has been developed to simulate RNA secondary structure formation.Modifications for the energy parameters of various loops were included in the program.Five possible types of pseudokno...A simple stepwise folding process has been developed to simulate RNA secondary structure formation.Modifications for the energy parameters of various loops were included in the program.Five possible types of pseudoknots including the well known H-type pseudoknot were permitted to occur if reasonable.We have applied this approach to e number of RNA sequences.The prediction accuracies we obtained were higher than those in published papers.展开更多
The architecture of a BioAccel (internal code) chip for RNA secondary structure prediction is described in the letter. The system is based on a BioBus (internal code), whose distinguishing features are: Two separated ...The architecture of a BioAccel (internal code) chip for RNA secondary structure prediction is described in the letter. The system is based on a BioBus (internal code), whose distinguishing features are: Two separated control and data channels, and a slave-associated arbitration scheme. Two reference systems based on the AMBA AHB bus and Coreconnect bus are introduced to evaluate the performance of the system. The simulation results are attractive. The average communication bandwidth of the chip is increased at severalfold, and the read and write latencies are reduced about 40 percent.展开更多
The hydrophobic-polar (HP) lattice model is an important simplified model for studying protein folding. In this paper, we present an improved ACO algorithm for the protein structure prediction. In the algorithm, the &...The hydrophobic-polar (HP) lattice model is an important simplified model for studying protein folding. In this paper, we present an improved ACO algorithm for the protein structure prediction. In the algorithm, the "lone"ethod is applied to deal with the infeasible structures, and the "oint mutation and reconstruction"ethod is applied in local search phase. The empirical results show that the presented method is feasible and effective to solve the problem of protein structure prediction, and notable improvements in CPU time are obtained.展开更多
A three-dimensional off-lattice protein model with two species of monomers, hydrophobic and hydrophilic, is studied. Enligh- tened by the law of reciprocity among things in the physical world, a heuristic quasi-physic...A three-dimensional off-lattice protein model with two species of monomers, hydrophobic and hydrophilic, is studied. Enligh- tened by the law of reciprocity among things in the physical world, a heuristic quasi-physical algorithm for protein structure prediction problem is put forward. First, by elaborately simulating the movement of the smooth elastic balls in the physical world, the algorithm finds low energy configurations for a given monomer chain. An "off-trap" strategy is then proposed to get out of local minima. Experimental results show promising performance. For all chains with lengths 13≤n ≤55, the proposed algorithm finds states with lower energy than the putative ground states reported in literatures. Furthermore, for chain lengths n = 21, 34, and 55, the algorithm finds new low energy configurations different from those given in literatures.展开更多
Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields,including biochemistry,medicine,physics,mathematics,and computer science.These researchers adopt ...Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields,including biochemistry,medicine,physics,mathematics,and computer science.These researchers adopt various research paradigms to attack the same structure prediction problem:biochemists and physicists attempt to reveal the principles governing protein folding;mathematicians,especially statisticians,usually start from assuming a probability distribution of protein structures given a target sequence and then find the most likely structure,while computer scientists formulate protein structure prediction as an optimization problem-finding the structural conformation with the lowest energy or minimizing the difference between predicted structure and native structure.These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman,namely,data modeling and algorithmic modeling.Recently,we have also witnessed the great success of deep learning in protein structure prediction.In this review,we present a survey of the efforts for protein structure prediction.We compare the research paradigms adopted by researchers from different fields,with an emphasis on the shift of research paradigms in the era of deep learning.In short,the algorithmic modeling techniques,especially deep neural networks,have considerably improved the accuracy of protein structure prediction;however,theories interpreting the neural networks and knowledge on protein folding are still highly desired.展开更多
Tannases produced by filamentous fungi are in a family of important hydrolases of gallotannins and have broad industry applications.But until now,the 3-D structures of fungi tannases have not been reported.The protein...Tannases produced by filamentous fungi are in a family of important hydrolases of gallotannins and have broad industry applications.But until now,the 3-D structures of fungi tannases have not been reported.The protein sequence deduced from the cDNA sequence obtained using RT-PCR amplification was identified as tannase through sequence alignment and phylogenetic analysis.Structure models based on the tannase sequence were collected using I-TASSER,and the model with the best match to the surface charge density-pH titration profile was selected as the final structure for tannase from Aspergillusniger N5-5.This work provides an effective method for protein structure research.The structure constructed in this work should be very important to understand the enzyme bioactivities and further developments of fungi tannases.展开更多
Crystal structure prediction algorithms have become powerful tools for materials discovery in recent years, however, they are usually limited to relatively small systems. The main challenge is that the number of local...Crystal structure prediction algorithms have become powerful tools for materials discovery in recent years, however, they are usually limited to relatively small systems. The main challenge is that the number of local minima grows exponentially with the system size. In this work, we proposed two crossover-mutation schemes based on graph theory to accelerate the evolutionary structure searching by automatic decomposition methods. These schemes can detect molecules or clusters inside periodic networks using quotient graphs for crystals, and the decomposition can dramatically reduce the searching space. Sufficient examples for test, including the high-pressure phases of methane, ammonia, MgAl2O4 and boron, show that these new evolution schemes can significantly improve the success rate and searching efficiency compared with the standard method in both isolated and extended systems.展开更多
RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions, an...RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions, and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review the recent advances of using ML methods on RNA structure predictions and discuss the advantages and limitation, the difficulties and potentials of these approaches when applied in the field.展开更多
Based on structure prediction method,the machine learning method is used instead of the density functional theory(DFT)method to predict the material properties,thereby accelerating the material search process.In this ...Based on structure prediction method,the machine learning method is used instead of the density functional theory(DFT)method to predict the material properties,thereby accelerating the material search process.In this paper,we established a data set of carbon materials by high-throughput calculation with available carbon structures obtained from the Samara Carbon Allotrope Database.We then trained a machine learning(ML)model that specifically predicts the elastic modulus(bulk modulus,shear modulus,and the Young's modulus)and confirmed that the accuracy is better than that of AFLOW-ML in predicting the elastic modulus of a carbon allotrope.We further combined our ML model with the CALYPSO code to search for new carbon structures with a high Young's modulus.A new carbon allotrope not included in the Samara Carbon Allotrope Database,named Cmcm-C24,which exhibits a hardness greater than 80 GPa,was firstly revealed.The Cmcm-C24 phase was identified as a semiconductor with a direct bandgap.The structural stability,elastic modulus,and electronic properties of the new carbon allotrope were systematically studied,and the obtained results demonstrate the feasibility of ML methods accelerating the material search process.展开更多
There is a large gap between the number of membrane protein (MP) sequencesand that of their decoded 3D structures, especially high-resolution structures, due to difficultiesin crystal preparation of MPs. However, deta...There is a large gap between the number of membrane protein (MP) sequencesand that of their decoded 3D structures, especially high-resolution structures, due to difficultiesin crystal preparation of MPs. However, detailed knowledge of the 3D structure is required for thefundamental understanding of the function of an MP and the interactions between the protein and itsinhibitors or activators. In this paper, some computational approaches that have been used topredict MP structures are discussed and compared.展开更多
The HP model for protein structure prediction abstracts the fact that hydrophobicity is a dominant force in the protein folding process. This challenging combinatorial optimization problem has been widely addressed th...The HP model for protein structure prediction abstracts the fact that hydrophobicity is a dominant force in the protein folding process. This challenging combinatorial optimization problem has been widely addressed through metaheuristics. The evaluation function is a key component for the success of metaheuristics; the poor discrimination of the conventional evaluation function of the HP model has motivated the proposal of alternative formulations for this component. This comparative analysis inquires into the effectiveness of seven different evaluation functions for the HP model. The degree of discrimination provided by each of the studied functions, their capability to preserve a rank ordering among potential solutions which is consistent with the original objective of the HP model, as well as their effect on the performance of local search methods are analyzed. The obtained results indicate that studying alternative evaluation schemes for the HP model represents a highly valuable direction which merits more attention.展开更多
Mineral apatite compounds have attracted significant interest due to their chemical stability and adjustable hexagonal structure,which makes them suitable as new photovoltaic functional materials.The band gap of natur...Mineral apatite compounds have attracted significant interest due to their chemical stability and adjustable hexagonal structure,which makes them suitable as new photovoltaic functional materials.The band gap of natural apatite is ~5.45 eV,and such a large value limits their applications in the field of catalysis and energy devices.In this research,we designed a method to narrow the band gap via the tetrahedral substitution effect in apatite-based compounds.The density functional theory(DFT) and experimental investigation of the electronic and optical properties revealed that the continuous incorporation of [MO_(4)]^(4-) tetrahedrons(M=Si,Ge,Sn,and Mn) into the crystal lattice can significantly reduce the band gap.In particular,this phenomenon was observed when the[MnO_(4)]^(4-) tetrahedron replaces the [PO_(4)]^(4-) tetrahedron because of the formation of a Mn 3 d-derived conduction band minimum(CBM) and interacts with other elements,leading to band broadening and obvious reduction of the band gap.This approach allowed us to propose a novel scheme in the band gap engineering of apatite-based compounds toward an entire spectral range modification.展开更多
基金Supported by the Science Foundation of Hengyang Normal University of China(09A36)~~
文摘[Objective] To examine the grammar model based on lexical substring exac- tion for RNA secondary structure prediction. [Method] By introducing cloud model into stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide RNA sequence to acquire lexical substring, and the cloud classifier was used to search the maximum probability of each lemma which was marked as a certain sec- ondary structure type. Then, the lemma information was introduced into the training stochastic grammar process as prior information, realizing the prediction on the sec- ondary structure of RNA, and the method was tested by experiment. [Result] The experimental results showed that the prediction accuracy and searching speed of stochastic grammar cloud model were significantly improved from the prediction with simple stochastic grammar. [Conclusion] This study laid the foundation for the wide application of stochastic grammar model for RNA secondary structure prediction.
基金supported by the National Natural Science Foundation of China(Nos.61671288,91530321,61603161)Science and Technology Commission of Shanghai Municipality(Nos.16JC1404300,17JC1403500,16ZR1448700)
文摘Membrane proteins are an important kind of proteins embedded in the membranes of cells and play crucial roles in living organisms, such as ion channels,transporters, receptors. Because it is difficult to determinate the membrane protein's structure by wet-lab experiments,accurate and fast amino acid sequence-based computational methods are highly desired. In this paper, we report an online prediction tool called Mem Brain, whose input is the amino acid sequence. Mem Brain consists of specialized modules for predicting transmembrane helices, residue–residue contacts and relative accessible surface area of a-helical membrane proteins. Mem Brain achieves aprediction accuracy of 97.9% of ATMH, 87.1% of AP,3.2 ± 3.0 of N-score, 3.1 ± 2.8 of C-score. Mem BrainContact obtains 62%/64.1% prediction accuracy on training and independent dataset on top L/5 contact prediction,respectively. And Mem Brain-Rasa achieves Pearson correlation coefficient of 0.733 and its mean absolute error of13.593. These prediction results provide valuable hints for revealing the structure and function of membrane proteins.Mem Brain web server is free for academic use and available at www.csbio.sjtu.edu.cn/bioinf/Mem Brain/.
基金supported by the National Natural Science Foundation of China(Grant Nos.11074191,11175132,and 11374234)the National Basic Research Programof China(Grant No.2011CB933600)the Program for New Century Excellent Talents of China(Grant No.NCET 08-0408)
文摘Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions which are strongly coupled to RNA structures. To understand the functions of RNAs, some structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, emphasizing three- dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability and salt effect is also introduced briefly. Finally, we discuss the major challenges in the RNA 3D structure modeling.
文摘The algorithm based on combination learning usually is superior to a singleclassification algorithm on the task of protein secondary structure prediction. However,the assignment of the weight of the base classifier usually lacks decision-makingevidence. In this paper, we propose a protein secondary structure prediction method withdynamic self-adaptation combination strategy based on entropy, where the weights areassigned according to the entropy of posterior probabilities outputted by base classifiers.The higher entropy value means a lower weight for the base classifier. The final structureprediction is decided by the weighted combination of posterior probabilities. Extensiveexperiments on CB513 dataset demonstrates that the proposed method outperforms theexisting methods, which can effectively improve the prediction performance.
基金This work was supported by the National Science Foundation(Grant DMR-9812351)
文摘The structure type for the crystal of 4,4'-bis-(2-hydroxy-ethoxyl)-biphenyl 1 has been predicted by using the previously developed interfacial model for small organic molecules. Based on the calculated hydrophobic to hydrophilic volume of 1, this model predicts the crystal structure to be of lamellar or bicontinuous type, which has been confirmed by the X-ray single-crystal structure analysis (C20H26O6, monoclinic, P21/C, a = 16.084(1), b = 6.0103(4), c = 9.6410(7) A, β9 = 103.014(2)°, V= 908.1(1) A3, Z = 2, Dc= 1.325 g/cm3, F(000)=388,μ = 0.097 mm-1, MoKα radiation, λ = 0.71073 A, R = 0.0382 and wR = 0.0882 with I > 2σ(I) for 7121 reflections collected, 1852 unique reflections and 170 parameters). As predicted, the hydrophobic and hydrophilic portions of 1 form in the lamellae. The same interfacial model is applied to other amphilphilic small molecule organic systems for structural type prediction.
文摘Protein structure prediction is one of the most essential objectives practiced by theoretical chemistry and bioinformatics as it is of a vital importance in medicine,biotechnology and more.Protein secondary structure prediction(PSSP)has a significant role in the prediction of protein tertiary structure,as it bridges the gap between the protein primary sequences and tertiary structure prediction.Protein secondary structures are classified into two categories:3-state category and 8-state category.Predicting the 3 states and the 8 states of secondary structures from protein sequences are called the Q3 prediction and the Q8 prediction problems,respectively.The 8 classes of secondary structures reveal more precise structural information for a variety of applications than the 3 classes of secondary structures,however,Q8 prediction has been found to be very challenging,that is why all previous work done in PSSP have focused on Q3 prediction.In this paper,we develop an ensemble Machine Learning(ML)approach for Q8 PSSP to explore the performance of ensemble learning algorithms compared to that of individual ML algorithms in Q8 PSSP.The ensemble members considered for constructing the ensemble models are well known classifiers,namely SVM(Support Vector Machines),KNN(K-Nearest Neighbor),DT(Decision Tree),RF(Random Forest),and NB(Naïve Bayes),with two feature extraction techniques,namely LDA(Linear Discriminate Analysis)and PCA(Principal Component Analysis).Experiments have been conducted for evaluating the performance of single models and ensemble models,with PCA and LDA,in Q8 PSSP.The novelty of this paper lies in the introduction of ensemble learning in Q8 PSSP problem.The experimental results confirmed that ensemble ML models are more accurate than individual ML models.They also indicated that features extracted by LDA are more effective than those extracted by PCA.
基金Project supported by the National Natural Science Foundation of China(Grant No.31570722).
文摘Secondary structures of RNAs are the basis of understanding their tertiary structures and functions and so their predictions are widely needed due to increasing discovery of noncoding RNAs.In the last decades,a lot of methods have been proposed to predict RNA secondary structures but their accuracies encountered bottleneck.Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods.We further show that the results can also be used to improve the prediction accuracy of the single-sequence methods.
基金Supported by the National Natural Science Foundation of China(60133010,70071042,60073043)
文摘In this paper, the applications of evolutionary algorithm in prediction of protein secondary structure and tertiary structures are introduced, and recent studies on solving protein structure prediction problems using evolutionary algorithms are reviewed, and the challenges and prospects of EAs applied to protein structure modeling are analyzed and discussed.
文摘A simple stepwise folding process has been developed to simulate RNA secondary structure formation.Modifications for the energy parameters of various loops were included in the program.Five possible types of pseudoknots including the well known H-type pseudoknot were permitted to occur if reasonable.We have applied this approach to e number of RNA sequences.The prediction accuracies we obtained were higher than those in published papers.
基金Supported by the National Natrual Science Foundation of China (No.60373044) and Knowl-edge Innovative Project of CAS (No.KSCX2-SW-233).
文摘The architecture of a BioAccel (internal code) chip for RNA secondary structure prediction is described in the letter. The system is based on a BioBus (internal code), whose distinguishing features are: Two separated control and data channels, and a slave-associated arbitration scheme. Two reference systems based on the AMBA AHB bus and Coreconnect bus are introduced to evaluate the performance of the system. The simulation results are attractive. The average communication bandwidth of the chip is increased at severalfold, and the read and write latencies are reduced about 40 percent.
文摘The hydrophobic-polar (HP) lattice model is an important simplified model for studying protein folding. In this paper, we present an improved ACO algorithm for the protein structure prediction. In the algorithm, the "lone"ethod is applied to deal with the infeasible structures, and the "oint mutation and reconstruction"ethod is applied in local search phase. The empirical results show that the presented method is feasible and effective to solve the problem of protein structure prediction, and notable improvements in CPU time are obtained.
基金The National Natural Science Founda-tion of China (No.10471051) and the National Basic Research Program (973) of China (No.2004CB318000)
文摘A three-dimensional off-lattice protein model with two species of monomers, hydrophobic and hydrophilic, is studied. Enligh- tened by the law of reciprocity among things in the physical world, a heuristic quasi-physical algorithm for protein structure prediction problem is put forward. First, by elaborately simulating the movement of the smooth elastic balls in the physical world, the algorithm finds low energy configurations for a given monomer chain. An "off-trap" strategy is then proposed to get out of local minima. Experimental results show promising performance. For all chains with lengths 13≤n ≤55, the proposed algorithm finds states with lower energy than the putative ground states reported in literatures. Furthermore, for chain lengths n = 21, 34, and 55, the algorithm finds new low energy configurations different from those given in literatures.
基金the National Key R&D Program of China(Grant No.2020YFA0907000)lthe National Natural Science Foundation of China(Grant Nos.32271297,62072435,31770775,and 31671369)for providing financial support for this study and publication charges.
文摘Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields,including biochemistry,medicine,physics,mathematics,and computer science.These researchers adopt various research paradigms to attack the same structure prediction problem:biochemists and physicists attempt to reveal the principles governing protein folding;mathematicians,especially statisticians,usually start from assuming a probability distribution of protein structures given a target sequence and then find the most likely structure,while computer scientists formulate protein structure prediction as an optimization problem-finding the structural conformation with the lowest energy or minimizing the difference between predicted structure and native structure.These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman,namely,data modeling and algorithmic modeling.Recently,we have also witnessed the great success of deep learning in protein structure prediction.In this review,we present a survey of the efforts for protein structure prediction.We compare the research paradigms adopted by researchers from different fields,with an emphasis on the shift of research paradigms in the era of deep learning.In short,the algorithmic modeling techniques,especially deep neural networks,have considerably improved the accuracy of protein structure prediction;however,theories interpreting the neural networks and knowledge on protein folding are still highly desired.
基金the National Natural Science Foundation of China (No. 21374117)the 100 Talents Program of Chinese Academy of Sciences for financial support
文摘Tannases produced by filamentous fungi are in a family of important hydrolases of gallotannins and have broad industry applications.But until now,the 3-D structures of fungi tannases have not been reported.The protein sequence deduced from the cDNA sequence obtained using RT-PCR amplification was identified as tannase through sequence alignment and phylogenetic analysis.Structure models based on the tannase sequence were collected using I-TASSER,and the model with the best match to the surface charge density-pH titration profile was selected as the final structure for tannase from Aspergillusniger N5-5.This work provides an effective method for protein structure research.The structure constructed in this work should be very important to understand the enzyme bioactivities and further developments of fungi tannases.
基金support from the National Natural Science Foundation of China (Grant Nos. 11974162 and 11834006)the National Key R&D Program of China (Grant Nos. 2016YFA0300404)the Fundamental Research Funds for the Central Universities.
文摘Crystal structure prediction algorithms have become powerful tools for materials discovery in recent years, however, they are usually limited to relatively small systems. The main challenge is that the number of local minima grows exponentially with the system size. In this work, we proposed two crossover-mutation schemes based on graph theory to accelerate the evolutionary structure searching by automatic decomposition methods. These schemes can detect molecules or clusters inside periodic networks using quotient graphs for crystals, and the decomposition can dramatically reduce the searching space. Sufficient examples for test, including the high-pressure phases of methane, ammonia, MgAl2O4 and boron, show that these new evolution schemes can significantly improve the success rate and searching efficiency compared with the standard method in both isolated and extended systems.
基金Project supported by the National Natural Science Foundation of China (Grant Nos. 11774158, 11974173, 11774157, and 11934008)。
文摘RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions, and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review the recent advances of using ML methods on RNA structure predictions and discuss the advantages and limitation, the difficulties and potentials of these approaches when applied in the field.
基金This work was financlally supported by the Fundamental Research Funds for the Central Universities,the Na-tional Natural Science Foundation of China(Grant Nos.11965005 and 11964026)the 111 Project(No.B17035)the Natural Sci-ence Basie Research plan in Shaanxi Province of China(Grant Nos.2020JM-186 and 2020JM-621).
文摘Based on structure prediction method,the machine learning method is used instead of the density functional theory(DFT)method to predict the material properties,thereby accelerating the material search process.In this paper,we established a data set of carbon materials by high-throughput calculation with available carbon structures obtained from the Samara Carbon Allotrope Database.We then trained a machine learning(ML)model that specifically predicts the elastic modulus(bulk modulus,shear modulus,and the Young's modulus)and confirmed that the accuracy is better than that of AFLOW-ML in predicting the elastic modulus of a carbon allotrope.We further combined our ML model with the CALYPSO code to search for new carbon structures with a high Young's modulus.A new carbon allotrope not included in the Samara Carbon Allotrope Database,named Cmcm-C24,which exhibits a hardness greater than 80 GPa,was firstly revealed.The Cmcm-C24 phase was identified as a semiconductor with a direct bandgap.The structural stability,elastic modulus,and electronic properties of the new carbon allotrope were systematically studied,and the obtained results demonstrate the feasibility of ML methods accelerating the material search process.
文摘There is a large gap between the number of membrane protein (MP) sequencesand that of their decoded 3D structures, especially high-resolution structures, due to difficultiesin crystal preparation of MPs. However, detailed knowledge of the 3D structure is required for thefundamental understanding of the function of an MP and the interactions between the protein and itsinhibitors or activators. In this paper, some computational approaches that have been used topredict MP structures are discussed and compared.
基金partially supported by the National Council of Science and Technology of México (CO NACyT) under Grant Nos. 105060 and 99276
文摘The HP model for protein structure prediction abstracts the fact that hydrophobicity is a dominant force in the protein folding process. This challenging combinatorial optimization problem has been widely addressed through metaheuristics. The evaluation function is a key component for the success of metaheuristics; the poor discrimination of the conventional evaluation function of the HP model has motivated the proposal of alternative formulations for this component. This comparative analysis inquires into the effectiveness of seven different evaluation functions for the HP model. The degree of discrimination provided by each of the studied functions, their capability to preserve a rank ordering among potential solutions which is consistent with the original objective of the HP model, as well as their effect on the performance of local search methods are analyzed. The obtained results indicate that studying alternative evaluation schemes for the HP model represents a highly valuable direction which merits more attention.
基金financially supported by the National Natural Science Foundations of China (Nos. 41831288 and51672257)the Fundamental Research Funds for the Central Universities (Nos. 2652018305 and 2652017335)+3 种基金Guangdong Innovation Research Team for Higher Education (No. 2017KCXTD030)the High-Level Talents Project of Dongguan University of Technology (No. KCYKYQD2017017)Engineering Research Center of None-food Biomass Efficient Pyrolysis and Utilization Technology of Guangdong Higher Education Institutes (No. 2016GCZX009)Russian Science Foundation (No. 19-77-10013)。
文摘Mineral apatite compounds have attracted significant interest due to their chemical stability and adjustable hexagonal structure,which makes them suitable as new photovoltaic functional materials.The band gap of natural apatite is ~5.45 eV,and such a large value limits their applications in the field of catalysis and energy devices.In this research,we designed a method to narrow the band gap via the tetrahedral substitution effect in apatite-based compounds.The density functional theory(DFT) and experimental investigation of the electronic and optical properties revealed that the continuous incorporation of [MO_(4)]^(4-) tetrahedrons(M=Si,Ge,Sn,and Mn) into the crystal lattice can significantly reduce the band gap.In particular,this phenomenon was observed when the[MnO_(4)]^(4-) tetrahedron replaces the [PO_(4)]^(4-) tetrahedron because of the formation of a Mn 3 d-derived conduction band minimum(CBM) and interacts with other elements,leading to band broadening and obvious reduction of the band gap.This approach allowed us to propose a novel scheme in the band gap engineering of apatite-based compounds toward an entire spectral range modification.