Superconductive properties for oxides were predicted by artificial neural network (ANN) method with structural and chemical parameters as inputs. The predicted properties include superconductivity for oxides, distribu...Superconductive properties for oxides were predicted by artificial neural network (ANN) method with structural and chemical parameters as inputs. The predicted properties include superconductivity for oxides, distributed ranges of the superconductive transition temperature (Tc) for complex oxides, and Tc values for cuprate superconductors. The calculated results indicated that the adjusted ANN can be used to predict superconductive properties for unknown oxides.展开更多
Density functional theory method has been employed to investigate the structures of the prototypical technetium-labeled diphosphonate complex 99mTc-MDP, where MDP represents methylenediphosphonic acid. A total of 14 t...Density functional theory method has been employed to investigate the structures of the prototypical technetium-labeled diphosphonate complex 99mTc-MDP, where MDP represents methylenediphosphonic acid. A total of 14 trial structures were generated by allowing for the geometric, conformational, charge, and spin isomerism. Based on the optimized structures and calculated energies at the B3LYP/LANL2DZ level, two stable isomers were determined for the title complex. And they were further studied systematically in comparison with the experimental structure. The basis sets 6-31G*(LANL2DZ for Tc), 6-31G*(cc-pVDZ-pp for Tc), and DGDZVP have also been employed in combination with the B3LYP functional to study the basis set effect on the geometries of isomers. The optimized structures agree well with the available experimental data, and the bond lengths are more sensitive to the basis set than the bond angles. The charge distributions were studied by the Mulliken population analysis and natural bond orbital analysis. The results reflect a significant ligand-to-metal electron donation.展开更多
High-precision and efficient structural response prediction is essential for intelligent disaster prevention and mitigation in building structures,including post-earthquake damage assessment,structural health monitori...High-precision and efficient structural response prediction is essential for intelligent disaster prevention and mitigation in building structures,including post-earthquake damage assessment,structural health monitoring,and seismic resilience assessment of buildings.To improve the accuracy and efficiency of structural response prediction,this study proposes a novel physics-informed deep-learning-based realtime structural response prediction method that can predict a large number of nodes in a structure through a data-driven training method and an autoregressive training strategy.The proposed method includes a Phy-Seisformer model that incorporates the physical information of the structure into the model,thereby enabling higher-precision predictions.Experiments were conducted on a four-story masonry structure,an eleven-story reinforced concrete irregular structure,and a twenty-one-story reinforced concrete frame structure to verify the accuracy and efficiency of the proposed method.In addition,the effectiveness of the structure in the Phy-Seisformer model was verified using an ablation study.Furthermore,by conducting a comparative experiment,the impact of the range of seismic wave amplitudes on the prediction accuracy was studied.The experimental results show that the method proposed in this paper can achieve very high accuracy and at least 5000 times faster calculation speed than finite element calculations for different types of building structures.展开更多
[Objective] To examine the grammar model based on lexical substring exac- tion for RNA secondary structure prediction. [Method] By introducing cloud model into stochastic grammar model, a machine learning algorithm su...[Objective] To examine the grammar model based on lexical substring exac- tion for RNA secondary structure prediction. [Method] By introducing cloud model into stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide RNA sequence to acquire lexical substring, and the cloud classifier was used to search the maximum probability of each lemma which was marked as a certain sec- ondary structure type. Then, the lemma information was introduced into the training stochastic grammar process as prior information, realizing the prediction on the sec- ondary structure of RNA, and the method was tested by experiment. [Result] The experimental results showed that the prediction accuracy and searching speed of stochastic grammar cloud model were significantly improved from the prediction with simple stochastic grammar. [Conclusion] This study laid the foundation for the wide application of stochastic grammar model for RNA secondary structure prediction.展开更多
Structure-stratigraphy analysis" is a new method used in the study and prediction of and small-scaled structures in coal mines. The object of this method is coalbed structure that includes the folds and fracture ...Structure-stratigraphy analysis" is a new method used in the study and prediction of and small-scaled structures in coal mines. The object of this method is coalbed structure that includes the folds and fracture occurred in the vicinity of coal-seams. It emphases the analysis on the relationship between structural deformation and stratal lithologic combination.Based on the statistics of a series of related parameters in stratigraphy and structure,comprehensive analysis and drawing, this method may provide a good means for the quantitative evaluation and prediction of small scale structure in coal mines.展开更多
Membrane proteins are an important kind of proteins embedded in the membranes of cells and play crucial roles in living organisms, such as ion channels,transporters, receptors. Because it is difficult to determinate t...Membrane proteins are an important kind of proteins embedded in the membranes of cells and play crucial roles in living organisms, such as ion channels,transporters, receptors. Because it is difficult to determinate the membrane protein's structure by wet-lab experiments,accurate and fast amino acid sequence-based computational methods are highly desired. In this paper, we report an online prediction tool called Mem Brain, whose input is the amino acid sequence. Mem Brain consists of specialized modules for predicting transmembrane helices, residue–residue contacts and relative accessible surface area of a-helical membrane proteins. Mem Brain achieves aprediction accuracy of 97.9% of ATMH, 87.1% of AP,3.2 ± 3.0 of N-score, 3.1 ± 2.8 of C-score. Mem BrainContact obtains 62%/64.1% prediction accuracy on training and independent dataset on top L/5 contact prediction,respectively. And Mem Brain-Rasa achieves Pearson correlation coefficient of 0.733 and its mean absolute error of13.593. These prediction results provide valuable hints for revealing the structure and function of membrane proteins.Mem Brain web server is free for academic use and available at www.csbio.sjtu.edu.cn/bioinf/Mem Brain/.展开更多
RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions, an...RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions, and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review the recent advances of using ML methods on RNA structure predictions and discuss the advantages and limitation, the difficulties and potentials of these approaches when applied in the field.展开更多
Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions which are strongly coupled to RNA structures. To understand the functions of RNAs, some st...Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions which are strongly coupled to RNA structures. To understand the functions of RNAs, some structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, emphasizing three- dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability and salt effect is also introduced briefly. Finally, we discuss the major challenges in the RNA 3D structure modeling.展开更多
The algorithm based on combination learning usually is superior to a singleclassification algorithm on the task of protein secondary structure prediction. However,the assignment of the weight of the base classifier us...The algorithm based on combination learning usually is superior to a singleclassification algorithm on the task of protein secondary structure prediction. However,the assignment of the weight of the base classifier usually lacks decision-makingevidence. In this paper, we propose a protein secondary structure prediction method withdynamic self-adaptation combination strategy based on entropy, where the weights areassigned according to the entropy of posterior probabilities outputted by base classifiers.The higher entropy value means a lower weight for the base classifier. The final structureprediction is decided by the weighted combination of posterior probabilities. Extensiveexperiments on CB513 dataset demonstrates that the proposed method outperforms theexisting methods, which can effectively improve the prediction performance.展开更多
Based on the concept of ant colony optimization and the idea of population in genetic algorithm, a novel global optimization algorithm, called the hybrid ant colony optimization (HACO), is proposed in this paper to ...Based on the concept of ant colony optimization and the idea of population in genetic algorithm, a novel global optimization algorithm, called the hybrid ant colony optimization (HACO), is proposed in this paper to tackle continuous-space optimization problems. It was compared with other well-known stochastic methods in the optimization of the benchmark functions and was also used to solve the problem of selecting appropriate dilation efficiently by optimizing the wavelet power spectrum of the hydrophobic sequence of protein, which is the key step on using continuous wavelet transform (CWT) to predict a-helices and connecting peptides.展开更多
Protein structure prediction is one of the most essential objectives practiced by theoretical chemistry and bioinformatics as it is of a vital importance in medicine,biotechnology and more.Protein secondary structure ...Protein structure prediction is one of the most essential objectives practiced by theoretical chemistry and bioinformatics as it is of a vital importance in medicine,biotechnology and more.Protein secondary structure prediction(PSSP)has a significant role in the prediction of protein tertiary structure,as it bridges the gap between the protein primary sequences and tertiary structure prediction.Protein secondary structures are classified into two categories:3-state category and 8-state category.Predicting the 3 states and the 8 states of secondary structures from protein sequences are called the Q3 prediction and the Q8 prediction problems,respectively.The 8 classes of secondary structures reveal more precise structural information for a variety of applications than the 3 classes of secondary structures,however,Q8 prediction has been found to be very challenging,that is why all previous work done in PSSP have focused on Q3 prediction.In this paper,we develop an ensemble Machine Learning(ML)approach for Q8 PSSP to explore the performance of ensemble learning algorithms compared to that of individual ML algorithms in Q8 PSSP.The ensemble members considered for constructing the ensemble models are well known classifiers,namely SVM(Support Vector Machines),KNN(K-Nearest Neighbor),DT(Decision Tree),RF(Random Forest),and NB(Naïve Bayes),with two feature extraction techniques,namely LDA(Linear Discriminate Analysis)and PCA(Principal Component Analysis).Experiments have been conducted for evaluating the performance of single models and ensemble models,with PCA and LDA,in Q8 PSSP.The novelty of this paper lies in the introduction of ensemble learning in Q8 PSSP problem.The experimental results confirmed that ensemble ML models are more accurate than individual ML models.They also indicated that features extracted by LDA are more effective than those extracted by PCA.展开更多
The secondary structure of a protein is critical for establishing a link between the protein primary and tertiary structures.For this reason,it is important to design methods for accurate protein secondary structure p...The secondary structure of a protein is critical for establishing a link between the protein primary and tertiary structures.For this reason,it is important to design methods for accurate protein secondary structure prediction.Most of the existing computational techniques for protein structural and functional prediction are based onmachine learning with shallowframeworks.Different deep learning architectures have already been applied to tackle protein secondary structure prediction problem.In this study,deep learning based models,i.e.,convolutional neural network and long short-term memory for protein secondary structure prediction were proposed.The input to proposed models is amino acid sequences which were derived from CulledPDB dataset.Hyperparameter tuning with cross validation was employed to attain best parameters for the proposed models.The proposed models enables effective processing of amino acids and attain approximately 87.05%and 87.47%Q3 accuracy of protein secondary structure prediction for convolutional neural network and long short-term memory models,respectively.展开更多
The structure type for the crystal of 4,4'-bis-(2-hydroxy-ethoxyl)-biphenyl 1 has been predicted by using the previously developed interfacial model for small organic molecules. Based on the calculated hydrophobic...The structure type for the crystal of 4,4'-bis-(2-hydroxy-ethoxyl)-biphenyl 1 has been predicted by using the previously developed interfacial model for small organic molecules. Based on the calculated hydrophobic to hydrophilic volume of 1, this model predicts the crystal structure to be of lamellar or bicontinuous type, which has been confirmed by the X-ray single-crystal structure analysis (C20H26O6, monoclinic, P21/C, a = 16.084(1), b = 6.0103(4), c = 9.6410(7) A, β9 = 103.014(2)°, V= 908.1(1) A3, Z = 2, Dc= 1.325 g/cm3, F(000)=388,μ = 0.097 mm-1, MoKα radiation, λ = 0.71073 A, R = 0.0382 and wR = 0.0882 with I > 2σ(I) for 7121 reflections collected, 1852 unique reflections and 170 parameters). As predicted, the hydrophobic and hydrophilic portions of 1 form in the lamellae. The same interfacial model is applied to other amphilphilic small molecule organic systems for structural type prediction.展开更多
Secondary structures of RNAs are the basis of understanding their tertiary structures and functions and so their predictions are widely needed due to increasing discovery of noncoding RNAs.In the last decades,a lot of...Secondary structures of RNAs are the basis of understanding their tertiary structures and functions and so their predictions are widely needed due to increasing discovery of noncoding RNAs.In the last decades,a lot of methods have been proposed to predict RNA secondary structures but their accuracies encountered bottleneck.Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods.We further show that the results can also be used to improve the prediction accuracy of the single-sequence methods.展开更多
In this paper, the applications of evolutionary algorithm in prediction of protein secondary structure and tertiary structures are introduced, and recent studies on solving protein structure prediction problems using ...In this paper, the applications of evolutionary algorithm in prediction of protein secondary structure and tertiary structures are introduced, and recent studies on solving protein structure prediction problems using evolutionary algorithms are reviewed, and the challenges and prospects of EAs applied to protein structure modeling are analyzed and discussed.展开更多
A simple stepwise folding process has been developed to simulate RNA secondary structure formation.Modifications for the energy parameters of various loops were included in the program.Five possible types of pseudokno...A simple stepwise folding process has been developed to simulate RNA secondary structure formation.Modifications for the energy parameters of various loops were included in the program.Five possible types of pseudoknots including the well known H-type pseudoknot were permitted to occur if reasonable.We have applied this approach to e number of RNA sequences.The prediction accuracies we obtained were higher than those in published papers.展开更多
The architecture of a BioAccel (internal code) chip for RNA secondary structure prediction is described in the letter. The system is based on a BioBus (internal code), whose distinguishing features are: Two separated ...The architecture of a BioAccel (internal code) chip for RNA secondary structure prediction is described in the letter. The system is based on a BioBus (internal code), whose distinguishing features are: Two separated control and data channels, and a slave-associated arbitration scheme. Two reference systems based on the AMBA AHB bus and Coreconnect bus are introduced to evaluate the performance of the system. The simulation results are attractive. The average communication bandwidth of the chip is increased at severalfold, and the read and write latencies are reduced about 40 percent.展开更多
The hydrophobic-polar (HP) lattice model is an important simplified model for studying protein folding. In this paper, we present an improved ACO algorithm for the protein structure prediction. In the algorithm, the &...The hydrophobic-polar (HP) lattice model is an important simplified model for studying protein folding. In this paper, we present an improved ACO algorithm for the protein structure prediction. In the algorithm, the "lone"ethod is applied to deal with the infeasible structures, and the "oint mutation and reconstruction"ethod is applied in local search phase. The empirical results show that the presented method is feasible and effective to solve the problem of protein structure prediction, and notable improvements in CPU time are obtained.展开更多
A three-dimensional off-lattice protein model with two species of monomers, hydrophobic and hydrophilic, is studied. Enligh- tened by the law of reciprocity among things in the physical world, a heuristic quasi-physic...A three-dimensional off-lattice protein model with two species of monomers, hydrophobic and hydrophilic, is studied. Enligh- tened by the law of reciprocity among things in the physical world, a heuristic quasi-physical algorithm for protein structure prediction problem is put forward. First, by elaborately simulating the movement of the smooth elastic balls in the physical world, the algorithm finds low energy configurations for a given monomer chain. An "off-trap" strategy is then proposed to get out of local minima. Experimental results show promising performance. For all chains with lengths 13≤n ≤55, the proposed algorithm finds states with lower energy than the putative ground states reported in literatures. Furthermore, for chain lengths n = 21, 34, and 55, the algorithm finds new low energy configurations different from those given in literatures.展开更多
In this paper, on the basis of the heat conduction equation without consideration of the advection and turbulence effects, one-dimensional model for describing surface sea temperature ( T1), bottom sea temperature ( T...In this paper, on the basis of the heat conduction equation without consideration of the advection and turbulence effects, one-dimensional model for describing surface sea temperature ( T1), bottom sea temperature ( Tt ) and the thickness of the upper homogeneous layer ( h ) is developed in terms of the dimensionless temperature θT and depth η and self-simulation function θT - f(η) of vertical temperature profile by means of historical temperature data.The results of trial prediction with our one-dimensional model on T, Th, h , the thickness and gradient of thermocline are satisfactory to some extent.展开更多
文摘Superconductive properties for oxides were predicted by artificial neural network (ANN) method with structural and chemical parameters as inputs. The predicted properties include superconductivity for oxides, distributed ranges of the superconductive transition temperature (Tc) for complex oxides, and Tc values for cuprate superconductors. The calculated results indicated that the adjusted ANN can be used to predict superconductive properties for unknown oxides.
基金This work was supported by the National Natural Science Foundation of China (No.20801024 and No.21001055), the Natural Science Foundation of Jiangsu Province (No.BK2009077), and the Science Foundation of Health Department of Jiangsu Province (No.H200963).
文摘Density functional theory method has been employed to investigate the structures of the prototypical technetium-labeled diphosphonate complex 99mTc-MDP, where MDP represents methylenediphosphonic acid. A total of 14 trial structures were generated by allowing for the geometric, conformational, charge, and spin isomerism. Based on the optimized structures and calculated energies at the B3LYP/LANL2DZ level, two stable isomers were determined for the title complex. And they were further studied systematically in comparison with the experimental structure. The basis sets 6-31G*(LANL2DZ for Tc), 6-31G*(cc-pVDZ-pp for Tc), and DGDZVP have also been employed in combination with the B3LYP functional to study the basis set effect on the geometries of isomers. The optimized structures agree well with the available experimental data, and the bond lengths are more sensitive to the basis set than the bond angles. The charge distributions were studied by the Mulliken population analysis and natural bond orbital analysis. The results reflect a significant ligand-to-metal electron donation.
基金support from the National Natural Science Foundation of China(52025083 and U2139209)XPLORER PRIZE of New Cornerstone Science Foundation,the Shanghai Social Development Science and Technology Research Project(22dz1201400)the Shanghai Urban Digital Transformation Special Fund(202201033).
文摘High-precision and efficient structural response prediction is essential for intelligent disaster prevention and mitigation in building structures,including post-earthquake damage assessment,structural health monitoring,and seismic resilience assessment of buildings.To improve the accuracy and efficiency of structural response prediction,this study proposes a novel physics-informed deep-learning-based realtime structural response prediction method that can predict a large number of nodes in a structure through a data-driven training method and an autoregressive training strategy.The proposed method includes a Phy-Seisformer model that incorporates the physical information of the structure into the model,thereby enabling higher-precision predictions.Experiments were conducted on a four-story masonry structure,an eleven-story reinforced concrete irregular structure,and a twenty-one-story reinforced concrete frame structure to verify the accuracy and efficiency of the proposed method.In addition,the effectiveness of the structure in the Phy-Seisformer model was verified using an ablation study.Furthermore,by conducting a comparative experiment,the impact of the range of seismic wave amplitudes on the prediction accuracy was studied.The experimental results show that the method proposed in this paper can achieve very high accuracy and at least 5000 times faster calculation speed than finite element calculations for different types of building structures.
基金Supported by the Science Foundation of Hengyang Normal University of China(09A36)~~
文摘[Objective] To examine the grammar model based on lexical substring exac- tion for RNA secondary structure prediction. [Method] By introducing cloud model into stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide RNA sequence to acquire lexical substring, and the cloud classifier was used to search the maximum probability of each lemma which was marked as a certain sec- ondary structure type. Then, the lemma information was introduced into the training stochastic grammar process as prior information, realizing the prediction on the sec- ondary structure of RNA, and the method was tested by experiment. [Result] The experimental results showed that the prediction accuracy and searching speed of stochastic grammar cloud model were significantly improved from the prediction with simple stochastic grammar. [Conclusion] This study laid the foundation for the wide application of stochastic grammar model for RNA secondary structure prediction.
文摘Structure-stratigraphy analysis" is a new method used in the study and prediction of and small-scaled structures in coal mines. The object of this method is coalbed structure that includes the folds and fracture occurred in the vicinity of coal-seams. It emphases the analysis on the relationship between structural deformation and stratal lithologic combination.Based on the statistics of a series of related parameters in stratigraphy and structure,comprehensive analysis and drawing, this method may provide a good means for the quantitative evaluation and prediction of small scale structure in coal mines.
基金supported by the National Natural Science Foundation of China(Nos.61671288,91530321,61603161)Science and Technology Commission of Shanghai Municipality(Nos.16JC1404300,17JC1403500,16ZR1448700)
文摘Membrane proteins are an important kind of proteins embedded in the membranes of cells and play crucial roles in living organisms, such as ion channels,transporters, receptors. Because it is difficult to determinate the membrane protein's structure by wet-lab experiments,accurate and fast amino acid sequence-based computational methods are highly desired. In this paper, we report an online prediction tool called Mem Brain, whose input is the amino acid sequence. Mem Brain consists of specialized modules for predicting transmembrane helices, residue–residue contacts and relative accessible surface area of a-helical membrane proteins. Mem Brain achieves aprediction accuracy of 97.9% of ATMH, 87.1% of AP,3.2 ± 3.0 of N-score, 3.1 ± 2.8 of C-score. Mem BrainContact obtains 62%/64.1% prediction accuracy on training and independent dataset on top L/5 contact prediction,respectively. And Mem Brain-Rasa achieves Pearson correlation coefficient of 0.733 and its mean absolute error of13.593. These prediction results provide valuable hints for revealing the structure and function of membrane proteins.Mem Brain web server is free for academic use and available at www.csbio.sjtu.edu.cn/bioinf/Mem Brain/.
基金Project supported by the National Natural Science Foundation of China (Grant Nos. 11774158, 11974173, 11774157, and 11934008)。
文摘RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions, and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review the recent advances of using ML methods on RNA structure predictions and discuss the advantages and limitation, the difficulties and potentials of these approaches when applied in the field.
基金supported by the National Natural Science Foundation of China(Grant Nos.11074191,11175132,and 11374234)the National Basic Research Programof China(Grant No.2011CB933600)the Program for New Century Excellent Talents of China(Grant No.NCET 08-0408)
文摘Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions which are strongly coupled to RNA structures. To understand the functions of RNAs, some structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, emphasizing three- dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability and salt effect is also introduced briefly. Finally, we discuss the major challenges in the RNA 3D structure modeling.
文摘The algorithm based on combination learning usually is superior to a singleclassification algorithm on the task of protein secondary structure prediction. However,the assignment of the weight of the base classifier usually lacks decision-makingevidence. In this paper, we propose a protein secondary structure prediction method withdynamic self-adaptation combination strategy based on entropy, where the weights areassigned according to the entropy of posterior probabilities outputted by base classifiers.The higher entropy value means a lower weight for the base classifier. The final structureprediction is decided by the weighted combination of posterior probabilities. Extensiveexperiments on CB513 dataset demonstrates that the proposed method outperforms theexisting methods, which can effectively improve the prediction performance.
基金the National Natural Science Foundation of China(No.20475068) the Guangdong Provincial Natural Science Foundation(No.031577).
文摘Based on the concept of ant colony optimization and the idea of population in genetic algorithm, a novel global optimization algorithm, called the hybrid ant colony optimization (HACO), is proposed in this paper to tackle continuous-space optimization problems. It was compared with other well-known stochastic methods in the optimization of the benchmark functions and was also used to solve the problem of selecting appropriate dilation efficiently by optimizing the wavelet power spectrum of the hydrophobic sequence of protein, which is the key step on using continuous wavelet transform (CWT) to predict a-helices and connecting peptides.
文摘Protein structure prediction is one of the most essential objectives practiced by theoretical chemistry and bioinformatics as it is of a vital importance in medicine,biotechnology and more.Protein secondary structure prediction(PSSP)has a significant role in the prediction of protein tertiary structure,as it bridges the gap between the protein primary sequences and tertiary structure prediction.Protein secondary structures are classified into two categories:3-state category and 8-state category.Predicting the 3 states and the 8 states of secondary structures from protein sequences are called the Q3 prediction and the Q8 prediction problems,respectively.The 8 classes of secondary structures reveal more precise structural information for a variety of applications than the 3 classes of secondary structures,however,Q8 prediction has been found to be very challenging,that is why all previous work done in PSSP have focused on Q3 prediction.In this paper,we develop an ensemble Machine Learning(ML)approach for Q8 PSSP to explore the performance of ensemble learning algorithms compared to that of individual ML algorithms in Q8 PSSP.The ensemble members considered for constructing the ensemble models are well known classifiers,namely SVM(Support Vector Machines),KNN(K-Nearest Neighbor),DT(Decision Tree),RF(Random Forest),and NB(Naïve Bayes),with two feature extraction techniques,namely LDA(Linear Discriminate Analysis)and PCA(Principal Component Analysis).Experiments have been conducted for evaluating the performance of single models and ensemble models,with PCA and LDA,in Q8 PSSP.The novelty of this paper lies in the introduction of ensemble learning in Q8 PSSP problem.The experimental results confirmed that ensemble ML models are more accurate than individual ML models.They also indicated that features extracted by LDA are more effective than those extracted by PCA.
文摘The secondary structure of a protein is critical for establishing a link between the protein primary and tertiary structures.For this reason,it is important to design methods for accurate protein secondary structure prediction.Most of the existing computational techniques for protein structural and functional prediction are based onmachine learning with shallowframeworks.Different deep learning architectures have already been applied to tackle protein secondary structure prediction problem.In this study,deep learning based models,i.e.,convolutional neural network and long short-term memory for protein secondary structure prediction were proposed.The input to proposed models is amino acid sequences which were derived from CulledPDB dataset.Hyperparameter tuning with cross validation was employed to attain best parameters for the proposed models.The proposed models enables effective processing of amino acids and attain approximately 87.05%and 87.47%Q3 accuracy of protein secondary structure prediction for convolutional neural network and long short-term memory models,respectively.
基金This work was supported by the National Science Foundation(Grant DMR-9812351)
文摘The structure type for the crystal of 4,4'-bis-(2-hydroxy-ethoxyl)-biphenyl 1 has been predicted by using the previously developed interfacial model for small organic molecules. Based on the calculated hydrophobic to hydrophilic volume of 1, this model predicts the crystal structure to be of lamellar or bicontinuous type, which has been confirmed by the X-ray single-crystal structure analysis (C20H26O6, monoclinic, P21/C, a = 16.084(1), b = 6.0103(4), c = 9.6410(7) A, β9 = 103.014(2)°, V= 908.1(1) A3, Z = 2, Dc= 1.325 g/cm3, F(000)=388,μ = 0.097 mm-1, MoKα radiation, λ = 0.71073 A, R = 0.0382 and wR = 0.0882 with I > 2σ(I) for 7121 reflections collected, 1852 unique reflections and 170 parameters). As predicted, the hydrophobic and hydrophilic portions of 1 form in the lamellae. The same interfacial model is applied to other amphilphilic small molecule organic systems for structural type prediction.
基金Project supported by the National Natural Science Foundation of China(Grant No.31570722).
文摘Secondary structures of RNAs are the basis of understanding their tertiary structures and functions and so their predictions are widely needed due to increasing discovery of noncoding RNAs.In the last decades,a lot of methods have been proposed to predict RNA secondary structures but their accuracies encountered bottleneck.Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods.We further show that the results can also be used to improve the prediction accuracy of the single-sequence methods.
基金Supported by the National Natural Science Foundation of China(60133010,70071042,60073043)
文摘In this paper, the applications of evolutionary algorithm in prediction of protein secondary structure and tertiary structures are introduced, and recent studies on solving protein structure prediction problems using evolutionary algorithms are reviewed, and the challenges and prospects of EAs applied to protein structure modeling are analyzed and discussed.
文摘A simple stepwise folding process has been developed to simulate RNA secondary structure formation.Modifications for the energy parameters of various loops were included in the program.Five possible types of pseudoknots including the well known H-type pseudoknot were permitted to occur if reasonable.We have applied this approach to e number of RNA sequences.The prediction accuracies we obtained were higher than those in published papers.
基金Supported by the National Natrual Science Foundation of China (No.60373044) and Knowl-edge Innovative Project of CAS (No.KSCX2-SW-233).
文摘The architecture of a BioAccel (internal code) chip for RNA secondary structure prediction is described in the letter. The system is based on a BioBus (internal code), whose distinguishing features are: Two separated control and data channels, and a slave-associated arbitration scheme. Two reference systems based on the AMBA AHB bus and Coreconnect bus are introduced to evaluate the performance of the system. The simulation results are attractive. The average communication bandwidth of the chip is increased at severalfold, and the read and write latencies are reduced about 40 percent.
文摘The hydrophobic-polar (HP) lattice model is an important simplified model for studying protein folding. In this paper, we present an improved ACO algorithm for the protein structure prediction. In the algorithm, the "lone"ethod is applied to deal with the infeasible structures, and the "oint mutation and reconstruction"ethod is applied in local search phase. The empirical results show that the presented method is feasible and effective to solve the problem of protein structure prediction, and notable improvements in CPU time are obtained.
基金The National Natural Science Founda-tion of China (No.10471051) and the National Basic Research Program (973) of China (No.2004CB318000)
文摘A three-dimensional off-lattice protein model with two species of monomers, hydrophobic and hydrophilic, is studied. Enligh- tened by the law of reciprocity among things in the physical world, a heuristic quasi-physical algorithm for protein structure prediction problem is put forward. First, by elaborately simulating the movement of the smooth elastic balls in the physical world, the algorithm finds low energy configurations for a given monomer chain. An "off-trap" strategy is then proposed to get out of local minima. Experimental results show promising performance. For all chains with lengths 13≤n ≤55, the proposed algorithm finds states with lower energy than the putative ground states reported in literatures. Furthermore, for chain lengths n = 21, 34, and 55, the algorithm finds new low energy configurations different from those given in literatures.
文摘In this paper, on the basis of the heat conduction equation without consideration of the advection and turbulence effects, one-dimensional model for describing surface sea temperature ( T1), bottom sea temperature ( Tt ) and the thickness of the upper homogeneous layer ( h ) is developed in terms of the dimensionless temperature θT and depth η and self-simulation function θT - f(η) of vertical temperature profile by means of historical temperature data.The results of trial prediction with our one-dimensional model on T, Th, h , the thickness and gradient of thermocline are satisfactory to some extent.