Funding: Supported by the National Natural Science Foundation of China (60133010, 70071042, 60073043)
Abstract: This paper introduces the applications of evolutionary algorithms (EAs) to the prediction of protein secondary and tertiary structures, reviews recent studies on solving protein structure prediction problems with evolutionary algorithms, and analyzes and discusses the challenges and prospects of EAs applied to protein structure modeling.
Abstract: The hydrophobic-polar (HP) lattice model is an important simplified model for studying protein folding. In this paper, we present an improved ant colony optimization (ACO) algorithm for protein structure prediction. In the algorithm, the "lone" method is applied to deal with infeasible structures, and the "point mutation and reconstruction" method is applied in the local search phase. The empirical results show that the presented method is feasible and effective for protein structure prediction, and that notable improvements in CPU time are obtained.
Funding: The National Natural Science Foundation of China (No. 10471051) and the National Basic Research Program (973) of China (No. 2004CB318000)
Abstract: A three-dimensional off-lattice protein model with two species of monomers, hydrophobic and hydrophilic, is studied. Enlightened by the law of reciprocity among things in the physical world, a heuristic quasi-physical algorithm for the protein structure prediction problem is put forward. First, by elaborately simulating the movement of smooth elastic balls in the physical world, the algorithm finds low-energy configurations for a given monomer chain. An "off-trap" strategy is then proposed to escape from local minima. Experimental results show promising performance. For all chains with lengths 13 ≤ n ≤ 55, the proposed algorithm finds states with lower energy than the putative ground states reported in the literature. Furthermore, for chain lengths n = 21, 34, and 55, the algorithm finds new low-energy configurations different from those given in the literature.
Funding: The National Key R&D Program of China (Grant No. 2020YFA0907000) and the National Natural Science Foundation of China (Grant Nos. 32271297, 62072435, 31770775, and 31671369) provided financial support for this study and publication charges.
Abstract: Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields, including biochemistry, medicine, physics, mathematics, and computer science. These researchers adopt various research paradigms to attack the same structure prediction problem: biochemists and physicists attempt to reveal the principles governing protein folding; mathematicians, especially statisticians, usually start by assuming a probability distribution of protein structures given a target sequence and then find the most likely structure; and computer scientists formulate protein structure prediction as an optimization problem, namely finding the structural conformation with the lowest energy or minimizing the difference between the predicted structure and the native structure. These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman, namely data modeling and algorithmic modeling. Recently, we have also witnessed the great success of deep learning in protein structure prediction. In this review, we present a survey of the efforts in protein structure prediction. We compare the research paradigms adopted by researchers from different fields, with an emphasis on the shift of research paradigms in the era of deep learning. In short, algorithmic modeling techniques, especially deep neural networks, have considerably improved the accuracy of protein structure prediction; however, theories interpreting the neural networks and knowledge of protein folding are still highly desired.
Funding: Partially supported by the National Council of Science and Technology of México (CONACyT) under Grant Nos. 105060 and 99276
Abstract: The HP model for protein structure prediction abstracts the fact that hydrophobicity is a dominant force in the protein folding process. This challenging combinatorial optimization problem has been widely addressed through metaheuristics. The evaluation function is a key component for the success of metaheuristics; the poor discrimination of the conventional evaluation function of the HP model has motivated the proposal of alternative formulations for this component. This comparative analysis examines the effectiveness of seven different evaluation functions for the HP model. It analyzes the degree of discrimination provided by each of the studied functions, their capability to preserve a rank ordering among potential solutions that is consistent with the original objective of the HP model, and their effect on the performance of local search methods. The results indicate that studying alternative evaluation schemes for the HP model is a highly valuable direction that merits more attention.
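For orientation, the conventional HP evaluation function mentioned above can be sketched as follows. This is a minimal illustration, not code from any of the listed papers: it assumes a two-dimensional square lattice and the usual convention of energy -1 for each pair of H residues that are lattice neighbours but not adjacent in the chain. The name `hp_energy` and the coordinate representation are hypothetical.

```python
from typing import List, Tuple


def hp_energy(sequence: str, coords: List[Tuple[int, int]]) -> int:
    """Conventional HP energy on a 2D square lattice (assumed setting).

    Each pair of 'H' residues that occupy neighbouring lattice sites but are
    not consecutive in the chain contributes -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    # Map each occupied lattice site to the index of the residue on it.
    position = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only to the right and upward so each contact is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

Because the energy is an integer count of contacts, many different conformations share the same value, which is the poor discrimination that alternative evaluation functions try to address.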
Funding: The National Natural Science Foundation of China (No. 20475068) and the Guangdong Provincial Natural Science Foundation (No. 031577).
Abstract: Based on the concept of ant colony optimization and the idea of a population in genetic algorithms, a novel global optimization algorithm, called hybrid ant colony optimization (HACO), is proposed in this paper to tackle continuous-space optimization problems. It was compared with other well-known stochastic methods on the optimization of benchmark functions. It was also used to efficiently select an appropriate dilation by optimizing the wavelet power spectrum of the hydrophobic sequence of a protein, which is the key step in using the continuous wavelet transform (CWT) to predict α-helices and connecting peptides.
Abstract: An algorithm based on combination learning is usually superior to a single classification algorithm on the task of protein secondary structure prediction. However, the assignment of weights to the base classifiers usually lacks decision-making evidence. In this paper, we propose a protein secondary structure prediction method with a dynamic, self-adaptive combination strategy based on entropy, in which the weights are assigned according to the entropy of the posterior probabilities output by the base classifiers. A higher entropy value means a lower weight for the base classifier. The final structure prediction is decided by the weighted combination of posterior probabilities. Extensive experiments on the CB513 dataset demonstrate that the proposed method outperforms existing methods and can effectively improve prediction performance.
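The entropy-based weighting idea can be illustrated with a small sketch. The abstract does not give the exact weighting formula, so the one used here (weight proportional to one minus the normalized Shannon entropy) is an assumption chosen only to satisfy "higher entropy means lower weight". The function names are hypothetical.

```python
import math
from typing import List, Sequence


def shannon_entropy(p: Sequence[float]) -> float:
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def combine_posteriors(posteriors: List[Sequence[float]]) -> List[float]:
    """Entropy-weighted average of per-classifier posteriors for one residue.

    Assumed weighting (not taken from the paper): w_i is proportional to
    1 - H(p_i) / log(K), where K >= 2 is the number of classes.
    """
    n_classes = len(posteriors[0])
    max_entropy = math.log(n_classes)
    # Clamp at zero to guard against tiny negative values from rounding.
    raw = [max(0.0, 1.0 - shannon_entropy(p) / max_entropy) for p in posteriors]
    total = sum(raw)
    if total == 0.0:
        # Every classifier is maximally uncertain: fall back to equal weights.
        weights = [1.0 / len(posteriors)] * len(posteriors)
    else:
        weights = [w / total for w in raw]
    return [
        sum(w * p[k] for w, p in zip(weights, posteriors))
        for k in range(n_classes)
    ]


def predict_class(posteriors: List[Sequence[float]]) -> int:
    """Index of the class with the highest combined posterior."""
    combined = combine_posteriors(posteriors)
    return max(range(len(combined)), key=combined.__getitem__)
```

Because the weights are recomputed for every residue, a classifier that is confident at one position and uncertain at another contributes differently at each, which is the sense in which the combination is dynamic.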
Abstract: Protein structure prediction is one of the most essential objectives in theoretical chemistry and bioinformatics, as it is of vital importance in medicine, biotechnology, and more. Protein secondary structure prediction (PSSP) plays a significant role in the prediction of protein tertiary structure, as it bridges the gap between protein primary sequences and tertiary structure prediction. Protein secondary structures are classified into two categories: the 3-state category and the 8-state category. Predicting the 3 states and the 8 states of secondary structure from protein sequences are called the Q3 prediction and Q8 prediction problems, respectively. The 8 classes of secondary structure reveal more precise structural information for a variety of applications than the 3 classes; however, Q8 prediction has been found to be very challenging, which is why all previous work in PSSP has focused on Q3 prediction. In this paper, we develop an ensemble machine learning (ML) approach for Q8 PSSP to explore the performance of ensemble learning algorithms compared with that of individual ML algorithms in Q8 PSSP. The ensemble members considered for constructing the ensemble models are well-known classifiers, namely SVM (Support Vector Machine), KNN (K-Nearest Neighbor), DT (Decision Tree), RF (Random Forest), and NB (Naïve Bayes), with two feature extraction techniques, namely LDA (Linear Discriminant Analysis) and PCA (Principal Component Analysis). Experiments were conducted to evaluate the performance of single models and ensemble models, with PCA and LDA, in Q8 PSSP. The novelty of this paper lies in the introduction of ensemble learning to the Q8 PSSP problem. The experimental results confirmed that ensemble ML models are more accurate than individual ML models. They also indicated that features extracted by LDA are more effective than those extracted by PCA.
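The relation between 8-state and 3-state labels, and the per-residue accuracy that Q3 and Q8 scores report, can be sketched as below. The reduction table is one commonly used convention for DSSP-style labels and is an assumption here, since the abstract does not state which mapping was used. The names `Q8_TO_Q3`, `reduce_q8`, and `q_accuracy` are hypothetical.

```python
# One common 8-state to 3-state reduction (assumed; other variants exist):
# helices H, G, I -> H; strands E, B -> E; everything else -> C (coil).
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C", "L": "C", "-": "C",
}


def reduce_q8(labels: str) -> str:
    """Map a string of 8-state labels to 3-state labels."""
    return "".join(Q8_TO_Q3[state] for state in labels)


def q_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state equals the observed state.

    Gives Q8 accuracy on 8-state strings and Q3 accuracy on 3-state strings.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

A prediction can be wrong in the 8-state alphabet yet correct after reduction (for example G predicted as H), which is one reason Q8 accuracy is typically lower than Q3 accuracy for the same model.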
Abstract: The secondary structure of a protein is critical for establishing a link between the protein's primary and tertiary structures. For this reason, it is important to design methods for accurate protein secondary structure prediction. Most existing computational techniques for protein structural and functional prediction are based on machine learning with shallow frameworks. Different deep learning architectures have already been applied to the protein secondary structure prediction problem. In this study, deep learning based models, i.e., a convolutional neural network and a long short-term memory network, were proposed for protein secondary structure prediction. The input to the proposed models is amino acid sequences derived from the CulledPDB dataset. Hyperparameter tuning with cross validation was employed to obtain the best parameters for the proposed models. The proposed models enable effective processing of amino acid sequences and attain approximately 87.05% and 87.47% Q3 accuracy for the convolutional neural network and long short-term memory models, respectively.
Funding: Financial support from the National Natural Science Foundation of China (Grant Nos. 52225001 and 51978485) and the State Key Laboratory for Pollution Control (China) is acknowledged.
Abstract: The deep-learning protein structure prediction method AlphaFold2 has garnered enormous attention beyond the realm of structural biology for its groundbreaking contribution to solving the "protein folding problem". In this perspective, we explore the connection between protein structure studies and environmental research, delving into the potential for addressing specific environmental challenges. Proteins are promising for environmental applications because of the functional diversity endowed by their structural complexity. However, structural studies on proteins with environmental significance remain scarce. Here, we present the opportunity to study proteins by advancing experimental determination and deep-learning prediction methods. Specifically, the latest progress in environmental research via cryogenic electron microscopy is highlighted. It allows us to determine the structure of protein complexes in their native state within cells at molecular resolution, revealing environmentally associated structural dynamics. With the remarkable advancements in computational power and experimental resolution, the study of protein structure and dynamics has reached unprecedented depth and accuracy. These advancements will undoubtedly accelerate the establishment of comprehensive environmental protein structural and functional databases. Tremendous opportunities for protein engineering exist to enable innovative solutions for environmental applications, such as the degradation of persistent contaminants and the recovery of valuable metals as well as rare earth elements.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 90103031, 10474041, 90403120 and 10021001), and the Nonlinear Project (973) of the NSM.
Abstract: The folding dynamics and structural characteristics of the peptides RTKAWNRQLYPEW (P1) and RTKQLYPEW (P2) are investigated in this work using the all-atom simulation program CHARMM. The results show that P1, a segment of an antigen, has an α-helix folding motif, whereas P2, which is derived by deleting the four residues AWNR from peptide P1, prevents the formation of a helix and presents a β-strand. In addition, peptide P1 experiences a more rugged energy landscape than peptide P2. From our results, it is inferred that the antibody CD8 cytolytic T lymphocyte prefers an antigen with a β-folding structure to one with an α-helical structure.
Abstract: It is well accepted that, according to the theory of protein folding, the folding energy landscape may resemble a funnel. This "folding funnel" theory has been extensively studied and is thought to play an important role in guiding the sampling process in protein folding and the refinement step in protein structure prediction. Here, we have investigated the relationship between the "funnel likeness" of protein folding and the size and structure of proteins, based on a set of non-homologous proteins that we recently evaluated using the statistical mechanics-based scoring function ITScorePro. It was found that larger proteins that consist of more helix/sheet structures tend to have a higher correlation between score and root mean square deviation (RMSD), that is, a more funnel-like energy landscape. Another measurement in protein folding, the Z-score, has also shown some correlation with the size of the proteins. As expected, proteins with a better "folding funnel likeness" (or score-RMSD correlation) tend to have a better-predicted conformation with a lower RMSD from their native structures. These findings can be extremely valuable for the development and improvement of sampling and scoring algorithms for protein structure prediction.
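The two quantities used above to characterize a scoring landscape can be sketched as follows. The abstract does not define them precisely, so this sketch assumes the usual choices: the Pearson correlation between decoy scores and decoy RMSDs, and a Z-score equal to the native score minus the mean decoy score, divided by the population standard deviation of the decoy scores. The function names are hypothetical, and ITScorePro itself is not reproduced.

```python
import math
from typing import Sequence


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def native_z_score(native_score: float, decoy_scores: Sequence[float]) -> float:
    """Z-score of the native structure relative to the decoy score distribution.

    Assumed definition: (native - mean(decoys)) / population std(decoys).
    """
    mean = sum(decoy_scores) / len(decoy_scores)
    variance = sum((s - mean) ** 2 for s in decoy_scores) / len(decoy_scores)
    if variance == 0.0:
        raise ValueError("decoy scores have zero variance")
    return (native_score - mean) / math.sqrt(variance)
```

With a scoring function where lower is better, a funnel-like landscape shows up as a strongly positive score-RMSD correlation and a strongly negative native Z-score.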
Funding: Supported by National Institutes of Health grants R21/R33-GM078601 and R01-GM100701
Abstract: Protein structure Quality Assessment (QA) is an essential component of protein structure prediction and analysis. The relationship between protein sequence and structure often serves as a basis for protein structure QA. In this work, we developed a new Hidden Markov Model (HMM) to assess the compatibility of protein sequence and structure and so capture their complex relationship. More specifically, the emission of the HMM consists of protein local structures in angular space, secondary structures, and sequence profiles. This model has two capabilities: (1) encoding the local structure of each position by jointly considering sequence and structure information, and (2) assigning a global score to estimate the overall quality of a predicted structure, as well as local scores to assess the quality of specific regions of a structure, which provides useful guidance for targeted structure refinement. We compared the HMM model with the state-of-the-art single-structure quality assessment methods OPUSCA, DFIRE, GOAP, and RW in protein structure selection. Computational results showed that our new score, HMM.Z, can achieve better overall selection performance on the benchmark datasets.
Funding: Supported by the National Natural Science Foundation of China (Nos. 90203011 and 30370354) and the Ministry of Education of China (Nos. 505010 and CG2003-GA002)
Abstract: The three-dimensional (3D) structure prediction of proteins is an important task in bioinformatics. Finding energy functions that better represent residue-residue and residue-solvent interactions is a crucial way to improve prediction accuracy. The widely used contact energy functions mostly consider only the contact frequency between different types of residues; however, we find that the contact frequency also relates to the hydrophobic environment of the residues. Accordingly, we present an improved contact energy function that integrates the two factors and can reflect the influence of hydrophobic interaction on the stabilization of protein 3D structure more effectively. Furthermore, a fold recognition (threading) approach based on this energy function is developed. Test results obtained with 20 randomly selected proteins demonstrate that, compared with common contact energy functions, the proposed energy function can improve the accuracy of fold template prediction from 20% to 50%, and can also improve the accuracy of sequence-template alignment from 35% to 65%.
Funding: Supported by the National Natural Science Foundation of China (60773010)
Abstract: Protein structure prediction is one of the most important problems in structural biology. β-turns are always located at the turns of a protein's tertiary structure, and thus β-turn prediction is a key step in tertiary structure prediction. There are several methods for predicting β-turns based on machine learning techniques such as the k-nearest method, neural networks, and support vector machines. In this paper, we construct a classifier using double BP networks and put forward two novel methods for coding amino acids in the second network. When trained and tested on different datasets, they achieve higher accuracy than other coding methods.
Abstract: With the rapid development of multiple technologies, from the Internet to mobile phones and cameras, visual data is now widely available in huge amounts and great variety, bringing significant opportunities for novel processing of visual information as well as for commercial applications.