Abstract: Here, a new method integrating machine learning with Chou’s pseudo amino acid composition is proposed for in silico epitope mapping of severe acute respiratory syndrome-like coronavirus antigens. For this, a training dataset including 266 linear B-cell epitopes, 1,267 T-cell epitopes, and 1,280 non-epitopes was prepared. The epitope sequences were then converted to numerical vectors using Chou’s pseudo amino acid composition method. The vectors were then fed to support vector machine, random forest, artificial neural network, and K-nearest neighbor algorithms for classification. The algorithm with the highest performance was selected for the epitope mapping procedure. Based on the results, the random forest algorithm was the most accurate classifier, with an accuracy of 0.934, followed by K-nearest neighbor, artificial neural network, and support vector machine, respectively. Furthermore, the efficacy of the epitopes predicted by the trained random forest algorithm was assessed through their antigenicity potential, using the VaxiJen score, and their affinity to human B-cell receptor and MHC-I/II alleles, using molecular docking. The predicted epitopes, especially the B-cell epitopes, had high antigenicity potentials and good affinities to the protein targets. According to the results, the suggested method can be considered for developing specific epitope predictor software as well as an accelerated pipeline for designing a serotype-independent vaccine against the virus.
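The vectorization step described in the abstract above can be illustrated with a minimal sketch of a type-1 pseudo amino acid composition, assuming the commonly cited formulation (20 composition terms followed by λ sequence-order correlation terms). The abstract does not state which physicochemical properties, λ, or weight factor were used; the Kyte-Doolittle hydropathy scale, `lam=3`, and `w=0.05` below are stand-in assumptions, and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here only as an example property
# (an assumption; the abstract does not name the properties used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(prop):
    """Rescale a property table to zero mean and unit variance over the 20 residues."""
    values = [prop[a] for a in AMINO_ACIDS]
    mu, sd = mean(values), pstdev(values)
    return {a: (prop[a] - mu) / sd for a in AMINO_ACIDS}


def pseaac(seq, props, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector."""
    seq = seq.upper()
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    # First 20 components: plain amino acid frequencies.
    counts = Counter(seq)
    freqs = [counts[a] / n for a in AMINO_ACIDS]

    # Sequence-order terms: mean squared property difference between
    # residues that are k positions apart, for k = 1..lam.
    std_props = [standardize(p) for p in props]
    thetas = []
    for k in range(1, lam + 1):
        total = 0.0
        for i in range(n - k):
            total += mean((p[seq[i]] - p[seq[i + k]]) ** 2 for p in std_props)
        thetas.append(total / (n - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

Calling `pseaac(peptide, [KYTE_DOOLITTLE])` on each epitope or non-epitope sequence would give fixed-length vectors that any of the four classifiers named in the abstract could consume, regardless of peptide length.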
Abstract: Detecting remote homology proteins is a challenging problem for both basic research and drug development. Although there are a couple of methods to deal with this problem, the benchmark datasets on which the existing methods were trained and tested contain many highly homologous samples, as reflected by the fact that the cutoff threshold was set at 95%. In this study, we reconstructed the benchmark dataset by setting the threshold at 40%, meaning that none of the proteins included in the benchmark dataset has more than 40% pairwise sequence identity with any other in the same subset. Using the new benchmark dataset, we proposed a new predictor called “dRHP-GreyFun” based on grey modeling and the functional domain approach. Rigorous cross-validations have indicated that the new predictor is superior to its counterparts in both enhancing success rates and reducing computational cost. The predictor can be downloaded from https://github.com/jcilwz/dRHP-GreyFun.
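The 40% threshold described above can be sketched as a greedy redundancy filter. The abstract does not say which tool or identity definition was used, so this is an illustration under stated assumptions: identity is taken as the fraction of identical columns in a simple Needleman-Wunsch global alignment with unit scores, and both function names are hypothetical. Dedicated clustering software would normally be used on real datasets.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Fraction of identical columns in a Needleman-Wunsch global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting identical columns and total columns.
    i, j, matches, length = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        length += 1
    return matches / length if length else 0.0


def filter_redundant(seqs, threshold=0.40):
    """Keep a sequence only if its identity to every kept sequence is <= threshold."""
    kept = []
    for s in seqs:
        if all(global_identity(s, k) <= threshold for k in kept):
            kept.append(s)
    return kept
```

Applied separately to each subset, `filter_redundant(subset, 0.40)` leaves no pair above 40% identity under this definition. The result depends on input order, which is a known property of greedy filtering.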
Abstract: It has long been a dream that theoretical biology could be extensively applied in experimental biology to accelerate the understanding of the sophisticated movements in living organisms. A bold attempt and an excellent example is enzymology, in which well-established physical chemistry is used to describe, fit, predict, and improve enzyme reactions. Before modern bioinformatics, the combination of theoretical and experimental biology was mainly limited to various classic formulations. The systematic use of graphic rules by Prof. Kuo-Chen Chou and his co-workers has significantly facilitated dealing with complicated enzyme systems. With the recent fast progress of bioinformatics, the prediction of protein structures and various protein attributes has been well established by Chou and co-workers, stimulating experimental biology. For example, their recent method for predicting protein subcellular localization (one of the important attributes of proteins) has been extensively applied by scientific colleagues, yielding many new results with thousands of citations. The research by Prof. Chou is characterized by introducing novel physical concepts as well as powerful and elegant mathematical methods into important biomedical problems, a focus throughout his career, even when facing enormous difficulties. His efforts over 50 years have greatly helped us to realize the dream of making “theoretical and experimental biology in one”.
Prof. Richard Giege is well known for his multidisciplinary research combining physics, chemistry, enzymology, and molecular biology. His major focus of study is the identity of tRNAs and their interactions with aminoacyl-tRNA synthetases (aaRS), which are of critical importance to the fidelity of protein biosynthesis. He and his colleagues carried out the first crystallization of a tRNA/aaRS complex, that between tRNAAsp and AspRS from yeast. The determination of the complex structure contributed significantly to understanding the interaction of protein and RNA. Through this research, they have also found other biological functions of these small RNAs. In parallel, he has developed appropriate methods for his research, of which protein crystallogenesis, a name he coined, is an excellent example. Macromolecular crystallogenesis has now become a developed science. Indeed, this contribution has accelerated the development of protein crystallography, stimulating the study of macromolecular structure and function.
Funding: Research Management Center, Xiamen University Malaysia, under the XMUM Research Program Cycle 4 (Grant No. XMUMRF/2019-C4/IECE/0012).
Abstract: Glycation is a non-enzymatic post-translational modification that attaches sugar molecules and residues to a peptide. It is clinically important in numerous age-related, metabolic, and chronic diseases such as diabetes, Alzheimer’s, and renal failure. Identifying a non-enzymatic reaction is quite challenging in research, and manual identification in the laboratory is a costly and time-consuming process. In this research, we developed an accurate, valid, and robust model named Gly-LysPred to differentiate glycated sites from non-glycated sites. Comprehensive techniques based on position-relative features are used for feature extraction. A random forest algorithm, combined with preprocessing and feature engineering techniques, was used to train the computational model. Several testing techniques, namely self-consistency testing, jackknife testing, and cross-validation testing, are used to evaluate the model. The overall accuracy of the model under self-consistency, jackknife, and cross-validation testing was 100%, 99.92%, and 99.88%, with MCC values of 1.00, 0.99, and 0.997, respectively. A user-friendly web server was also developed to encapsulate the whole procedure. The results suggest that these feature vectorization methods can play a critical role in other web servers developed to classify lysine glycation.
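For readers unfamiliar with the metrics reported above, the following sketch shows the standard definitions of accuracy and the Matthews correlation coefficient (MCC) from binary confusion-matrix counts, together with the index splits used in jackknife (leave-one-out) testing. The function names are hypothetical and no counts from the paper are reproduced here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def jackknife_splits(n):
    """Yield (train_indices, test_index): each sample is held out exactly once."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i
```

MCC ranges from -1 to 1 and, unlike accuracy, stays informative when glycated and non-glycated sites are imbalanced, which is one reason both figures are usually reported together.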
Abstract: Frank Chou is the Chairman of Australian China Group Development Pty Ltd, which has direct or indirect investments in many businesses, with subsidiaries such as Evershine Australia Trading (exclusive importer of Mao-tai in Asia Pacific countries), Handpicked Wines International Pty Ltd, and Two Eights (Australia) Pty Ltd (world-famous wine producers). Looking at the dynamic Frank, one could hardly believe that he is over 70, as he is so energetic and animated all the time: he is sharp, thinks quickly, and moves fast. He does not like to waste time and once said, 'Every second of my life counts. I will not rest until my last breath'. This is the same energy he has devoted to developing his passion for wine.
Funding: Project (2006AZ001) supported by the Shanghai Municipal Education Commission, China; Project (06JC14031) supported by the Science and Technology Commission of Shanghai Municipality, China; Project (06QA14021) supported by the Shanghai Rising-Star Program (A type), China; Project (200746) supported by the Foundation for the Author of National Excellent Doctoral Dissertation of China; Project supported by the Innovation Fund for Graduate Student of Shanghai University, China.
Abstract: The Chou model was used to investigate the kinetic mechanism of the dehydriding reaction of MgH_2-Nb_2O_5 hydrogen storage materials at 573 K. A new concept, the 'characteristic absorption/desorption time (t_c)', was introduced to characterize the reaction rate. The fitting results show that, for hydrogen desorption, surface penetration is the rate-controlling step. The mechanism remains the same even when the original particle size of Nb_2O_5 before ball milling (BM) or the BM time changes. In addition, t_c indicates that the desorption rate of MgH_2-Nb_2O_5 will be faster than that of MgH_2-Nb_2O_5 by BM. The dehydriding reaction rate of MgH_2-Nb_2O_5 (micro particle) BMed for 50 h is 4.76 times faster than that of MgH_2-Nb_2O_5 (micro particle) BMed for 0.25 h, while the dehydriding reaction rate of MgH_2-Nb_2O_5 (nano particle) BMed for 50 h is only 1.18 times that of MgH_2-Nb_2O_5 (nano particle) BMed for 0.25 h. The dehydriding reaction rate of the BMed MgH_2-Nb_2O_5 (nano particle) is 1-9 times faster than that of the BMed MgH_2-Nb_2O_5 (micro particle).
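To make the role of t_c concrete, here is a minimal sketch assuming the form usually quoted for the Chou model when surface penetration is rate-controlling, `xi = 1 - (1 - t/t_c)**3`, where `xi` is the reacted fraction. The abstract does not give this equation, so it should be read as an assumption; the function names are hypothetical and no fitted values from the paper are used.

```python
def reacted_fraction(t, t_c):
    """Reacted fraction xi at time t for a characteristic time t_c (assumed form)."""
    x = min(max(t / t_c, 0.0), 1.0)  # clamp so xi stays within [0, 1]
    return 1.0 - (1.0 - x) ** 3


def time_to_fraction(xi, t_c):
    """Invert the assumed rate law: time needed to reach reacted fraction xi."""
    if not 0.0 <= xi <= 1.0:
        raise ValueError("xi must lie in [0, 1]")
    return t_c * (1.0 - (1.0 - xi) ** (1.0 / 3.0))
```

Under this form the time needed to reach any given fraction scales linearly with t_c, so a statement such as "4.76 times faster" corresponds to a characteristic time that is 4.76 times shorter: `time_to_fraction(0.5, 4.76) / time_to_fraction(0.5, 1.0)` evaluates to approximately 4.76 for arbitrary units.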