Funding: Supported in part by the National Natural Science Foundation of China (22033001), the National Key R&D Program of China (2022YFA1303700), and the Chinese Academy of Medical Sciences (2021-I2M-5-014).
Abstract: Proteins are integral to essential life processes, which makes protein research a fundamental field with the potential to advance pharmaceuticals and disease investigation. Within protein research, there is a pressing need to uncover protein functions and untangle their underlying mechanisms. Because experimental investigation is costly and limited in throughput, computational models offer a promising alternative for accelerating protein function annotation. In recent years, protein pre-training models have made noteworthy advances across multiple prediction tasks, which points to a notable opportunity for tackling the intricate downstream task of protein function prediction. In this review, we describe the historical evolution and research paradigms of computational methods for predicting protein function. We then summarize progress in protein and molecule representation and in feature extraction techniques. Finally, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, offering a comprehensive perspective on progress in this field.
Funding: This work was partly supported by the Molecular Therapeutics Program, Nebraska Department of Health and Human Services, and by Grant CA72781 (to RKS) and Cancer Center Support Grant (P30CA036727) from the National Cancer Institute, National Institutes of Health, USA.
Abstract: In the post-genomic era, various computational methods that predict protein-protein interactions at the genome level are available; however, each method has its own advantages and disadvantages, resulting in false predictions. Here we developed a unique integrated approach to identify interacting partner(s) of Semaphorin 5A (SEMA5A), beginning with seven proteins sharing similar ligand-interacting residues as putative binding partners. The methods include the Dwyer and Root-Bernstein/Dillon theories of protein evolution, hydropathic complementarity of protein structure, pattern of protein functions among molecules, information on domain-domain interactions, co-expression of genes, and protein evolution. Among the seven proteins selected as putative SEMA5A interacting partners, we found the functions of Plexin B3 and Neuropilin-2 to be associated with SEMA5A. We modeled the semaphorin domain structure of Plexin B3 and found that it shares similarity with SEMA5A. Moreover, a virtual expression database search and RT-PCR analysis showed co-expression of SEMA5A and Plexin B3, and these proteins were found to have co-evolved. In addition, we confirmed the interaction of SEMA5A with Plexin B3 in co-immunoprecipitation studies. Overall, these studies demonstrate that an integrated method of prediction can be used at the genome level for discovering many unknown protein binding partners with known ligand-binding domains.