Funding: Supported by the National Basic Research Program of China (Grant No. 2015CB910300), the National High Technology Research and Development Program of China (Grant No. 2012AA020308), and the National Natural Science Foundation of China (Grant No. 11021463).
Abstract: Computational protein design is a relatively new field in which scientists search the enormous sequence space for sequences that fold into a desired structure and perform desired functions. With the computational approach, proteins can be designed, for example, as regulators of biological processes, as novel enzymes, or as biotherapeutics. These approaches not only provide valuable information for understanding sequence-structure-function relations in proteins, but also hold promise for applications in protein engineering and biomedical research. In this review, we briefly introduce the rationale for computational protein design, then summarize recent progress in the field, including de novo protein design, enzyme design, and the design of protein-protein interactions. Challenges and future prospects of the field are also discussed.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 62250007, 62225307, 61721003) and a grant from the Guoqiang Institute, Tsinghua University (2021GQG1023).
Abstract: The protein inverse folding problem, designing amino acid sequences that fold into desired protein structures, is a critical challenge in the biological sciences. Despite numerous data-driven and knowledge-driven methods, there remains a need for a user-friendly toolkit that effectively integrates these approaches for in-silico protein design. In this paper, we present DIProT, an interactive protein design toolkit. DIProT leverages a non-autoregressive deep generative model to solve the inverse folding problem, combined with a protein structure prediction model. This integration allows users to incorporate prior knowledge into the design process, evaluate designs in silico, and form a virtual design loop with human feedback. Our inverse folding model demonstrates competitive effectiveness and efficiency on the TS50 and CATH4.2 datasets, with promising sequence recovery and inference time. Case studies further illustrate how DIProT can facilitate user-guided protein design.
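The abstract above reports sequence recovery without defining it. As a minimal sketch, assuming the common definition (the fraction of positions at which the designed residue matches the native one, for two sequences of equal length), the metric could be computed as follows. The function name and the two example sequences are hypothetical and are not taken from DIProT or its benchmarks.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumes both sequences describe the same backbone, so they must have
    the same length and are compared position by position (no alignment).
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Hypothetical 10-residue example: positions 3 and 6 differ, so 8 of 10 match.
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
print(f"{sequence_recovery(native, designed):.2f}")  # prints 0.80
```

Benchmark figures on datasets such as TS50 or CATH4.2 are typically aggregated over many proteins; how a particular paper averages per-protein values is not specified here.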