Protein evolution proceeds by two distinct processes: 1) individual mutation and selection for adaptive mutations and 2) rearrangement of entire domains within proteins into novel combinations, producing new protei...Protein evolution proceeds by two distinct processes: 1) individual mutation and selection for adaptive mutations and 2) rearrangement of entire domains within proteins into novel combinations, producing new protein families that combine functional properties in ways that previously did not exist. Domain rearrangement poses a challenge to sequence alignment-based search methods, such as BLAST, in predicting homology since the methodology implicitly assumes that related proteins primarily differ from each other by individual mutations. Moreover, there is ample evidence that the evolutionary process has used (and continues to use) domains as building blocks, therefore, it seems fit to utilize computational, domain-based methods to reconstruct that process. A challenge and opportunity for computational biology is how to use knowledge of evolutionary domain recombination to characterize families of proteins whose evolutionary history includes such recombination, to discover novel proteins, and to infer protein-protein interactions. In this paper we review techniques and databases that exploit our growing knowledge of “horizontal” protein evolution, and suggest possible areas of future development. We illustrate the power of the domain-based methods and the possible directions of future development by a case history in progress aiming at facilitating a particular approach to understanding microbial pathogenicity.展开更多
基金supported by NSF of USA under Grant Nos. 0835718 and 0235792NIH under Grant Nos. 5PN2EY016570-06 and5R01NS063405-02+2 种基金the Beckman Institute for Advanced Science and Technologythe National Center for Supercomputing Applicationsthe Renaissance Computing Institute
文摘Protein evolution proceeds by two distinct processes: 1) individual mutation and selection for adaptive mutations and 2) rearrangement of entire domains within proteins into novel combinations, producing new protein families that combine functional properties in ways that previously did not exist. Domain rearrangement poses a challenge to sequence alignment-based search methods, such as BLAST, in predicting homology since the methodology implicitly assumes that related proteins primarily differ from each other by individual mutations. Moreover, there is ample evidence that the evolutionary process has used (and continues to use) domains as building blocks, therefore, it seems fit to utilize computational, domain-based methods to reconstruct that process. A challenge and opportunity for computational biology is how to use knowledge of evolutionary domain recombination to characterize families of proteins whose evolutionary history includes such recombination, to discover novel proteins, and to infer protein-protein interactions. In this paper we review techniques and databases that exploit our growing knowledge of “horizontal” protein evolution, and suggest possible areas of future development. We illustrate the power of the domain-based methods and the possible directions of future development by a case history in progress aiming at facilitating a particular approach to understanding microbial pathogenicity.