Abstract
Influenza A viruses have caused several pandemics and epidemics in human history. H7 subtype influenza viruses mainly infect birds but occasionally infect humans. Since the outbreak of H7N9 influenza in China in 2013, the virus has continued to circulate in domestic poultry and has caused several waves of infection. Vaccination is an important strategy for preventing influenza; however, the influenza virus evolves constantly and unpredictably. If a one-to-one relationship between cause and mutation were available, mutations could be predicted. However, many external causes that led to past mutations may have left no trace because environments have changed, and the current virus, having evolved, may no longer be subject to those historical external causes. Moreover, a protein should have internal causes that drive mutation, although these may be unclear and difficult to quantify. Indeed, various forces fold a protein into its three-dimensional structure, and any perturbation could lead to a mutation. Among the possible internal causes, randomness in the protein primary structure should play an important role. Over the years, we have developed three methods to quantify the randomness within a protein primary structure; we can thus build a relationship between the cause, namely randomness in the primary structure, and the outcome, namely the occurrence or non-occurrence of mutation. In this way, the cause-mutation relationship becomes a classification problem, which can be solved using logistic regression and neural networks. In this study, we apply this model to predict 1) the mutation positions in H7 hemagglutinins from influenza A virus and 2) the would-be-mutated amino acids at the predicted positions, using the amino-acid mutating probability. The results show the suitability and predictive ability of such modelling and pave the way for further development.
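The classification framing described in the abstract can be illustrated with a minimal sketch. It assumes a single hypothetical randomness score per sequence position and a binary label (1 = mutation occurred, 0 = no mutation); the data below are synthetic toy values generated for illustration only, not data from the study, and the names `fit_logistic` and `predict_proba` are made up for this example. The three randomness measures developed by the authors are not reproduced here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(mutation | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Synthetic toy data (an assumption, NOT from the study): one hypothetical
# randomness score per position, with labels drawn so that higher scores
# are more likely to be marked as mutated.
rng = random.Random(0)
scores = [rng.uniform(-2.0, 2.0) for _ in range(200)]
labels = [1 if rng.random() < sigmoid(2.0 * s) else 0 for s in scores]

w, b = fit_logistic(scores, labels)


def predict_proba(score):
    # Estimated probability that a position with this score mutates.
    return sigmoid(w * score + b)


print(f"w={w:.3f}, b={b:.3f}")
print(f"p(mutation | score=1.5)  = {predict_proba(1.5):.3f}")
print(f"p(mutation | score=-1.5) = {predict_proba(-1.5):.3f}")
```

In this sketch, positions would be flagged as predicted mutation sites when `predict_proba` exceeds a chosen threshold; the threshold, the feature definition, and the use of a single feature are all illustrative choices rather than the authors' settings.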
Authors
Shaomin Yan
Guang Wu
Shaomin Yan; Guang Wu (National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Biomass Engineering Technology Research Center, Guangxi Key Laboratory of Bio-Refinery, Nanning, China)