The capacity of zoonotic influenza to cross species boundaries to infect humans poses a global health threat. A previous study identified sites in 10 influenza proteins that characterize the host shifts from avian to ...The capacity of zoonotic influenza to cross species boundaries to infect humans poses a global health threat. A previous study identified sites in 10 influenza proteins that characterize the host shifts from avian to human influenza. Here, we used seven feature selection algorithms based on machine learning techniques to generate a novel and extensive selection of diverse sites from the nine internal proteins of influenza based on statistically importance to differentiating avian from human viruses. A set of 131 sites was generated by processing each protein independently, and a selection of 113 sites was found by analyzing a concatenation of sequences from all nine proteins. These new sites were analyzed according to their annual mutational trends. The correlation of each site with all other sites (one-to-many) and the connectivity within groups of specific sites (one-to-one) were identified. We compared the performance of these new sites evaluated by four classifiers against those recorded in previous research, and found our sites to be better suited to host distinction in all but one protein, validating the significance of our site selection. Our findings indicated that, in our selection of sites, human influenza tended to mutate more than avian influenza. Despite this, the correlation and connectivity between the avian sites was stronger than that of the human sites, and the percentage of sites with high connectivity was also greater in avian influenza.展开更多
文摘The capacity of zoonotic influenza to cross species boundaries to infect humans poses a global health threat. A previous study identified sites in 10 influenza proteins that characterize the host shifts from avian to human influenza. Here, we used seven feature selection algorithms based on machine learning techniques to generate a novel and extensive selection of diverse sites from the nine internal proteins of influenza based on statistically importance to differentiating avian from human viruses. A set of 131 sites was generated by processing each protein independently, and a selection of 113 sites was found by analyzing a concatenation of sequences from all nine proteins. These new sites were analyzed according to their annual mutational trends. The correlation of each site with all other sites (one-to-many) and the connectivity within groups of specific sites (one-to-one) were identified. We compared the performance of these new sites evaluated by four classifiers against those recorded in previous research, and found our sites to be better suited to host distinction in all but one protein, validating the significance of our site selection. Our findings indicated that, in our selection of sites, human influenza tended to mutate more than avian influenza. Despite this, the correlation and connectivity between the avian sites was stronger than that of the human sites, and the percentage of sites with high connectivity was also greater in avian influenza.