Funding: Lanzhou Talent Innovation and Entrepreneurship Project (No. 2020-RC-14).
Abstract: Single nucleotide polymorphisms (SNPs) are an important factor in the study of genetic variation in human families and in animal and plant strains, and are therefore widely used in population genetics and in the study of disease-related genes. In pharmacogenomics research, identifying associations between SNP sites and drugs is key to clinical precision medication. A predictive model of SNP site and drug association based on a denoising variational auto-encoder (DVAE-SVM) is therefore proposed. First, the k-mer algorithm is used to construct the initial SNP site feature vector, while MACCS molecular fingerprints are used to generate the feature vector of the drug module. Then, the DVAE extracts effective features from the initial SNP site feature vector. Finally, the effective SNP site feature vector and the drug module feature vector are fused and input to a support vector machine (SVM) to predict the relationship between SNP site and drug module. Five-fold cross-validation experiments indicate that the proposed algorithm performs better than random forest (RF) and logistic regression (LR) classifiers. Further experiments show that the proposed algorithm yields better prediction results than the feature extraction algorithms principal component analysis (PCA), denoising auto-encoder (DAE), and variational auto-encoder (VAE).
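
The first and last steps of the pipeline (k-mer featurization of the SNP site and fusion with the drug vector) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the value of k, the length of the flanking sequence, or how counts are normalized, so `k=3`, relative-frequency normalization, and the function names `kmer_features` and `fuse` are all assumptions. The drug vector is a placeholder bit list standing in for a MACCS fingerprint, and the DVAE and SVM stages are omitted.

```python
from itertools import product


def kmer_features(sequence, k=3, alphabet="ACGT"):
    """Return relative k-mer frequencies in lexicographic k-mer order.

    Assumption: k and the normalization are illustrative choices;
    the abstract does not specify them.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = sequence.upper()
    total = 0
    # Slide a window of width k across the sequence.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in index:  # skip windows with ambiguous bases such as N
            counts[index[window]] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


def fuse(snp_vector, drug_vector):
    """Concatenate the SNP feature vector with the drug feature vector."""
    return list(snp_vector) + list(drug_vector)


if __name__ == "__main__":
    # Hypothetical flanking sequence around an SNP site.
    snp_vec = kmer_features("ACGTTGCAAC", k=3)
    # Placeholder bits standing in for a MACCS fingerprint (not computed here).
    drug_vec = [1, 0, 0, 1]
    fused = fuse(snp_vec, drug_vec)
    print(len(snp_vec), len(fused))
```

In the described model, the k-mer vector would pass through the DVAE before fusion, and the fused vector would be the input to the SVM classifier.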