Abstract
Objective: To develop a new and effective drug target prediction method that combines support vector machines (SVM) with the primary-structure similarity between known and potential drug target proteins. Methods: A training set was constructed, primary-structure features were extracted from the protein sequences, and the data were preprocessed. The optimal kernel function was then selected, the parameters were optimized, and feature selection was performed; the best prediction model was trained and its performance tested. The resulting optimal classification model was applied to a prediction set composed of proteins from the G protein-coupled receptor (GPCR) family to mine potential drug targets. Results: The optimal SVM-based classification model achieved an average prediction accuracy of 81.03%. When the model was applied to the GPCR prediction set, several of the 20 top-ranked proteins were found to be disease-related. In particular, two of them are listed in the Therapeutic Target Database (TTD) as drug targets in clinical trials. Conclusions: The drug target prediction method based on SVM and protein sequence features is effective, and the potential drug targets it predicts can serve as a reference for the discovery of new drug targets.
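The first two steps named in the Methods, feature extraction from the primary structure and data preprocessing, could look roughly like the sketch below. The abstract does not say which sequence features or which scaling scheme were used, so amino acid composition and min-max scaling are assumptions chosen for illustration, and the function names `composition` and `min_max_scale` and the toy sequences are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Assumption: amino acid composition stands in for the paper's
    unspecified primary-structure features.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def min_max_scale(rows):
    """Scale each feature column to [0, 1]; constant columns become 0.0.

    Assumption: min-max scaling stands in for the paper's unspecified
    preprocessing step.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical toy sequences, not taken from the paper.
sequences = ["MKTAYIAKQR", "GGGGAAAACC", "MLLLKKWWYY"]
features = min_max_scale([composition(s) for s in sequences])
```

Each row of `features` is a 20-dimensional vector that could then be passed to an SVM implementation for kernel selection, parameter tuning, and training.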
Source
Progress in Modern Biomedicine (《现代生物医学进展》)
CAS
2012, Issue 20, pp. 3943-3947 (5 pages)
Funding
National Natural Science Foundation of China (G81172842, G61170154)