Abstract
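Oligomeric proteins take part in a wide range of biological processes, so predicting them is of considerable importance. Starting from the protein sequence, this paper proposes a multi-strategy glide zoom window feature extraction method and uses a "one-versus-one" multi-class classification strategy to predict protein homo-oligomers. Under the jackknife test, a support vector machine using the feature set that combines multi-strategy glide zoom window features with amino acid composition reached, with weighting, a highest total classification accuracy of 75.37%, which is 10.05% higher than the amino acid composition method alone and 3.82% higher than BG_Zhang, the best feature in the reference literature. This indicates that multi-strategy glide zoom window feature extraction is a very effective feature extraction method for classifying protein homo-oligomers.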
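The evaluation protocol named above (one-versus-one voting scored by a jackknife, i.e. leave-one-out, test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper's base classifier is an SVM, whereas the pairwise learner here is a nearest-centroid stand-in chosen only to keep the example dependency-free, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def one_vs_one_predict(train_X, train_y, x):
    """Predict the class of x by majority vote over all class pairs.

    ASSUMPTION: each pairwise decision uses a nearest-centroid rule as a
    stand-in for the SVM used in the paper.
    """
    classes = sorted(set(train_y))
    if len(classes) == 1:
        return classes[0]
    votes = Counter()
    for a, b in combinations(classes, 2):
        ca = centroid([r for r, y in zip(train_X, train_y) if y == a])
        cb = centroid([r for r, y in zip(train_X, train_y) if y == b])
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    # Ties are broken by sorted class order, since max keeps the first maximum.
    return max(classes, key=lambda c: votes[c])


def jackknife_accuracy(X, y):
    """Leave each sample out once, train on the rest, and score the hits."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if one_vs_one_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy data with hypothetical labels; real inputs would be sequence features.
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = ["dimer", "dimer", "trimer", "trimer", "tetramer", "tetramer"]
accuracy = jackknife_accuracy(X, y)
```

With K classes the voting step trains K(K-1)/2 pairwise learners per left-out sample, which is why the jackknife test is costly on large datasets.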
Protein homo-oligomers play an important role in various life processes. The concept of the multi-strategy glide zoom window is proposed, and a novel feature extraction approach based on it is used to predict protein homo-oligomers. Two glide zoom window strategies were chosen: the whole protein sequence glide zoom window and the kin amino acid glide zoom window. For each strategy, three feature vectors were extracted: amino acid distance sum, amino acid mean distance, and amino acid distribution. A series of feature sets was constructed by combining these feature vectors with amino acid composition to form pseudo amino acid compositions (PseAAC). The support vector machine (SVM) was used as the base classifier. With weighting factors, a total accuracy of 75.37% was achieved in the jackknife test, which is 10.05% and 3.82% higher than the conventional amino acid composition method and the BG_Zhang feature, respectively, under the same conditions. The results show that the multi-strategy glide zoom window method of extracting feature vectors from protein sequences is effective and feasible, and that these feature vectors may contain more protein structure information.
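To make the feature construction concrete, the sketch below builds a PseAAC-style vector from amino acid composition plus three position-based vectors computed over a whole-sequence window. The abstract does not define the three features, so the formulas here (sum, mean, and standard deviation of length-normalised residue positions) are assumptions made purely for illustration. The kin amino acid window is not attempted, and `window_features`, `pseaac`, and the `weight` parameter are hypothetical names.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def window_features(seq, start=0, end=None):
    """Three 20-dimensional vectors over the window seq[start:end].

    ASSUMED definitions, not taken from the paper:
      distance sum  = sum of normalised positions of a residue type
      mean distance = distance sum divided by the number of occurrences
      distribution  = standard deviation of those normalised positions
    """
    window = seq[start:end]
    n = len(window)
    dist_sum, mean_dist, spread = [], [], []
    for aa in AMINO_ACIDS:
        pos = [(i + 1) / n for i, res in enumerate(window) if res == aa]
        if not pos:
            # Residue type absent from the window: all three features are 0.
            dist_sum.append(0.0)
            mean_dist.append(0.0)
            spread.append(0.0)
            continue
        total = sum(pos)
        mean = total / len(pos)
        var = sum((p - mean) ** 2 for p in pos) / len(pos)
        dist_sum.append(total)
        mean_dist.append(mean)
        spread.append(math.sqrt(var))
    return dist_sum + mean_dist + spread


def pseaac(seq, weight=1.0):
    """Composition (20 values) followed by weighted window features (60)."""
    return amino_acid_composition(seq) + [weight * v for v in window_features(seq)]


vec = pseaac("ACAC", weight=0.5)
```

The `weight` factor scales the position features relative to the composition terms, mirroring the weighted setting the abstract mentions; the value 0.5 is arbitrary.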
Source
《生物物理学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2009, No. 5, pp. 335-342 (8 pages)
Acta Biophysica Sinica
Funding
National Natural Science Foundation of China (60775012, 60634030)
Northwestern Polytechnical University Science and Technology Innovation Project (KC02)
Keywords
Homo-oligomers (同源寡聚体)
Support vector machines (SVM) (支持向量机)
Feature extraction (特征提取)
Multi-strategy glide zoom window (多策略滑动伸缩窗)
Multi-strategy glide zoom window features (多策略滑动伸缩窗特征)