
A Novel Approach for Prediction of Protein Subcellular Localization Using Optimal Local Information (cited by: 3)
Abstract: Predicting protein subcellular localization helps infer protein function and provides insight into protein-protein interactions. Based on the synthesis and sorting mechanisms of proteins, a novel approach to predicting subcellular localization is proposed. For each class of subcellular proteins, an exhaustive search finds the optimal splice site, that is, the split point between the sorting signal and the mature protein, which divides each sequence into two subsequences. The amino acid composition of each subsequence is computed, and the two compositions are fused into the feature vector for the whole sequence. An optimal sub-classifier is then built to discriminate each class of protein from the rest, and the sub-classifiers are combined into an ensemble classifier by a maximization rule to predict the subcellular localization of unknown proteins. In fivefold cross-validation on the NNPSL dataset, overall accuracies of 94.1% and 87.5% are obtained for prokaryotic and eukaryotic proteins respectively; on the TargetP dataset, the overall accuracies are 90.2% and 93.9% for plant and non-plant proteins respectively. The optimal splice sites found also agree with known biological facts for most kinds of protein, so the method can provide reference information for predicting protein cleavage sites.
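The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the candidate range of split sites, and the scoring function passed to `best_split` are all hypothetical, and the abstract does not say how a candidate site is scored (cross-validated accuracy of a sub-classifier would be one plausible choice). The sub-classifier scores passed to `ensemble_predict` are likewise assumed to come from separately trained one-vs-rest classifiers such as SVMs.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Return the 20-dimensional amino acid frequency vector of seq."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]

def fused_features(seq, split):
    """Split seq at `split` and concatenate the two composition vectors.

    The N-terminal part stands in for the sorting signal and the rest
    for the mature protein; the result has 40 dimensions.
    """
    signal, mature = seq[:split], seq[split:]
    return composition(signal) + composition(mature)

def best_split(candidates, score_fn):
    """Exhaustive search: try every candidate site, keep the best-scoring one.

    score_fn is a hypothetical callable mapping a split site to a number.
    """
    return max(candidates, key=score_fn)

def ensemble_predict(scores):
    """Max rule: return the class whose sub-classifier score is highest."""
    return max(scores, key=scores.get)
```

For example, `fused_features("MKKLLAAGGSTVWY", split=5)` returns a 40-element vector whose first 20 entries describe `MKKLL` and whose last 20 describe `AAGGSTVWY`. Keeping the two compositions separate, rather than averaging over the whole sequence, is what lets the N-terminal signal region contribute its own local information.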
Source: Acta Scientiarum Naturalium Universitatis Sunyatseni (Journal of Sun Yat-sen University, Natural Science Edition), indexed in CAS, CSCD, PKU Core; 2008, No. 6, pp. 16-21 (6 pages)
Funding: National Natural Science Foundation of China (60675016, 60633030)
Keywords: subcellular localization; N-terminal sorting signal; mature protein; support vector machine; splice site


