Predicting O-glycosylation Sites by Combining Three Different Types of Features
Abstract: Glycosylation is one of the major post-translational modifications of proteins. Because no fixed sequence pattern for O-glycosylation is known, identifying O-glycosylation sites with high accuracy remains a challenging machine-learning problem. Using the Steentoft dataset, the largest collection of human O-glycosylation sites to date, this paper proposes for the first time a position-based chi-square difference table feature (χ^2-pos) and combines it with the pseudo position-specific scoring matrix (PsePSSM), which captures sequence evolutionary information, and the undirected composition of k-spaced amino acid pairs (Undirected-CKSAAP) to represent protein sequences. Five support vector machine classifiers were built, each trained on balanced positive and negative samples, and their outputs were combined by weighted voting. On an independent test, the accuracy, Matthews correlation coefficient and area under the ROC curve reached 89.62%, 0.79 and 0.96, respectively, clearly better than results reported in the literature. The fusion of χ^2-pos, PsePSSM and Undirected-CKSAAP has broad application prospects in predicting protein modification sites such as glycosylation and phosphorylation sites.
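The abstract names Undirected-CKSAAP but does not define it. The following is a minimal sketch of one plausible reading, in which residue pairs separated by k positions are counted without regard to order (so "ST" and "TS" fall into the same bin) and normalized per gap. The function name `undirected_cksaap`, the `k_max` parameter and the normalization are assumptions for illustration, not the authors' implementation.

```python
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, alphabetical
# 210 unordered pairs: 190 mixed pairs plus 20 identical pairs.
UNORDERED_PAIRS = ["".join(p) for p in combinations_with_replacement(AMINO_ACIDS, 2)]


def undirected_cksaap(window, k_max=2):
    """Encode a sequence window as order-insensitive k-spaced pair frequencies.

    Hypothetical sketch: for each gap k = 0..k_max, count residue pairs that
    are k positions apart, merging (a, b) with (b, a), then divide by the
    number of valid pairs found for that gap.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(UNORDERED_PAIRS, 0)
        total = 0
        for i in range(len(window) - k - 1):
            a, b = window[i], window[i + k + 1]
            key = "".join(sorted((a, b)))
            if key in counts:  # skips padding or non-standard residues
                counts[key] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in UNORDERED_PAIRS)
    return features


vec = undirected_cksaap("ASTSA", k_max=1)
print(len(vec))  # 2 gaps x 210 unordered pairs
```

Merging the two orientations roughly halves the dimension of the ordered CKSAAP encoding (400 pairs per gap), which may matter when the training set is small.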
Source: Progress in Biochemistry and Biophysics (《生物化学与生物物理进展》), 2016, Issue 7, pp. 691-698 (8 pages). Indexed in SCIE, CAS, CSCD and the Peking University Core Journals list.
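Funding: Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20124320110002), the Natural Science Foundation of Hunan Province (14JJ2082) and the Changsha Science and Technology Program (K1406018-21).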
Keywords: O-glycosylation site prediction; chi-square score difference table; pseudo position-specific scoring matrix; undirected composition of k-spaced amino acid pairs; weighted voting
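The record does not say how the five balanced training sets were drawn or how the voting weights were chosen. The sketch below shows one common arrangement under stated assumptions: each subset pairs all positives with an equally sized random draw of negatives, the ensemble averages per-classifier scores with arbitrary weights, and the result is scored with the Matthews correlation coefficient. The helper names `balanced_subsets`, `weighted_vote` and `matthews_cc` are hypothetical.

```python
import math
import random


def balanced_subsets(positives, negatives, n_subsets=5, seed=0):
    """Pair all positives with an equally sized random draw of negatives.

    Assumes there are at least as many negatives as positives.
    """
    rng = random.Random(seed)
    return [(list(positives), rng.sample(negatives, len(positives)))
            for _ in range(n_subsets)]


def weighted_vote(scores, weights, threshold=0.5):
    """Combine per-classifier scores in [0, 1] for one site into a 0/1 label."""
    combined = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return int(combined >= threshold)


def matthews_cc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Illustrative values only: three classifier scores for one candidate site.
label = weighted_vote([0.9, 0.2, 0.4], weights=[3, 1, 1])
print(label)
```

In practice each subset would train one classifier, and the weights might be derived from each classifier's validation performance; both choices are left open here because the abstract does not specify them.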

