
Protein function prediction based on doubly indexed matrix
Cited by: 1
Abstract: Predicting protein function from a single data source performs poorly, and the information in protein-protein interaction networks is incomplete. To address these problems, a Multi-Source Integration and Random Walk with Doubly Indexed Matrix (MSI-RWDIM) algorithm is proposed. The algorithm uses protein sequence, gene expression, and protein-protein interaction data to predict protein function, and builds a weighted interaction network for each data source according to its characteristics. The weighted networks are then fused and combined with a function correlation network to construct a doubly indexed matrix, on which a random walk computes annotation scores that are used to predict protein function. In five-fold cross-validation on a yeast dataset, MSI-RWDIM achieves higher accuracy, lower coverage, and a lower loss rate of function labels. The results show that the overall performance of MSI-RWDIM is better than that of the commonly used k-nearest neighbor, transductive multi-label ensemble classification, and fast simultaneous weighting methods.
Authors: MENG Jun (孟军), ZHANG Xin (张信)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 6, pp. 1637-1642 (6 pages)
Funding: National Natural Science Foundation of China (61472061)
Keywords: multi-source data integration; random walk; doubly indexed matrix; function correlation network; protein function prediction
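
The pipeline outlined in the abstract (fuse per-source weighted networks, combine them with a function correlation network, then score functions by random walk) can be illustrated with a small sketch. This is not the MSI-RWDIM formulation itself, which this record does not give: it swaps in a generic random walk with restart on a joint protein/function graph. The function names, the equal fusion weights, the block layout of the joint matrix, and the restart probability of 0.3 are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic random walk with restart (RWR) on a
# joint protein/function graph. NOT the exact MSI-RWDIM method; all names,
# the equal fusion weights, and restart=0.3 are assumptions.

def fuse_networks(networks, weights=None):
    """Weighted average of same-sized protein-protein weight matrices."""
    k = len(networks)
    weights = weights or [1.0 / k] * k  # assumption: equal weight per source
    n = len(networks[0])
    return [[sum(w * net[i][j] for w, net in zip(weights, networks))
             for j in range(n)] for i in range(n)]


def build_joint_matrix(w_pp, w_ff, annotations):
    """Place proteins and functions in one (n+m) x (n+m) symmetric matrix.

    annotations[i][f] is 1 if protein i is known to have function f, else 0.
    """
    n, m = len(w_pp), len(w_ff)
    joint = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            joint[i][j] = w_pp[i][j]              # protein-protein block
        for f in range(m):
            link = float(annotations[i][f])       # protein-function block
            joint[i][n + f] = joint[n + f][i] = link
    for f in range(m):
        for g in range(m):
            joint[n + f][n + g] = w_ff[f][g]      # function-function block
    return joint


def column_normalize(matrix):
    """Scale each column to sum to 1 (all-zero columns are left as zeros)."""
    size = len(matrix)
    sums = [sum(matrix[r][c] for r in range(size)) for c in range(size)]
    return [[matrix[r][c] / sums[c] if sums[c] else 0.0
             for c in range(size)] for r in range(size)]


def random_walk_with_restart(transition, start, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * T p + restart * e_start until stable."""
    size = len(transition)
    p0 = [0.0] * size
    p0[start] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(transition[r][c] * p[c] for c in range(size))
               + restart * p0[r] for r in range(size)]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def score_functions(networks, w_ff, annotations, protein, restart=0.3):
    """Return one score per function for the query protein."""
    w_pp = fuse_networks(networks)
    joint = column_normalize(build_joint_matrix(w_pp, w_ff, annotations))
    steady = random_walk_with_restart(joint, protein, restart)
    return steady[len(w_pp):]  # entries after the protein nodes are functions
```

In this sketch, calling `score_functions` with an unannotated query protein gives higher scores to functions held by its strongly connected neighbors. Ranking or thresholding those scores into predicted labels is left out, since the abstract does not say how the paper does it.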