
A k-gram Approach for Identifying MicroRNA Precursors
Cited by: 4
Abstract: MicroRNAs (miRNAs) are short, functional non-coding RNAs that regulate gene expression in both animals and plants. Although the first miRNAs were discovered experimentally, experimental miRNA identification remains technically challenging and incomplete, so computational approaches are a natural complement for miRNA gene identification. This paper proposes a de novo method for predicting miRNA precursors. To build the recognition model, the primary sequence and the secondary structure are combined into a single input sequence through encoding, and the k-gram method maps this input into a high-dimensional feature space. After feature selection, the selected features are used to construct SVM-based models for recognizing miRNA precursors. The method is also compared with a hidden Markov model (HMM) learning approach. Experimental results show that the method is effective, achieves high sensitivity and specificity, and outperforms the HMM. The reason given is that miRNAs are so short that an HMM cannot easily capture the signals that distinguish genuine miRNAs from pseudo-miRNA genes. The selected features come mostly from the primary sequence and secondary structure of the miRNAs. This suggests that, until the transcription mechanism of miRNAs is fully understood biologically, computational methods should concentrate on the miRNAs themselves.
Source: Progress in Biochemistry and Biophysics (《生物化学与生物物理进展》), 2007, No. 2, pp. 154-161 (8 pages). Indexed in SCIE, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Natural Science Foundation of China (30570425) and the National Basic Research Program of China (2003CB715903).
Keywords: microRNA, gene identification, support vector machine, hidden Markov model, microRNA precursor
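
The k-gram mapping described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how sequence and structure are encoded, so the `encode` function below is a hypothetical scheme (uppercase for unpaired bases, lowercase for paired bases in dot-bracket notation), and the names `kgram_counts` and `kgram_vector` are likewise illustrative rather than the authors' implementation. The resulting vectors are the kind of input that feature selection and an SVM could then consume.

```python
from collections import Counter


def encode(sequence, structure):
    """Hypothetical encoding (an assumption, not from the paper):
    merge a nucleotide sequence and its dot-bracket structure into one string.
    Unpaired bases ('.') become uppercase, paired bases become lowercase."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return "".join(
        base.upper() if sym == "." else base.lower()
        for base, sym in zip(sequence, structure)
    )


def kgram_counts(encoded, k):
    """Count every overlapping substring of length k in the encoded string."""
    return Counter(encoded[i:i + k] for i in range(len(encoded) - k + 1))


def kgram_vector(encoded, k, vocabulary):
    """Return normalised k-gram frequencies in a fixed vocabulary order,
    so that every sequence maps to a vector of the same length."""
    counts = kgram_counts(encoded, k)
    total = max(sum(counts.values()), 1)  # avoid division by zero
    return [counts[gram] / total for gram in vocabulary]


# Toy hairpin, made up for illustration only.
seq = "GGCAUUGCC"
struct = "(((...)))"
enc = encode(seq, struct)
counts = kgram_counts(enc, 2)
vec = kgram_vector(enc, 2, ["gc", "UU", "zz"])
```

In this sketch the vocabulary is passed in explicitly; in practice it would be the set of k-grams retained after feature selection on the training data.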

