
Density-based Clustering with Extensible Radius for RNA Secondary Structure Prediction
Abstract: When RNA secondary structure is predicted with a free energy model, the true structure may lie not at the minimum free energy but among the suboptimal structures within some energy increment above it. Clustering this set of suboptimal structures and selecting a representative structure for each cluster can improve prediction accuracy. This paper presents ER-DBSCAN, a density-based clustering algorithm with an extensible radius, for RNA secondary structure datasets of variable density. The method first applies a feature selection algorithm based on the consensus matrix to filter the feature set, keeping the subset of features most relevant to clustering and thereby reducing the dimension of the clustering space. The clustering step then takes the object of maximum density as the initial center of a new cluster and updates the cluster's radius according to the density distribution within the cluster and a density variation parameter, until expansion of the cluster is complete. Experiments indicate that ER-DBSCAN can identify and handle clusters of varying density and can cluster RNA secondary structures effectively.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2015, No. 9, pp. 1968-1972 (5 pages)
Keywords: RNA secondary structure; suboptimal structure; density-based clustering algorithm; feature selection
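
The clustering step described in the abstract (start from the densest object, then grow the cluster radius while the density holds up) can be illustrated with a small sketch. This is not the paper's ER-DBSCAN: the abstract does not give its update rule, so the function name `extensible_radius_clusters` and the parameters `r0`, `min_pts`, `growth`, `drop`, and `max_steps` are all assumptions made for illustration, and plain Euclidean points stand in for the selected RNA structure features.

```python
from math import dist


def extensible_radius_clusters(points, r0, min_pts=3, growth=1.6,
                               drop=0.5, max_steps=10):
    """Toy density clustering in which each cluster's radius may grow.

    Hypothetical rule (an assumption, not taken from the paper): a radius
    increase is accepted only if it captures new points and the density
    (points per r**dim) stays above `drop` times the previous density.
    """
    n = len(points)
    dim = len(points[0])
    # Local density: number of other points within the initial radius r0.
    density = [sum(1 for j in range(n)
                   if j != i and dist(points[i], points[j]) <= r0)
               for i in range(n)]
    labels = [-1] * n   # -1 marks noise / unassigned points
    radii = []          # final radius of each cluster, indexed by label

    # Visit candidate centres from densest to sparsest.
    for centre in sorted(range(n), key=lambda i: -density[i]):
        if labels[centre] != -1 or density[centre] + 1 < min_pts:
            continue

        def free_within(radius):
            # Unassigned points inside the ball around the current centre.
            return [i for i in range(n)
                    if labels[i] == -1
                    and dist(points[centre], points[i]) <= radius]

        radius = r0
        members = free_within(radius)
        if len(members) < min_pts:
            continue
        for _ in range(max_steps):
            new_radius = radius * growth
            grown = free_within(new_radius)
            if len(grown) == len(members):
                break  # nothing new captured: expansion is complete
            old_density = len(members) / radius ** dim
            new_density = len(grown) / new_radius ** dim
            if new_density < drop * old_density:
                break  # density fell too sharply: keep the old radius
            radius, members = new_radius, grown
        for i in members:
            labels[i] = len(radii)
        radii.append(radius)
    return labels, radii


# A tight cluster, a looser cluster with two rings, and one outlier.
tight = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
loose = [(10.0, 10.0),
         (10.5, 10.0), (9.5, 10.0), (10.0, 10.5), (10.0, 9.5),
         (10.9, 10.0), (9.1, 10.0), (10.0, 10.9), (10.0, 9.1)]
outlier = [(50.0, 50.0)]
labels, radii = extensible_radius_clusters(tight + loose + outlier, r0=0.6)
print(labels, radii)
```

In this toy data the looser cluster only picks up its outer ring after one radius expansion, while the tight cluster keeps the initial radius and the outlier stays unassigned. A real application would replace the Euclidean distance with a distance between structures in the reduced feature space.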

