Density-based Clustering with Extensible Radius for RNA Secondary Structure Prediction

Abstract: When RNA secondary structure is predicted from a free energy model, the true structure may not be the minimum free energy structure but a suboptimal structure within some energy increment above it. Grouping the suboptimal structures into a small number of clusters and selecting a representative structure for each cluster can improve prediction accuracy. This paper presents ER-DBSCAN, a density-based clustering algorithm with extensible radius, for RNA secondary structure datasets of variable density. The method first applies a feature selection algorithm based on the consensus matrix to filter the feature set, keeping the features most relevant to clustering and thereby reducing the dimension of the clustering space. The clustering module then takes the object of maximum density as the initial center of a new cluster and updates the cluster radius according to the density distribution within the cluster and a density variation parameter until expansion is complete. Experiments indicate that ER-DBSCAN can detect and handle clusters of varying density and clusters RNA secondary structures effectively.
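The abstract gives only the outline of the clustering step, so the following Python sketch is one possible reading of it and not the authors' ER-DBSCAN. It assumes structures have already been reduced to numeric feature vectors, seeds each cluster at the unassigned point of maximum density, and grows the radius ring by ring while the ring density stays above a fraction of the current cluster density. The function name `er_cluster` and the parameters `eps`, `step`, `alpha`, and `min_pts` are hypothetical, as is the specific stopping rule.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def er_cluster(points, eps, step, alpha, min_pts=2):
    """Hypothetical extensible-radius clustering sketch.

    eps     : initial cluster radius (also used for the density estimate)
    step    : amount by which the radius is extended per round
    alpha   : density variation parameter (assumed meaning: the next ring
              must keep at least alpha times the current cluster density)
    Returns one label per point; -1 marks noise.
    """
    n = len(points)
    dim = len(points[0])
    # Local density: number of other points within eps.
    density = [
        sum(1 for j in range(n) if j != i and dist(points[i], points[j]) <= eps)
        for i in range(n)
    ]
    labels = [None] * n
    cluster_id = 0
    while True:
        free = [i for i in range(n) if labels[i] is None]
        if not free:
            break
        # Seed the next cluster at the unassigned point of maximum density.
        center = max(free, key=lambda i: density[i])
        d = {i: dist(points[center], points[i]) for i in free}
        radius = eps
        members = [i for i in free if d[i] <= radius]
        if len(members) < min_pts:
            labels[center] = -1  # too isolated to seed a cluster
            continue
        inner = len(members) / radius ** dim
        while True:
            # Candidate ring between the current and the extended radius.
            ring = [i for i in free if radius < d[i] <= radius + step]
            ring_density = len(ring) / ((radius + step) ** dim - radius ** dim)
            if not ring or ring_density < alpha * inner:
                break  # density dropped too sharply: expansion is complete
            radius += step
            members.extend(ring)
            inner = len(members) / radius ** dim
        for i in members:
            labels[i] = cluster_id
        cluster_id += 1
    return labels

# Toy data: a tight cluster, a looser cluster, and one outlier.
dense = [(0, 0), (0.1, 0), (0, 0.1), (-0.1, 0), (0, -0.1)]
sparse = [(10, 10), (11, 10), (10, 11), (9, 10), (10, 9),
          (12, 10), (10, 12), (8, 10), (10, 8)]
points = dense + sparse + [(50, 50)]
labels = er_cluster(points, eps=1.2, step=1.0, alpha=0.3)
print(labels)
```

On this toy data the looser cluster is only captured in full because its radius is extended once beyond `eps`, which is the behaviour a single fixed radius cannot provide for clusters of different density.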

Authors: Wang Changwu, Wang Xiuqin, Wei Zhenzhen, Wang Baowen, Liu Wenyuan, Li Yongqiang

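Affiliations: School of Information Science and Engineering, Yanshan University; Hebei Key Laboratory of Computer Virtual Technology and System Integration; First Hospital of Qinhuangdao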

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 9, pp. 1968-1972 (5 pages)

Keywords: RNA secondary structure; suboptimal structure; density-based clustering algorithm; feature selection

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]

