Abstract
MicroRNAs (miRNAs) are a class of short (about 21 nt) non-coding RNAs with important regulatory functions. miRNA precursors (pre-miRNAs) are described by many features derived from their primary sequence and secondary structure, some of which are redundant or irrelevant and do not improve the accuracy of a pre-miRNA classification model. Such features should therefore be removed to reduce the feature dimension and improve classification performance. For pre-miRNA sequence data, existing feature selection methods consider only the discriminative distance between features. This paper takes into account both the gain that each feature contributes to classification (information gain) and the redundancy among features, so that the selected features support an efficient classification model. Experimental results indicate that the selected feature subset improves the prediction performance of the pre-miRNA classifier and yields better classification results.
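The abstract does not state how information gain and feature redundancy are combined, so the following is only a minimal sketch under assumptions: it uses a greedy mRMR-style score (information gain with the class label minus the mean mutual information with features already selected), which may differ from the paper's actual criterion. The feature names, the toy data, and the function `select_features` are hypothetical and serve only as illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def select_features(columns, labels, k):
    """Greedily pick up to k feature names.

    Score = information gain with the labels minus the mean mutual
    information with the features already selected. This mRMR-style
    criterion is an assumption, not the paper's stated method.
    """
    gain = {name: mutual_information(col, labels) for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        best_name, best_score = None, float("-inf")
        for name, col in columns.items():
            if name in selected:
                continue
            redundancy = 0.0
            if selected:
                redundancy = sum(
                    mutual_information(col, columns[s]) for s in selected
                ) / len(selected)
            score = gain[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Made-up, already discretized feature values for eight candidate hairpins.
columns = {
    "f_a": [1, 1, 1, 0, 0, 0, 0, 0],       # informative about the label
    "f_a_copy": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of f_a (redundant)
    "f_b": [1, 0, 1, 1, 0, 1, 0, 0],       # weaker, but not a duplicate
}
labels = [1, 1, 1, 1, 0, 0, 0, 0]          # 1 = real pre-miRNA, 0 = pseudo hairpin

print(select_features(columns, labels, 2))
```

On this toy data, ranking by information gain alone would keep both `f_a` and its duplicate, whereas the redundancy penalty makes the second pick `f_b`. Real pre-miRNA features are often continuous, so they would need to be discretized (or a different estimator used) before a sketch like this applies.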
Source
《智能计算机与应用》
2012, No. 6, pp. 1-3, 10 (4 pages in total)
Intelligent Computer and Applications
Funding
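National Natural Science Foundation of China (60932008, 61172098, 61271346)
Specialized Research Fund for the Doctoral Program of Higher Education (20112302110040)
Fundamental Research Funds for the Central Universities (HIT.ICRST.2010 022)
Natural Science Foundation of Heilongjiang Province (F201119)
Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12521392, 12511401)
Harbin Young Science and Technology Innovation Talent Project (2012RFQXS094)
Heilongjiang University Youth Science Fund (QL201029)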