
Gene finding by integrating gene finders

Abstract: Gene finding, the accurate annotation of genomic DNA, has become one of the central topics in biological research. Although various computational methods (gene finders) have been proposed and developed, each has its own limitations. In this paper, we introduce an integrating gene finder, which combines the results of several existing gene finders to improve the accuracy of gene finding. Four integration schemes, all based on majority voting, are developed and evaluated on two datasets: the basic dataset, consisting of 1500 DNA sequences, and the testing dataset, consisting of 103 DNA sequences. It is demonstrated that a simple integration (a simple vote for each nucleotide) can significantly improve prediction performance, and that removing confusing gene finders, whether they perform poorly or produce redundant results, is important for further improving the integration. The best prediction results are obtained using weighted majority voting, with gene finders selected by the mRMR (Minimum Redundancy Maximum Relevance) method (Peng, 2005). The prediction accuracies are 84.16% and 90.06% for the basic dataset and the testing dataset respectively, which are better than those of any individual gene-finding software in our study.
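The per-nucleotide voting described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 0/1 encoding (1 = coding, 0 = non-coding), the tie-breaking rule, and the example weights are all assumptions made for illustration, and the abstract does not say how the weights are derived.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Simple per-nucleotide majority vote over several gene finders.

    Each inner sequence holds one finder's labels (assumed encoding:
    1 = coding, 0 = non-coding). Ties resolve to non-coding (an assumption).
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence length")
    n_finders = len(predictions)
    # zip(*predictions) yields, for each position, the labels from all finders.
    return [1 if 2 * sum(column) > n_finders else 0 for column in zip(*predictions)]


def weighted_majority_vote(
    predictions: Sequence[Sequence[int]], weights: Sequence[float]
) -> List[int]:
    """Weighted per-nucleotide vote; a position is coding when the finders
    voting 'coding' hold more than half of the total weight."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per gene finder")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence length")
    total = sum(weights)
    result = []
    for column in zip(*predictions):
        coding_weight = sum(w for w, label in zip(weights, column) if label == 1)
        result.append(1 if 2 * coding_weight > total else 0)
    return result
```

With three hypothetical finders, a weight vector such as `[3, 1, 1]` lets the first finder outvote the other two at positions where they disagree with it, which is the practical difference from the unweighted scheme. Selecting which finders to include (the role mRMR plays in the paper) is a separate step not shown here.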
Affiliation: not specified
Source: Journal of Biomedical Science and Engineering (生物医学工程(英文)), 2010, Issue 11, pp. 1061-1068 (8 pages)
Keywords: Gene Finding; Integration; mRMR