Funding: Supported in part by NSF MRI 0821263; NIH BISTI R01GM072080-01A1 grant; NIH ARRA Administrative Supplement to NIH BISTI R01GM072080-01A1; and NSF IIS grant, award No. 0916250.
Abstract: RNA secondary structure has become the most exploitable feature for ab initio detection of non-coding RNA (ncRNA) genes in genome sequences. Previous work has used minimum free energy (MFE) based methods to identify ncRNAs by measuring the stability and certainty of a sequence's fold. However, these methods have performed inconsistently across different ncRNA species. Designing new, reliable structural measures will help in developing effective ncRNA gene-finding tools. This paper introduces a new RNA structural measure based on a novel RNA secondary structure ensemble constrained by characteristics of native RNA tertiary structures. The new measure offers a marked performance improvement over previous structure-based methods. Test results on standard ncRNA benchmark datasets demonstrate that this method can effectively separate most ncRNA families from genome backgrounds.