Funding: This study was jointly funded by the National Natural Science Foundation of China (61803112, 32160151) and the Science and Technology Foundation of Guizhou Province (2019-2811).
Abstract: Essential ncRNAs are ncRNAs that are indispensable for the survival of an organism. Although essential ncRNAs cannot encode proteins, they are as important in biology as essential coding genes. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction, and evolutionary analysis. At present, essential ncRNAs have been determined at the whole-genome scale for very few species, because traditional experimental methods are time-consuming, laborious, and costly. In addition, these methods are restricted in the organisms they can cover, since less than 1% of bacteria can be cultured in the laboratory. It is therefore important to develop theories and methods for recognizing essential non-coding RNA. In this paper, we present a novel method for predicting essential ncRNAs that uses both compositional features and derivative features calculated from ncRNA sequences by information theory. The method was developed with a Support Vector Machine (SVM). Its accuracy was evaluated through cross-species cross-validation and found to be between 0.69 and 0.81. This shows that the selected features perform well for SVM-based prediction of essential ncRNAs. Thus, the method can be applied to discovering essential ncRNAs in bacteria.
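The abstract does not list the exact features, so the following is only an illustrative sketch of what "compositional" and "information-theoretic" sequence features could look like. It assumes k-mer relative frequencies as the compositional part and the Shannon entropy of each k-mer distribution as the derived part; the function names (`kmer_frequencies`, `shannon_entropy`, `feature_vector`) and the choice of k up to 2 are hypothetical and not taken from the paper. Vectors of this kind could then be passed to any SVM implementation, with cross-species validation done by training on some species and testing on another.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style T is mapped to U below


def kmer_frequencies(seq, k=1):
    """Relative frequency of every k-mer over ACGU (assumed compositional features)."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made purely of A/C/G/U are counted; ambiguous bases are ignored.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a frequency distribution."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def feature_vector(seq, max_k=2):
    """Concatenate k-mer frequencies and their entropy for k = 1..max_k."""
    vec = []
    for k in range(1, max_k + 1):
        freqs = kmer_frequencies(seq, k)
        vec.extend(freqs[m] for m in sorted(freqs))  # fixed, sorted k-mer order
        vec.append(shannon_entropy(freqs))
    return vec
```

With `max_k=2`, each sequence maps to 22 numbers: 4 mononucleotide frequencies plus their entropy, and 16 dinucleotide frequencies plus their entropy.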