Abstract
DNA replication, the process by which a single original DNA molecule gives rise to two identical copies, occurs in all living organisms and is the basis of biological inheritance. Studying its mechanism is necessary both to understand this process in depth and to apply that understanding to new strategies against genetic diseases. In the post-genomic era, as the volume of DNA sequence data grows explosively, there is an urgent need for high-throughput tools that can identify replication origins from DNA sequence information alone. This paper proposes a new predictor, iROI-PCM, which represents DNA sequence samples by combining a series of auto-covariance and cross-covariance terms of a physicochemical attribute matrix and classifies them with a support vector machine (SVM). Rigorous cross-validation shows that the proposed predictor clearly outperforms existing predictors in sensitivity, specificity, accuracy, and stability, and it may be of some help to related research.
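As a rough illustration of the kind of encoding the abstract describes, the sketch below turns a DNA sequence into a fixed-length vector of lagged auto-covariance and cross-covariance terms. The abstract does not give the paper's formulas, so this follows a commonly used form and should not be read as the iROI-PCM definition. The two "properties" (`gc_count`, `purine_count`), the function names, and the `max_lag` parameter are all hypothetical stand-ins chosen so the example runs without any measured physicochemical data.

```python
from itertools import product

# Toy stand-ins for physicochemical properties (NOT measured values):
# each maps a dinucleotide to a number computed from its letters.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
PROPERTIES = {
    "gc_count": {d: sum(b in "GC" for b in d) for d in DINUCLEOTIDES},
    "purine_count": {d: sum(b in "AG" for b in d) for d in DINUCLEOTIDES},
}


def property_matrix(seq):
    """Rows are properties; columns are the overlapping dinucleotides of seq."""
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [[float(table[d]) for d in dinucs] for table in PROPERTIES.values()]


def covariance(x, y, lag):
    """Lagged covariance between two property profiles of equal length."""
    n = len(x) - lag
    if n <= 0:
        raise ValueError("sequence too short for this lag")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((x[i] - mean_x) * (y[i + lag] - mean_y) for i in range(n)) / n


def encode(seq, max_lag=2):
    """Concatenate auto-covariance (same property) and cross-covariance
    (different properties) terms for lags 1..max_lag."""
    rows = property_matrix(seq.upper())
    features = []
    for lag in range(1, max_lag + 1):
        for x in rows:
            for y in rows:
                features.append(covariance(x, y, lag))
    return features
```

With P properties and lags 1 to `max_lag`, every sequence maps to `max_lag * P * P` numbers regardless of its length, which is what lets a fixed-input classifier handle sequences of different sizes. Vectors of this kind could then be passed to an SVM implementation, for example scikit-learn's `SVC`, and scored with cross-validation.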
Author
YE Hanxiao (叶寒晓)
School of Mathematics and Information Science, Nanchang Normal University, Nanchang 330032, China; School of Statistics and Data Science, Jiangxi University of Finance and Economics, Nanchang 330013, China
Source
Journal of JingDeZhen University (《景德镇学院学报》)
2024, No. 3, pp. 18-22, 33 (6 pages)
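Funding
Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ202623)
Scientific Research Project of Nanchang Normal University (20KJYB07)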
Keywords
replication origin
physicochemical attribute
support vector machine (SVM)
cross-validation