Abstract
Clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) is a new generation of gene editing technology. It relies on a single guide RNA (sgRNA) to recognize a specific genomic site and to guide the Cas9 nuclease to edit that site. However, off-target effects limit its development. In recent years, using deep learning to assist CRISPR/Cas9 off-target prediction has emerged as a new approach that can help researchers achieve more efficient and safer gene editing and gene therapy, but the prediction accuracy of existing deep learning models still has room for improvement. In this paper, we propose CnnCRISPR, a model based on a multi-scale convolutional neural network, to predict CRISPR/Cas9 off-target activity. First, the sgRNA and DNA sequences are each one-hot encoded, and the two resulting binary matrices are combined with a bitwise OR operation. Second, the encoded sequence is fed into an Inception-based network for training and validation. Finally, the model outputs the off-target prediction for each sgRNA-DNA sequence pair. Experiments on public datasets show that CnnCRISPR outperforms existing deep learning-based off-target prediction models, providing an effective and feasible method for studying the off-target problem.
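The encoding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the channel order (A, C, G, T), the treatment of U as T, the all-zero row for unknown bases, and the function names `one_hot` and `encode_pair` are all assumptions made for illustration.

```python
# Hypothetical sketch of one-hot encoding followed by a bitwise OR.
# Assumption: channels are ordered A, C, G, T (the abstract does not say).
BASES = "ACGT"


def one_hot(seq):
    """Return a len(seq) x 4 binary matrix as nested lists.

    Assumptions: U is treated as T, and any other symbol (e.g. N)
    becomes an all-zero row.
    """
    seq = seq.upper().replace("U", "T")
    return [[1 if base == b else 0 for b in BASES] for base in seq]


def encode_pair(sgrna, dna):
    """Combine the two one-hot matrices with an element-wise bitwise OR."""
    if len(sgrna) != len(dna):
        raise ValueError("sgRNA and DNA sequences must have equal length")
    return [
        [g | d for g, d in zip(row_g, row_d)]
        for row_g, row_d in zip(one_hot(sgrna), one_hot(dna))
    ]


if __name__ == "__main__":
    # Toy 4-nt pair with a single mismatch (C vs T) at the third position.
    for row in encode_pair("GACT", "GATT"):
        print(row)
```

Under these assumptions, a matched position yields a row with a single 1, while a mismatched position yields a row with two 1s, so the combined matrix marks where the sgRNA and the DNA differ. The OR loses which strand contributed which base at a mismatch, so this sketch shows only the operation named in the abstract, not any additional channels a full model might use.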
Authors
XIE Huanzeng (谢焕增)
HUANG Lingze (黄凌泽)
LUO Ye (罗烨)
ZHANG Guishan (张桂珊)
Affiliation: College of Engineering, Shantou University, Shantou 515063, Guangdong, China
Source
Chinese Journal of Biotechnology (《生物工程学报》)
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 3, pp. 858-876 (19 pages)
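Funding
National Natural Science Foundation of China (62103249)
Guangdong Basic and Applied Basic Research Foundation (2022A1515011720)
Guangdong Province Special Fund for Science and Technology, "Major Special Projects + Task List" (STKJ2021183)
Shantou University Scientific Research Start-up Fund (NTF20032)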