BERT and CNN-Based Deleterious Splicing Mutation Prediction Method
Abstract: A key challenge in genetic diagnosis is assessing the pathogenicity of splicing-related genetic mutations. Most existing tools for predicting deleterious splicing mutations are based on traditional machine learning and rely mainly on manually extracted splicing features, which limits further gains in predictive performance; performance is especially poor for non-canonical splicing mutations. This paper therefore proposes a BERT (Bidirectional Encoder Representations from Transformers) and CNN (Convolutional Neural Network)-based deleterious splicing mutation prediction method (BCsplice). The BERT module in BCsplice comprehensively extracts the contextual information of a sequence; combined with a CNN that extracts local features, the model can fully learn the semantic information of the sequence and predict the pathogenicity of splicing mutations. The effect of non-canonical splicing mutations often depends more heavily on the deep semantic information of the sequence context. Using the CNN to combine and extract BERT's multi-level semantic information yields a rich representation, which helps identify non-canonical splicing mutations. Comparative experiments show that BCsplice compares favorably with existing methods, with a certain performance advantage in non-canonical splicing regions in particular, and that it can support the identification of deleterious splicing mutations and clinical genetic diagnosis.
Authors: SONG Chengcheng (宋程程); ZHAO Yiran (赵依然); LI Xiaoyan (李晓艳); XIA Junfeng (夏俊峰) (Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》; indexed in EI, CSCD, and the Peking University Core Journals list), 2024, No. 2, pp. 181-190 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (No. U22A2038).
Keywords: Deleterious Splicing Mutation; Deep Learning; Prediction Model; Pathogenicity Prediction
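
The abstract describes BCsplice as using a CNN to combine multi-level BERT representations before predicting pathogenicity. This record gives no architectural details, so the sketch below is only a toy illustration of that general idea and not the authors' implementation. Random numbers stand in for BERT hidden states, and every function name, layer count, kernel size, and weight here is a hypothetical choice made for illustration.

```python
import math
import random


def combine_layers(layer_outputs):
    """Concatenate per-token vectors from several encoder layers along the channel axis.

    layer_outputs: list of layers, each a list of token vectors (tokens x dims).
    Returns a list of token vectors (tokens x (layers * dims)).
    """
    n_tokens = len(layer_outputs[0])
    return [[v for layer in layer_outputs for v in layer[t]] for t in range(n_tokens)]


def conv1d_valid(x, kernel, bias=0.0):
    """1D convolution over the token axis (no padding), followed by ReLU.

    x: tokens x channels; kernel: width x channels. Returns one value per window.
    """
    width = len(kernel)
    out = []
    for start in range(len(x) - width + 1):
        acc = bias
        for k in range(width):
            for c, w in enumerate(kernel[k]):
                acc += w * x[start + k][c]
        out.append(max(0.0, acc))  # ReLU
    return out


def score(layer_outputs, kernels, out_weights, out_bias=0.0):
    """Combine layers, apply each filter, global-max-pool, then a sigmoid output unit."""
    x = combine_layers(layer_outputs)
    pooled = [max(conv1d_valid(x, kernel)) for kernel in kernels]
    logit = out_bias + sum(w * p for w, p in zip(out_weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical sizes: 3 encoder layers, 10 tokens, 4 dimensions per token.
rng = random.Random(0)
N_LAYERS, N_TOKENS, N_DIMS, KERNEL_WIDTH, N_FILTERS = 3, 10, 4, 3, 2

# Random stand-ins for BERT hidden states (a real model would supply these).
layers = [
    [[rng.uniform(-1, 1) for _ in range(N_DIMS)] for _ in range(N_TOKENS)]
    for _ in range(N_LAYERS)
]
# Untrained, random convolution filters and output weights.
kernels = [
    [[rng.uniform(-0.5, 0.5) for _ in range(N_LAYERS * N_DIMS)] for _ in range(KERNEL_WIDTH)]
    for _ in range(N_FILTERS)
]
out_weights = [rng.uniform(-1, 1) for _ in range(N_FILTERS)]

probability = score(layers, kernels, out_weights)
print(f"toy pathogenicity score: {probability:.3f}")
```

Because the weights are random and untrained, the printed score carries no biological meaning; the point is only the data flow from stacked layer outputs, through convolution and pooling, to a single probability.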