Deep fusion model for predicting differential gene expression from histone modification data
Abstract: Predicting Differential Gene Expression (DGE) from large-scale Histone Modification (HM) data faces two problems: existing methods do not make proper use of Cell type-Specificity (CS) information or of the similarities and differences between cell types, and the input is large and computationally expensive. To address these problems, a deep learning method named dcsDiff is proposed. First, multiple AutoEncoders (AEs) and Bi-directional Long Short-Term Memory (Bi-LSTM) networks reduce the dimensionality of the HM signals and model them to obtain embedded representations. Then, multiple Convolutional Neural Networks (CNNs) mine the combined effects of HMs within each cell type, as well as the similarities and differences of each HM and the joint effects of all HMs between the two cell types. Finally, the two kinds of information are fused to predict DGE between the two cell types. In experiments on 10 pairs of cell types from the REMC (Roadmap Epigenomics Mapping Consortium) database, compared with DeepDiff, dcsDiff increased the Pearson Correlation Coefficient (PCC) of DGE prediction by up to 7.2% and by 3.9% on average, increased the number of accurately detected differentially expressed genes by up to 36 and by 17.6 on average, and reduced running time by 78.7%. A component analysis experiment confirmed the benefit of integrating the two kinds of information, and the parameters of dcsDiff were also determined experimentally. The results show that dcsDiff can effectively improve the efficiency of DGE prediction.
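The abstract reports prediction quality as a Pearson Correlation Coefficient (PCC) between predicted and observed DGE. The sketch below shows how such a score could be computed; it is a generic illustration, not the authors' evaluation code. The function name `pearson_correlation` and the sample values are hypothetical, and treating DGE as one real-valued score per gene (for example a log fold change) is an assumption the abstract does not confirm.

```python
import math


def pearson_correlation(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two values are required")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sum of products of deviations (unnormalized covariance).
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    # Sums of squared deviations (unnormalized variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    if var_p == 0 or var_o == 0:
        raise ValueError("PCC is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)


# Hypothetical per-gene DGE values (assumed here to be log fold changes).
observed = [1.2, -0.5, 0.3, 2.1, -1.7]
predicted = [0.9, -0.2, 0.1, 1.8, -1.1]

pcc = pearson_correlation(predicted, observed)
print(f"PCC = {pcc:.3f}")
```

A PCC close to 1 means the predicted values track the observed ones almost linearly. The abstract's "7.2% at the highest" then refers to the gain in this score over DeepDiff on individual cell-type pairs.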
Authors: LI Xin; JIA Tao (College of Computer and Information Science, Southwest University, Chongqing 400715, China)
Source: Journal of Computer Applications (CSCD; Peking University Core Journal), 2022, No. 11, pp. 3404-3412 (9 pages)
Funding: China University Industry-University-Research Innovation Fund of the Ministry of Education (2021ALA03016).
Keywords: Histone Modification (HM); Differential Gene Expression (DGE); Cell type-Specificity (CS); AutoEncoder (AE); Bi-directional Long Short-Term Memory (Bi-LSTM) network; information fusion; epigenetics