Enhancer-promoter interactions (EPIs) are involved in most gene transcriptional regulation in higher eukaryotes. Predicting EPIs from given genomic loci or DNA sequences is not a trivial task. The benchmarking work so far for EPI predictors is more or less empirical and lacks quantitative model-based comparisons, posing challenges for molecular biologists who need reliable EPI predictions. Here, we present an EPI prediction platform, Delta.EPI. Based on a statistical model of data integration, Delta.EPI is capable of comprehensively assessing the predictions from four state-of-the-art EPI predictors. Equipped with a user-friendly interface and visualization platform, Delta.EPI presents the sorted results with the confidence of EPI relevance, which may guide molecular biologists who lack prior knowledge of EPI prediction algorithms. Last, we showcase the utility of Delta.EPI with a case study. Delta.EPI provides a powerful tool to fuel gene regulation and 3D genome studies through easy-to-access EPI predictions. Delta.EPI can be freely accessed at https://ngdc.cncb.ac.cn/deltaEPI/.
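The abstract does not describe the statistical model that Delta.EPI uses to integrate the four predictors, so the following Python sketch is only an illustration of the general idea of combining several predictors into one sorted list with a confidence value. It assumes each predictor returns a raw score per enhancer-promoter pair, converts those scores to rank-based values so that differently scaled predictors become comparable, and averages them. The function names, the predictor names, and the averaging rule are all hypothetical stand-ins, not Delta.EPI's method.

```python
from statistics import mean


def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank; the highest raw score maps to 1.0.

    Each pair receives the fraction of pairs whose score is <= its own,
    so tied scores receive identical values.
    """
    values = list(scores.values())
    n = len(values)
    return {pair: sum(v <= s for v in values) / n for pair, s in scores.items()}


def combine_predictions(predictions):
    """Average rank-normalized scores across predictors (illustrative rule only).

    `predictions` maps a predictor name to {(enhancer, promoter): raw_score}.
    A pair that a predictor did not score contributes 0.0 for that predictor.
    Returns (pair, confidence) tuples sorted from most to least confident.
    """
    normalized = [rank_normalize(s) for s in predictions.values()]
    pairs = set().union(*(n.keys() for n in normalized))
    combined = {p: mean(n.get(p, 0.0) for n in normalized) for p in pairs}
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical predictors with scores on different scales.
example = {
    "predictor_a": {("enh1", "prom1"): 0.9, ("enh2", "prom1"): 0.1},
    "predictor_b": {("enh1", "prom1"): 5.0, ("enh2", "prom1"): 2.0},
}
ranked = combine_predictions(example)
```

Rank normalization is chosen here only because it needs no assumptions about each predictor's score distribution; a model-based integration, as the abstract describes, would replace the simple mean.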
Background: Interactions between distal enhancers and proximal promoters play a crucial role in the cis-regulatory mechanism of the human genome. Enhancers, promoters, and enhancer-promoter interactions (EPIs) can be detected using many sequencing technologies and computational models. However, a systematic review that summarizes these EPI identification methods and that can help researchers apply and optimize them is still needed. Results: In this review, we first emphasize the role of EPIs in regulating gene expression and describe a generic framework for predicting enhancer-promoter interactions. Next, we review prediction methods for enhancers, promoters, loops, and enhancer-promoter interactions using different data features that have emerged since 2010, and we summarize the websites available for obtaining enhancer, promoter, and enhancer-promoter interaction datasets. Finally, we review the application of the methods for identifying EPIs in diseases such as cancer. Conclusions: Advances in computer technology have allowed traditional machine learning and deep learning methods to be used to predict enhancers, promoters, and EPIs from genetic, genomic, and epigenomic features. In the past decade, models based on deep learning, especially transfer learning, have been proposed for directly predicting enhancer-promoter interactions from DNA sequences, and these models can reduce the parameter training time required of bioinformatics researchers. We believe this review can provide detailed research frameworks for researchers who are beginning to study enhancers, promoters, and their interactions.
Background: In the human genome, distal enhancers are involved in regulating target genes through proximal promoters by forming enhancer-promoter interactions. Although recently developed high-throughput experimental approaches have allowed us to recognize potential enhancer-promoter interactions genome-wide, it is still largely unclear to what extent the sequence-level information encoded in our genome helps guide such interactions. Methods: Here we report a new computational method (named "SPEID") using deep learning models to predict enhancer-promoter interactions based on sequence-based features only, when the locations of putative enhancers and promoters in a particular cell type are given. Results: Our results across six different cell types demonstrate that SPEID is effective in predicting enhancer-promoter interactions as compared to state-of-the-art methods that only use information from a single cell type. As a proof-of-principle, we also applied SPEID to identify somatic non-coding mutations in melanoma samples that may have reduced enhancer-promoter interactions in tumor genomes. Conclusions: This work demonstrates that deep learning models can help reveal that sequence-based features alone are sufficient to reliably predict enhancer-promoter interactions genome-wide.
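To make "sequence-based features only" concrete, the Python sketch below shows one-hot encoding, a common way to turn raw DNA into numeric input for a deep learning model. The abstract does not state which encoding SPEID uses, so this is an assumption for illustration; the function names `one_hot` and `encode_pair` are hypothetical, and the model itself is not shown.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows in A, C, G, T order.

    Bases outside ACGT (for example N) are encoded as an all-zero row.
    """
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def encode_pair(enhancer_seq, promoter_seq):
    """Encode a candidate enhancer-promoter pair as two one-hot matrices."""
    return one_hot(enhancer_seq), one_hot(promoter_seq)


# Toy sequences; real enhancers and promoters are far longer.
enhancer_matrix, promoter_matrix = encode_pair("ACGTN", "ttga")
```

A sequence-only predictor would take matrices like these for a given enhancer and promoter and output an interaction probability, without any epigenomic signal as input.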
Enhancers modulate gene expression by interacting with promoters. Models of enhancer-promoter interactions (EPIs) in the literature involve the activity of many components, including transcription factors and nucleic acids. However, the role that sequence similarity plays in EPIs remains largely unexplored. Herein, we report that Alu-derived sequences dominate sequence similarity between enhancers and promoters. After rejecting alternative DNA:DNA and DNA:RNA triplex models, we propose that enhancer-associated RNAs (eRNAs) may directly contact their targeted promoters by forming trans-acting R-loops at those Alu sequences. We show how the characteristic distributions of functional genomic data, such as RNA-DNA proximate ligation reads, binding of transcription factors, and RNA-binding proteins, all align with the Alu sequences of EPIs. We also show that these aligned Alu sequences may be subject to the constraint of coevolution, further implying the functional significance of these R-loop hybrids. Finally, our results imply that eRNAs and Alu elements associate in a manner previously unrecognized in EPIs and in the evolution of gene regulation networks in mammals.
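The abstract does not say how sequence similarity between enhancers and promoters was measured. As a rough illustration of what such a comparison can look like, the Python sketch below scores two sequences by the Jaccard index of their k-mer sets. This alignment-free measure is a simplified stand-in chosen for brevity, not the authors' method, and the function names and the default `k` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_jaccard(seq_a, seq_b, k=4):
    """Jaccard index of the two k-mer sets: shared k-mers over all k-mers.

    Returns 0.0 when both sequences are shorter than k.
    """
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Toy sequences sharing two of four distinct 2-mers.
similarity = kmer_jaccard("ACGT", "CGTA", k=2)
```

A shared repeat element such as an Alu in both the enhancer and the promoter would raise a score of this kind; a study of the phenomenon would more likely rely on sequence alignment and repeat annotation.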
Funding (Delta.EPI): Special Investigation on Science and Technology Basic Resources of MOST, China (2019FY100102); the Science and Technology Innovation 2030-Major Project (2022ZD04017); the National Key R&D Program of China (2018YFC2000400); the National Natural Science Foundation of China (31871331, 31671342, 91940304); and the Beijing Natural Science Foundation (Z200021).
Funding (EPI identification methods review): This study was funded by grants from the Foshan Higher Education Foundation (No. BKBS202203); the National Key R&D Program of China (No. 2018YFA0801402); the National Natural Science Foundation of China (No. 61971031); and the CAMS Innovation Fund for Medical Sciences (Nos. 2021-RC310-007, 2021-I2M-1-020 and 2022-I2M-1-020).
Funding (SPEID): the National Science Foundation (1252522 to Shashank Singh; 1054309 and 1262575 to Jian Ma) and the National Institutes of Health (HG007352 and DK107965 to Jian Ma).
Funding (Alu sequences and eRNA R-loops in EPIs): the National Natural Science Foundation of China (91940304, 31871331, 31671342); the Beijing Natural Science Foundation (Z200021); Special Investigation on Science and Technology Basic Resources of MOST, China (2019FY100102); the National Key R&D Program of China (2018YFC2000400); and the Beijing Advanced Discipline Fund (115200S001).