Abstract: Transcription factors (TFs) are key cellular components that control gene expression. They recognize specific DNA sequences, the TF binding sites (TFBSs), and are thus targeted to specific regions of the genome, where they can recruit transcriptional co-factors and/or chromatin regulators to fine-tune spatiotemporal gene regulation. Identifying TFBSs in genomic sequences and modeling them quantitatively is therefore crucial for understanding and predicting gene expression. Here, we review how TFBSs can be determined experimentally, how TFBS models can be constructed in silico, and how they can be optimized by taking into account features such as position interdependence within TFBSs and DNA shape, and/or by introducing state-of-the-art computational algorithms such as deep learning methods. In addition, we discuss the integration of context variables into TFBS modeling, including nucleosome positioning, chromatin states, methylation patterns, 3D genome architecture, and cooperative TF binding, in order to better predict TF binding in cellular contexts. Finally, we explore the possibilities of combining optimized TFBS models with technological advances, such as targeted TFBS perturbation by CRISPR, to better understand gene regulation, evolution, and plant diversity.
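As a rough illustration of what constructing a TFBS model in silico can look like in its simplest form, the sketch below builds a log-odds position weight matrix (PWM) from aligned binding sites and scans a sequence with it. This is a generic textbook-style example, not the method of the review: the function names, the pseudocount of 1, and the uniform background frequency of 0.25 are all assumptions made here. A plain PWM treats positions as independent, which is the simplification that position interdependence and DNA shape features are meant to improve on.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length, uppercase A/C/G/T sites.

    pseudocount and background are illustrative defaults (assumptions).
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score_window(pwm, window):
    """Sum per-position log-odds scores; positions are assumed independent."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the best-scoring window, or None if too short."""
    width = len(pwm)
    scored = [
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])
```

Calling `best_hit(build_pwm(sites), sequence)` returns the highest log-odds score and its start position on the forward strand only; scanning the reverse complement and choosing a score threshold are left out of this sketch.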
Funding: Supported by the Yat-Sen Innovative Talents Cultivation Program for Excellent Tutors.
Abstract: Transcription factor (TF) binding to its DNA target site plays an essential role in gene regulation. The location, orientation, and spacing of transcription factor binding sites (TFBSs) also affect the regulatory function of the TF. However, how the nucleosomal context of TFBSs influences TF binding and subsequent gene regulation remains to be elucidated. Using genome-wide nucleosome positioning and TF binding data in budding yeast, we found that the binding affinities of TFs to DNA tend to decrease with increasing nucleosome occupancy of the associated binding sites. We further demonstrated that the nucleosomal context of binding sites is correlated with gene regulation by the corresponding TF. Nucleosome-depleted TFBSs are linked to high gene activity and low expression noise, whereas nucleosome-covered TFBSs are associated with low gene activity and high expression noise. Moreover, nucleosome-covered TFBSs tend to disrupt coexpression of the corresponding TF target genes. We conclude that the nucleosomal context of binding sites influences TF binding affinity, which in turn affects how TFs regulate their target genes. This emphasizes the need to include the nucleosomal context of TFBSs when modeling gene regulation.
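One way to quantify a trend such as "binding affinity tends to decrease with increasing nucleosome occupancy" is a rank correlation between the two per-site measurements. The sketch below computes Spearman's rank correlation in plain Python. The abstract does not state which statistic the authors used, so the choice of Spearman's coefficient, the function names, and the input layout (two equal-length lists, one value per binding site) are assumptions made for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if an input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(occupancy, affinity):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(occupancy) != len(affinity) or len(occupancy) < 2:
        raise ValueError("need two equal-length inputs with at least 2 values")
    return pearson(average_ranks(occupancy), average_ranks(affinity))
```

A negative value from `spearman(occupancy, affinity)` would be consistent with the reported trend; assessing significance and controlling for confounders are outside the scope of this sketch.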
Abstract: This study uses bioinformatics to search for new regulatory structures in the non-coding DNA surrounding genes whose expression levels change significantly in response to oxidative stress. We hypothesized that all genes whose expression increases in response to oxidative stress may share the same motifs in non-coding DNA. To search for motifs, we assembled an integrated collection of transcription factor binding site databases: JASPAR, TRANSFAC, HOCOMOCO (Homo sapiens TFs), and UniPROBE (Mus musculus TFs). Two types of regulatory regions were analyzed: the promoter region, and a sequence that captures potential cis-regulatory modules. In the regulatory regions of genes whose expression increases in response to oxidative stress, in contrast to genes whose expression level did not change, the transcription factor families SOX (1-30) and HX (A, B, C, D) were identified.
Funding: Supported by the Project for the Biological Information and Information Processing Properties of Biological Systems from the Academy of Finland (No. 122973); the Project for the Structure-dynamics Relationships in Biological Network from the Academy of Finland (No. 132877); and the Finnish Funding Agency for Technology and Innovation Finland Distinguished Professor program (No. 1480/31/09).
Abstract: Identification of genetic signatures is the main objective of many computational oncology studies. A signature usually consists of numerous genes that are differentially expressed between two clinically distinct groups of samples, such as tumor subtypes. Prospectively, many signatures have been found to generalize poorly to other datasets and, thus, have rarely been accepted into clinical use. Recognizing the limited success of traditionally generated signatures, we developed a systems biology-based framework for robust identification of key transcription factors and their genomic regulatory neighborhoods. Applying the framework to the differences between gastrointestinal stromal tumor (GIST) and leiomyosarcoma (LMS) identified nine transcription factors (SRF, NKX2-5, CCDC6, LEF1, VDR, ZNF250, TRIM63, MAF, and MYC). Functional annotation of the resulting neighborhoods identified the biological processes that the key transcription factors regulate differently between the tumor types. Analyzing the differences in expression patterns with our approach yielded a more robust genetic signature and more biological insight into the diseases than a traditional genetic signature.
Abstract: Research on gene transcriptional regulation is one of the major challenges of the post-genome era. Bioinformatics has become more important with the rapid accumulation of complete genome sequences and advances in computational methods and related databases. This review discusses current computational approaches to promoter prediction, transcription factor binding site identification, composite element prediction, analysis of co-regulated gene expression, and phylogenetic footprinting in regulatory region analysis.