Abstract: AIM To identify putative transcriptional factor binding sites (TFBS) from regulatory single nucleotide polymorphisms (rSNPs) that are significantly associated with disease. METHODS Genome-wide association studies have provided nearly 6500 disease- or trait-predisposing SNPs, of which 93% are located within non-coding regions such as gene regulatory or intergenic areas of the genome. In the regulatory region of a gene, a SNP can change the DNA sequence of a transcriptional factor (TF) motif and in turn may affect the process of gene regulation. SNP changes that affect gene expression and impact gene regulatory sequences such as promoters, enhancers, and silencers are known as rSNPs. Computational tools can be used to identify unique putative TFBS created by rSNPs that are associated with disease or sickness. Computational analysis was used to identify putative TFBS generated by the alleles of these rSNPs. RESULTS rSNPs within nine genes that have been significantly associated with disease or sickness were used to illustrate the diversity of unique putative TFBS that can be generated by their alleles. The genes studied are the adrenergic, beta, receptor kinase 1, the v-akt murine thymoma viral oncogene homolog 3, the activating transcription factor 3, the type 2 deiodinase gene, the endothelial Per-Arnt-Sim domain protein 1, the lysosomal acid lipase A, the signal transducer and activator of transcription 4, the thromboxane A2 receptor and the vascular endothelial growth factor A. From this sampling of SNPs among the nine genes, there are 73 potential unique TFBS generated by the common alleles compared to 124 generated by the minor alleles, indicating the diversity of potential TFs that are capable of regulating these genes. CONCLUSION From the diversity of unique putative binding sites for TFs, it was found that some TFs play a role in the disease or sickness being studied.
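The allele comparison described in this abstract can be illustrated with a small position weight matrix scan. This is a minimal sketch and not the study's actual pipeline: the count matrix, pseudocount, threshold, flanking sequence and function names are all hypothetical and chosen only for illustration. It scores every motif-width window that overlaps the SNP, once with the common allele and once with the minor allele, and reports whether each allele yields a site above the threshold.

```python
import math

# Hypothetical 4-position count matrix for an illustrative TF (not from any database).
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}
BACKGROUND = 0.25   # assumed uniform base frequency
PSEUDOCOUNT = 0.5   # assumed smoothing value


def log_odds_matrix(counts):
    """Convert per-position base counts into log2-odds weights."""
    width = len(counts["A"])
    weights = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * PSEUDOCOUNT
        weights.append({
            b: math.log2((counts[b][i] + PSEUDOCOUNT) / total / BACKGROUND)
            for b in "ACGT"
        })
    return weights


def best_site(seq, weights, snp_index):
    """Return (score, start) of the best-scoring window that overlaps snp_index."""
    width = len(weights)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(seq) - width)
    best = None
    for start in range(first, last + 1):
        score = sum(weights[i][seq[start + i]] for i in range(width))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def allele_specific_sites(flank5, common, minor, flank3, weights, threshold):
    """Score the best overlapping site for each allele and apply the threshold."""
    snp_index = len(flank5)
    result = {}
    for name, allele in (("common", common), ("minor", minor)):
        score, start = best_site(flank5 + allele + flank3, weights, snp_index)
        result[name] = {"score": score, "start": start, "bound": score >= threshold}
    return result


weights = log_odds_matrix(COUNTS)
# Hypothetical G>A SNP with made-up flanking sequence.
report = allele_specific_sites("TTAC", "G", "A", "TCC", weights, threshold=4.0)
print(report)
```

In this toy case the common allele completes an ACGT-like site and the minor allele removes it. A real analysis would repeat the scan over a full motif collection, which is how an allele can gain or lose many distinct candidate TFBS.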
Abstract: This study presents a bioinformatics search for new regulatory structures in the non-coding DNA surrounding genes whose expression levels change significantly in response to oxidative stress. We hypothesized that genes whose expression increases in response to oxidative stress may share the same motifs in their non-coding DNA. To search for motifs, we assembled an integrated collection of transcription factor binding site databases: JASPAR, TRANSFAC, Hocomoco (Homo sapiens TFs) and Uniprobe (Mus musculus TFs). Two types of regulatory regions were examined: the promoter region and a sequence that captures potential cis-regulatory modules. In the regulatory regions of genes whose expression increases in response to oxidative stress, in contrast to genes whose expression level did not change, the transcription factor families SOX (1-30) and HX (A, B, C, D) were identified.
Abstract: Transcription factors (TFs) are the core sentinels of gene regulation, functioning by binding to highly specific DNA sequences to activate or repress the recruitment of RNA polymerase. The ability to identify transcription factor binding sites (TFBSs) is necessary to understand gene regulation and infer regulatory networks. Although bioinformatics tools have been developed for years to improve computational identification of TFBSs, accurate prediction remains challenging because DNA motifs recognized by TFs are typically short and often lack obvious patterns. In this study we introduce a new attribute, motif distribution pattern (MDP), to assist in TFBS prediction. MDP was developed using a TF distribution pattern curve generated by analyzing 25 yeast TFs and 37 of their experimentally validated binding motifs, followed by calculating a scoring value to quantify the reliability of each motif prediction. MDP was then tested using another set of 7 TFs with known binding sites to validate the approach in silico. The method was further tested in a non-yeast system using the filamentous fungus Magnaporthe oryzae transcription factor MoCRZ1. We demonstrate superior prediction reranking results using MDP over the commonly used program MEME and four other predictors. The data showed significant improvements in the ranking of validated TFBSs and provide a more sensitive, statistics-based approach for motif discovery.
Funding: Supported by funds from the National Key R&D Program of China (2016YFC0901603), the China 863 Program (2015AA020108), and the State Key Laboratory of Protein and Plant Gene Research; supported in part by the National Program for Support of Top-notch Young Professionals.
Abstract: Understanding the functional effects of genetic variants is crucial in modern genomics and genetics. Transcription factor binding sites (TFBSs) are among the most important cis-regulatory elements. While multiple tools have been developed to assess the functional effects of genetic variants at TFBSs, they usually assume that each variant works in isolation and neglect the potential "interference" among multiple variants within the same TFBS. In this study, we present COPE-TFBS (Context-Oriented Predictor for variant Effect on Transcription Factor Binding Site), a novel method that considers sequence context to accurately predict variant effects on TFBSs. We systematically re-analyzed the sequencing data from both the 1000 Genomes Project and the Genotype-Tissue Expression (GTEx) Project via COPE-TFBS, and identified a number of novel TFBSs, transformed TFBSs and discordantly annotated TFBSs resulting from multiple variants, further highlighting the necessity of sequence context in accurately annotating genetic variants.
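The "interference" among variants in one TFBS can be shown with a toy example. This is a minimal sketch under stated assumptions and does not reproduce COPE-TFBS itself: the consensus, the match/mismatch scoring, the threshold, the reference sequence and the two variants are all hypothetical. It calls the site once per variant in isolation and once with both variants applied to the same haplotype.

```python
# Hypothetical scoring scheme: +2 per consensus match, -1 per mismatch.
CONSENSUS = "ACGT"
THRESHOLD = 3  # assumed minimum score for calling the site bound


def site_score(site):
    """Score a candidate site against the consensus."""
    return sum(2 if base == want else -1 for base, want in zip(site, CONSENSUS))


def apply_variants(seq, variants):
    """Return seq with each (position, alt_base) substitution applied."""
    bases = list(seq)
    for pos, alt in variants:
        bases[pos] = alt
    return "".join(bases)


def is_bound(seq, site_start):
    """True if the window starting at site_start scores at or above THRESHOLD."""
    site = seq[site_start:site_start + len(CONSENSUS)]
    return site_score(site) >= THRESHOLD


reference = "GGACGTAA"           # made-up sequence; the site occupies positions 2-5
site_start = 2
variants = [(3, "T"), (5, "C")]  # two hypothetical SNVs inside the same site

# Each variant evaluated alone, then both on the same haplotype.
isolated = [is_bound(apply_variants(reference, [v]), site_start) for v in variants]
joint = is_bound(apply_variants(reference, variants), site_start)
print("isolated calls:", isolated)
print("joint call:", joint)
```

Here each variant alone leaves the site above the threshold, while the two together push it below. A per-variant analysis would therefore miss the loss of the site.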
Funding: Supported by the National Science Foundation (#DBI-0844749 and #DBI-1356459) to JTG.
Abstract: Transcription factors (TFs) are a very diverse family of DNA-binding proteins that play essential roles in the regulation of gene expression through binding to specific DNA sequences. They are considered prime drug targets since mutations and aberrant TF-DNA interactions are implicated in many diseases. Identification of TF-binding sites on a genomic scale represents a critical step in delineating transcription regulatory networks and remains a major goal in genomic annotation. Recent development of experimental high-throughput technologies has provided valuable information about TF-binding sites at genome scale under various physiological and developmental conditions. Computational approaches can provide a cost-effective alternative and complement the experimental methods by using the vast quantities of available sequence or structural information. In this review we focus on structure-based prediction of transcription factor binding sites. In addition to their potential in genome-scale predictions, structure-based approaches can help us better understand TF-DNA interaction mechanisms and the evolution of transcription factors and their target binding sites. The success of structure-based methods also bears a translational impact on targeted drug design in medicine and biotechnology.
Funding: Supported by the Higher Education Commission, Pakistan (Grant No. 20-1493/R&D/09).
Abstract: Recent advances in the development of high-throughput tools have significantly revolutionized our understanding of molecular mechanisms underlying normal and dysfunctional biological processes. Here we present a novel computational tool, transcription factor search and analysis tool (TrFAST), which was developed for the in silico analysis of transcription factor binding sites (TFBSs) of signaling pathway-specific TFs. TrFAST facilitates searching as well as comparative analysis of regulatory motifs through an exact pattern matching algorithm followed by the graphical representation of matched binding sites in multiple sequences up to 50 kb in length. TrFAST reduces the number of comparisons through its exact pattern matching strategy. In contrast to pre-existing tools that find TFBSs in a single sequence, TrFAST seeks out the desired pattern in multiple sequences simultaneously. It computes the GC content within the given multiple-sequence data set and assembles the combinational details of consensus sequence(s) located at these regions, thereby generating a visual display based on the abundance of each unique pattern. Simultaneous comparative analysis of the regulatory regions of multiple orthologous sequences enhances the features of TrFAST and provides significant insight into the conservation of non-coding cis-regulatory elements. TrFAST is freely available at http://www.fi-pk.com/trfast.html.
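The two operations the abstract names, exact pattern matching across several sequences and a GC-content count, can be sketched in a few lines. This is an illustrative sketch only and says nothing about how TrFAST is implemented: the function names, sequence labels and sequences below are hypothetical.

```python
def find_exact(seq, pattern):
    """Return 0-based start positions of every exact match, overlaps included."""
    hits = []
    start = seq.find(pattern)
    while start != -1:
        hits.append(start)
        start = seq.find(pattern, start + 1)
    return hits


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def scan(sequences, pattern):
    """Search every named sequence for the pattern and record its GC content."""
    pattern = pattern.upper()
    summary = {}
    for name, seq in sequences.items():
        seq = seq.upper()
        summary[name] = {
            "hits": find_exact(seq, pattern),
            "gc": round(gc_content(seq), 3),
        }
    return summary


# Hypothetical orthologous promoter fragments; names and sequences are made up.
sequences = {
    "human": "ttTGACGTCAggcTGACGTCA",
    "mouse": "ccTGACGTCAaatt",
    "fish": "aattccggaatt",
}
summary = scan(sequences, "TGACGTCA")
for name, info in summary.items():
    print(name, info)
```

Collecting hit positions per sequence in one dictionary makes it straightforward to compare where a motif occurs across orthologs, which is the kind of side-by-side view the abstract describes.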
Funding: Supported by the Yat-Sen Innovative Talents Cultivation Program for Excellent Tutors.
Abstract: Transcription factor (TF) binding to its DNA target site plays an essential role in gene regulation. The location, orientation and spacing of transcription factor binding sites (TFBSs) also affect the regulatory function of the TF. However, how the nucleosomal context of TFBSs influences TF binding and subsequent gene regulation remains to be elucidated. Using genome-wide nucleosome positioning and TF binding data in budding yeast, we found that the binding affinities of TFs to DNA tend to decrease with increasing nucleosome occupancy of the associated binding sites. We further demonstrated that the nucleosomal context of binding sites is correlated with gene regulation by the corresponding TF. Nucleosome-depleted TFBSs are linked to high gene activity and low expression noise, whereas nucleosome-covered TFBSs are associated with low gene activity and high expression noise. Moreover, nucleosome-covered TFBSs tend to disrupt coexpression of the corresponding TF target genes. We conclude that the nucleosomal context of binding sites influences TF binding affinity, subsequently affecting the regulation of target genes by TFs. This emphasizes the need to include the nucleosomal context of TFBSs in modeling gene regulation.
Abstract: Transcription factors (TFs) are key cellular components that control gene expression. They recognize specific DNA sequences, the TF binding sites (TFBSs), and thus are targeted to specific regions of the genome where they can recruit transcriptional co-factors and/or chromatin regulators to fine-tune spatiotemporal gene regulation. Therefore, the identification of TFBSs in genomic sequences and their subsequent quantitative modeling is of crucial importance for understanding and predicting gene expression. Here, we review how TFBSs can be determined experimentally, how TFBS models can be constructed in silico, and how they can be optimized by taking into account features such as position interdependence within TFBSs and DNA shape, and/or by introducing state-of-the-art computational algorithms such as deep learning methods. In addition, we discuss the integration of context variables into TFBS modeling, including nucleosome positioning, chromatin states, methylation patterns, 3D genome architectures, and TF cooperative binding, in order to better predict TF binding under cellular contexts. Finally, we explore the possibilities of combining the optimized TFBS model with technological advances, such as targeted TFBS perturbation by CRISPR, to better understand gene regulation, evolution, and plant diversity.
Funding: Supported by the National Natural Science Foundation of China (31922068), the Fundamental Research Funds for the Central Universities (ZK202101), and the China Postdoctoral Science Foundation (2019M662666).
Abstract: Knowledge of the transcription factor binding landscape (TFBL) is necessary to analyze gene regulatory networks for important agronomic traits. However, a low-cost and high-throughput in vivo chromatin profiling method is still lacking in plants. Here, we developed a transient and simplified cleavage under targets and tagmentation (tsCUT&Tag) method that combines transient expression of transcription factor proteins in protoplasts with a simplified CUT&Tag without nucleus extraction. Our tsCUT&Tag method provided higher data quality and signal resolution with lower sequencing depth compared with traditional ChIP-seq. Furthermore, we developed a strategy combining tsCUT&Tag with machine learning, which has great potential for profiling the TFBL across plant development.
Funding: Supported by the National Key Research & Development Program of China (2018YFA0900504, 2020YFA0907700, and 2018YFA0900300), the National Natural Foundation of China (31401674), the National First-Class Discipline Program of Light Industry Technology and Engineering (LITE2018-22), and the Top-notch Academic Programs Project of Jiangsu Higher Education Institutions. This research grant was awarded to author Youran Li.
Abstract: Transcription factors play an indispensable role in maintaining cellular viability and finely regulating complex internal metabolic networks. These crucial bioactive functions rely on their ability to respond to effectors and concurrently interact with binding sites. Recent advancements have brought innovative insights into the understanding of transcription factors. In this review, we comprehensively summarize the mechanisms by which transcription factors carry out their functions, along with the computational and experimental methods employed in their identification. Additionally, we highlight recent achievements in the application of transcription factors in various biotechnological fields, including cell engineering, human health, and biomanufacturing. Finally, the current limitations of research are discussed and prospects for future investigations are provided. This review will provide theoretical guidance for transcription factor engineering.