This study aimed to screen the chicken BCO2 gene for non-synonymous single nucleotide polymorphisms (nsSNPs) with potential biological function. Eight BCO2 nsSNPs were retrieved from the SNP database, and SIFT, PolyPhen-2, PANTHER, and PROVEAN were used to predict whether the resulting amino acid substitutions might affect BCO2 function. Post-translational modification sites and the evolutionary conservation of sites in the amino acid sequence encoded by chicken BCO2 were then predicted, and SWISS-MODEL was used to build three-dimensional structures of the wild-type and mutant BCO2 proteins. The results indicate that three nsSNPs (rs739117331, rs735703078, and rs736211538) may severely affect BCO2 protein function.
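As an illustration of how calls from several predictors might be combined, the following is a minimal sketch that counts how many tools flag a substitution as damaging. The agreement threshold (`min_tools`), the function names, and the example calls are hypothetical; the abstract does not state how the four predictions were combined.

```python
from typing import Dict


def count_damaging_calls(calls: Dict[str, bool]) -> int:
    """Count how many tools flagged the substitution as damaging."""
    return sum(1 for is_damaging in calls.values() if is_damaging)


def is_high_risk(calls: Dict[str, bool], min_tools: int = 3) -> bool:
    """Flag a variant when at least `min_tools` tools agree (assumed rule)."""
    return count_damaging_calls(calls) >= min_tools


# Hypothetical calls for one placeholder variant; not results from the study.
example_calls = {
    "SIFT": True,
    "PolyPhen-2": True,
    "PANTHER": False,
    "PROVEAN": True,
}

print(count_damaging_calls(example_calls))
print(is_high_risk(example_calls))
```

A stricter screen would set `min_tools` to the number of tools so that only unanimous calls are kept.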
The establishment of a landscape of enhancers across human cells is crucial to deciphering the mechanisms of gene regulation, cell differentiation, and disease development. High-throughput experimental approaches, which have successfully reported enhancers in typical cell lines, are still too costly and time-consuming for systematic identification of enhancers specific to different cell lines. Existing computational methods, which predict regulatory elements from DNA sequences alone, lack the power of cell line-specific screening. Recent studies have suggested that the chromatin accessibility of a DNA segment is closely related to its potential regulatory function, and thus may provide useful information for identifying regulatory elements. Motivated by this understanding, we integrate DNA sequences and chromatin accessibility data to predict enhancers in a cell line-specific manner. We propose DeepCAPE, a deep convolutional neural network that predicts enhancers by integrating DNA sequences and DNase-seq data. Benefiting from its feature extraction mechanism and skip connection strategy, our model not only consistently outperforms existing methods in the imbalanced classification of cell line-specific enhancers against background sequences, but also adapts to datasets of different sizes. In addition, by adopting an autoencoder, our model is capable of making cross-cell line predictions. We further visualize the kernels of the first convolutional layer and show that the identified sequence signatures match known motifs. Finally, we demonstrate the potential of our model to explain the functional implications of putative disease-associated genetic variants and to discriminate disease-related enhancers. The source code and a detailed tutorial for DeepCAPE are freely available at https://github.com/ShengquanChen/DeepCAPE.
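To make the idea of integrating DNA sequence with chromatin accessibility concrete, here is a minimal sketch of one simple way to build a joint input: a one-hot encoding of the sequence with a per-base accessibility track appended as an extra row. This is an assumption for illustration only; the function names are hypothetical, and the abstract does not describe how DeepCAPE actually encodes or merges the two data types.

```python
from typing import List, Sequence

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Return a 4 x len(seq) matrix; unknown bases such as N give all-zero columns."""
    seq = seq.upper()
    return [[1.0 if base == b else 0.0 for base in seq] for b in BASES]


def combine_inputs(seq: str, accessibility: Sequence[float]) -> List[List[float]]:
    """Stack the one-hot rows with one accessibility row (assumed layout)."""
    if len(seq) != len(accessibility):
        raise ValueError("sequence and accessibility track must have equal length")
    return one_hot(seq) + [[float(x) for x in accessibility]]


# Hypothetical 5-bp window with a made-up accessibility signal.
features = combine_inputs("ACGTN", [0.0, 0.5, 1.0, 0.5, 0.0])
print(len(features), len(features[0]))
```

The resulting matrix has five rows (A, C, G, T, accessibility) and one column per base, a shape that a one-dimensional convolutional layer could consume directly.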
A substitution on an amino acid sequence can be defined as "intolerant" (non-neutral) or "tolerant" (neutral) according to whether or not it detectably alters protein phenotypes (e.g.,
Funding (DeepCAPE study): Partially supported by the National Key R&D Program of China (Grant No. 2018YFC0910404), the National Natural Science Foundation of China (Grant Nos. 61873141, 61721003, 61573207, 71871019, 71471016, 71531013, and 71729001), and the Tsinghua-Fuzhou Institute for Data Technology, China.
Funding (amino acid substitution study): Supported by the National Natural Science Foundation of China (30870827).