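Abstract: Identifying disease-causing genes is one of the major challenges in genomic research. In recent years, as biological data have accumulated, many researchers have used computational methods to predict disease genes. Most of these methods, however, rely on gene interaction networks or other similarity networks, and few consider the potential link between the local network connectivity of specific genes and their differential expression. This paper uses the local structure of the gene interaction network together with differential expression information to explore the biological characteristics of cancer-causing genes and their neighboring genes, and then applies machine learning to predict cancer-causing genes based on the newly identified characteristics. First, gene expression data and known disease genes for 21 cancers were obtained from the TCGA (The Cancer Genome Atlas) and OMIM (Online Mendelian Inheritance in Man) databases. The human protein interaction network and the tissue-specific interaction network corresponding to each cancer were then used in turn as the background network to analyze the potential biological characteristics linking neighborhood information in the different biological networks with changes in gene expression before and after disease onset. Next, a vector representation of gene node features was defined on the basis of the discovered characteristics, and a support vector machine was used to predict disease genes. The results were validated against standard databases, including ICGC (International Cancer Genome Consortium), COSMIC (Catalogue Of Somatic Mutations In Cancer), NCG (Network of Cancer Genes), and OncoKB (Oncology Knowledge Base), against the related literature, and through disease annotation and pathway enrichment. The results show that defining gene features according to the discovered characteristics of disease genes can distinguish cancer-causing genes from other genes, provides strong hypotheses for cancer gene prediction, and supplies a reliable set of candidate disease genes for related biological experiments, thereby advancing research into the pathogenic mechanisms of cancer as a complex disease.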
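The abstract does not say which node features were used, so the following is only a minimal sketch of how local network structure and differential expression could be combined into a per-gene feature vector. The three features (a gene's own absolute log fold change, the mean absolute log fold change of its neighbors, and its degree), the function name `node_features`, the gene names, and all values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def node_features(gene, network, log_fc):
    """Return (own |logFC|, mean neighbor |logFC|, degree) for one gene.

    These three features are illustrative assumptions, not the paper's.
    """
    neighbors = network.get(gene, set())
    own = abs(log_fc.get(gene, 0.0))
    # Genes with no neighbors get a neighborhood score of 0.0.
    neigh = mean(abs(log_fc.get(n, 0.0)) for n in neighbors) if neighbors else 0.0
    return (own, neigh, len(neighbors))


# Toy undirected interaction network (hypothetical gene names).
network = {
    "G1": {"G2", "G3"},
    "G2": {"G1"},
    "G3": {"G1"},
    "G4": set(),
}
# Toy log2 fold changes between tumor and normal samples (made-up values).
log_fc = {"G1": 2.0, "G2": -1.0, "G3": 3.0, "G4": 0.5}

features = {g: node_features(g, network, log_fc) for g in network}
```

Vectors of this kind could then be passed to any classifier, such as the support vector machine named in the abstract, with known disease genes as positive labels.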
Funding: National Natural Science Foundation of China (81960877).
Abstract: Objective To explore the differential expression and mechanisms of bone formation-related genes in osteoporosis (OP) using bioinformatics and machine learning methods, and to predict the active ingredients of targeted traditional Chinese medicine (TCM) herbs. Methods The Gene Expression Omnibus (GEO) and GeneCards databases were used to screen comprehensively for genes and disease-associated loci pertinent to the pathogenesis of OP. An R package was used to identify differentially expressed genes. Least absolute shrinkage and selection operator (LASSO) logistic regression and the support vector machine-recursive feature elimination (SVM-RFE) algorithm were used to define the gene signature specific to OP. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted for the selected pivotal genes. The cell-type identification by estimating relative subsets of RNA transcripts (CIBERSORT) algorithm was used to examine immune cell infiltration patterns, and Spearman's rank correlation analysis was used to assess the relationship between the expression levels of the genes and the presence of immune cells. The Coremine Medical database was used to screen potential TCM herbs for the treatment of OP. The Comparative Toxicogenomics Database (CTD) was used to predict the TCM active ingredients targeting the key genes. AutoDock Vina 1.2.2 and GROMACS 2020 were used to explore the binding affinity and conformational dynamics between the TCM active ingredients and their biological targets. Results Ten genes were identified by intersecting the results from the GEO and GeneCards databases. Using LASSO regression and the SVM-RFE algorithm, four pivotal genes were selected: coat protein (CP), kallikrein 3 (KLK3), polymerase γ (POLG), and transient receptor potential vanilloid 4 (TRPV4). GO and KEGG pathway enrichment analyses revealed that these signature genes were predominantly involved in the regulation of defense response activation, the maintenance of cellular metal ion balance, and the production of chemokine ligand 5. These genes were notably associated with pathways such as ferroptosis, porphyrin metabolism, and base excision repair. Immune infiltration analysis showed that the key genes were highly correlated with immune cells. M0, M1, and M2 macrophages and resting dendritic cells differed significantly between groups (P < 0.05). The interaction counts of resveratrol, curcumin, and quercetin with KLK3 were 7, 3, and 2, respectively, indicating substantial interactions between these compounds and KLK3. Molecular docking and molecular dynamics simulations further confirmed the robust binding affinity of these bioactive compounds to their targets. Conclusion The pivotal genes CP, KLK3, POLG, and TRPV4 showed significant prognostic value and played a crucial role in the diagnostic assessment of OP. Resveratrol, curcumin, and quercetin, natural compounds found in TCM, showed promise in their potential to modulate the bone-forming gene KLK3. This study provides a scientific basis for interpreting the pathogenesis of OP and for developing clinical drugs.
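Spearman's rank correlation, which the abstract uses to relate gene expression to immune cell presence, is the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration of that definition, not the study's code; the function names and the toy expression and cell-fraction values are made up for the example.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: expression of one gene and an estimated immune cell
# fraction across five samples. The relationship is strictly decreasing,
# so rho should be very close to -1.
expression = [1.2, 2.5, 3.1, 4.8, 5.0]
cell_fraction = [0.30, 0.25, 0.20, 0.10, 0.05]
rho = spearman(expression, cell_fraction)
```

Because only the ranks enter the calculation, the result is unchanged by any monotonic rescaling of either variable, which suits estimated cell fractions that are not on a linear scale.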
Abstract: Objective: Various treatments have greatly reduced the mortality of hepatocellular carcinoma (HCC). However, few therapies are available for advanced HCC. Understanding the characteristics of HCC at the whole-transcriptome level may therefore help prevent its progression. Methods: This study aimed to identify differentially expressed genes and potential pathways between normal liver and HCC tissues. The gene expression profiles of GSE104627 were downloaded from the Gene Expression Omnibus database. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were performed, and a protein-protein interaction network of the differentially expressed genes was constructed with Cytoscape. Results: In total, 880 differentially expressed genes were identified between normal and tumor tissues, including 554 up-regulated genes and 326 down-regulated genes. Gene Ontology analysis showed that the up-regulated genes were significantly enriched in establishment of RNA localization, nucleic acid transport, RNA transport, RNA localization, and nucleobase, nucleoside, nucleotide and nucleic acid transport. Kyoto Encyclopedia of Genes and Genomes pathway analysis showed that the up-regulated genes were enriched in axon guidance, dorso-ventral axis formation, and pathways in cancer. The top 10 hub genes were identified from the protein-protein interaction network, and sub-networks revealed that these genes were involved in significant pathways, including the G protein-coupled receptor signaling pathway, signaling via MAPK, and extracellular matrix organization. Conclusion: The present study described the differentially expressed genes between normal and HCC tissues at the level of gene transcription. The possible signaling pathways involved in the development of HCC and the related molecules were analyzed. However, further laboratory and clinical validation is still needed.
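The abstract reports the top 10 hub genes from a network built in Cytoscape but does not state which centrality measure ranked them. As a rough illustration only, the sketch below ranks nodes by degree, one common choice; the function name `top_hubs` and the toy edge list are hypothetical.

```python
from collections import Counter


def top_hubs(edges, k):
    """Return the k highest-degree nodes of an undirected network.

    Self-loops are ignored and duplicate edges (in either orientation)
    count once. Ties are broken alphabetically so the result is stable.
    """
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique:
        for gene in pair:
            degree[gene] += 1
    return sorted(degree, key=lambda g: (-degree[g], g))[:k]


# Toy interaction list with placeholder node names; ("B", "A") repeats
# the first edge and should not be counted twice.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A")]
hubs = top_hubs(edges, 2)
```

Other centrality measures, such as betweenness, can rank the same network differently, so a hub list depends on the measure chosen.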
Abstract: Total RNA was extracted from Microtus fortis liver tissue before infection and at 10 d and 15 d after infection with Schistosoma japonicum cercariae. A Rattus norvegicus CD36 gene probe was used for hybridization analysis of differential CD36 expression in Microtus fortis liver tissue before and after infection with Schistosoma japonicum. In addition, the cDNA sequence and encoded amino acid sequence of the Rattus norvegicus CD36 gene and the structural domains of the CD36 protein were analyzed using bioinformatics. The results showed that CD36 expression in the liver tissue of Microtus fortis was significantly higher after infection than before. The Rattus norvegicus CD36 cDNA sequence has a total length of 1625 bp and encodes 472 amino acid residues, and the Rattus norvegicus CD36 protein contains a CD36 superfamily domain.
Funding: Supported by the National Basic Research Program of China (Grant Nos. 2010CB945401, 2007CB108800), the National Natural Science Foundation of China (Grant Nos. 30870575, 31071162, 31000590), and the Science and Technology Commission of Shanghai Municipality (Grant No. 11DZ2260300).
Abstract: RNA-Seq technology is becoming widely used in transcriptomics studies; however, analyzing and interpreting RNA-Seq data poses serious challenges. With the development of high-throughput sequencing technologies, sequencing cost is dropping dramatically while sequencing output is increasing sharply. However, sequencing reads are still short and contain various sequencing errors. Moreover, the transcriptome is always more complicated than we expect. These challenges create an urgent need for efficient bioinformatics algorithms that can handle the large amount of transcriptome sequencing data and support diverse related studies. This review summarizes a number of frequently used applications of transcriptome sequencing and their related analysis strategies, including short read mapping, exon-exon splice junction detection, gene or isoform expression quantification, differential expression analysis, and transcriptome reconstruction.
Abstract: To identify novel genes in castration-resistant prostate cancer (CRPC), we downloaded three microarray datasets containing CRPC and primary prostate cancer samples from the Gene Expression Omnibus (GEO). The R packages affy and limma were used to identify differentially expressed genes (DEGs) between primary prostate cancer and CRPC. We then performed functional enrichment analysis, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. In addition, protein-protein interaction (PPI) analysis was used to search for hub genes. Finally, to validate the significance of these genes, we performed survival analysis. As a result, we identified 53 upregulated genes and 58 downregulated genes that changed in at least two datasets. Functional enrichment analysis showed significant changes in the positive regulation of osteoblast differentiation pathway and the aldosterone-regulated sodium reabsorption pathway. The PPI network identified hub genes such as cortactin-binding protein 2 (CTTNBP2), Rho family guanosine triphosphatase (GTPase) 3 (RND3), protein tyrosine phosphatase receptor-type R (PTPRR), Jagged1 (JAG1), and lumican (LUM). Based on the PPI network analysis and functional enrichment analysis, we identified two genes (PTPRR and JAG1) as key genes. Further survival analysis indicated a relationship between high expression of the two genes and poor prognosis of prostate cancer. In conclusion, PTPRR and JAG1 are key genes in CRPC and may serve as promising biomarkers for the diagnosis and prognosis of CRPC.
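The criterion "changed in at least two datasets" can be sketched as a simple count over per-dataset DEG lists. The example below is illustrative only: the function name `recurrent_genes` and the placeholder gene identifiers are assumptions, and in practice each list would come from a separate differential expression analysis such as the limma step described above.

```python
from collections import Counter


def recurrent_genes(deg_sets, min_datasets=2):
    """Return genes that appear in at least `min_datasets` of the DEG lists."""
    # set(s) ensures a gene listed twice in one dataset is counted once.
    counts = Counter(gene for s in deg_sets for gene in set(s))
    return sorted(g for g, c in counts.items() if c >= min_datasets)


# Placeholder upregulated DEG lists from three hypothetical datasets.
up_by_dataset = [
    {"GENE_A", "GENE_B"},
    {"GENE_B", "GENE_C"},
    {"GENE_A", "GENE_B", "GENE_D"},
]
up_recurrent = recurrent_genes(up_by_dataset)
```

Running the function separately on the upregulated and downregulated lists keeps the direction of change consistent across datasets, which matches the way the abstract reports the two groups.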