Abstract
Objective To screen potential marker genes for high recurrence risk in early-stage non-small cell lung cancer (NSCLC) using bioinformatics methods. Methods The NSCLC gene expression microarray dataset GSE19804 was selected from the Gene Expression Omnibus (GEO) database, and the online tool GEO2R was used to identify differentially expressed genes (DEGs) in early-stage NSCLC as well as DEGs between early and late stages. A second microarray dataset, GSE30219, was selected to identify DEGs associated with high recurrence risk. Gene Ontology (GO) and signaling pathway enrichment analyses were performed on the DEGs related to recurrence of early-stage NSCLC using the DAVID online database. A protein-protein interaction (PPI) network was constructed with the STRING online database; Cytoscape software was then used to identify high-scoring hub genes and to perform module analysis of the PPI network. Survival analysis with the Kaplan-Meier plotter online database was used to verify the reliability of the hub genes. Results A total of 441 DEGs were obtained for early-stage NSCLC, 26 genes were differentially expressed between early and late stages, and 126 DEGs were related to recurrence; ultimately, 10 DEGs related to high recurrence risk in early-stage NSCLC were identified. Enrichment analysis showed that the DEGs were enriched in cell division, cell cycle, cell proliferation and the p53 signaling pathway. Six hub genes, TOP2A, RRM2, CCNB1, DLGAP5, ANLN and CDCA7, were identified from the PPI network. Survival analysis showed that high expression of the hub genes was significantly associated with poorer overall survival. Conclusion TOP2A, RRM2, CCNB1, DLGAP5, ANLN and CDCA7 may serve as potential biomarkers of high recurrence risk in early-stage NSCLC.
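The screening logic summarized in the abstract can be illustrated with a minimal Python sketch. It rests on two assumptions that the abstract does not state: that the candidate genes are those shared by the three DEG lists, and that hub genes are ranked by node degree in the PPI network (the scoring method actually used in Cytoscape is not specified). The function names, gene identifiers (`G1` to `G7`) and edges are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def intersect_gene_sets(*gene_sets):
    """Return the genes present in every input set, sorted by name."""
    if not gene_sets:
        return []
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)
    return sorted(common)


def rank_by_degree(edges, top_n):
    """Rank nodes of an undirected network by number of distinct neighbors.

    Degree is used here only as a stand-in for a hub score (assumption).
    Ties are broken alphabetically so the result is deterministic.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:  # ignore self-loops
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:top_n]]


# Placeholder DEG lists (hypothetical identifiers, not study results).
tumor_vs_normal = {"G1", "G2", "G3", "G4", "G5"}
early_vs_late = {"G2", "G3", "G4", "G6"}
recurrence_related = {"G2", "G3", "G4", "G7"}

candidates = intersect_gene_sets(tumor_vs_normal, early_vs_late, recurrence_related)

# Placeholder interaction pairs, e.g. as exported from a PPI database.
edges = [("G2", "G3"), ("G2", "G4"), ("G3", "G4"), ("G2", "G5")]
hubs = rank_by_degree(edges, top_n=2)

print(candidates)
print(hubs)
```

In practice the gene lists would come from the differential expression tables and the edges from the exported interaction network; the sketch only shows how the set intersection and the ranking step fit together.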
Source
Chinese Journal of Biologicals (《中国生物制品学杂志》)
CAS
CSCD
2018, Issue 1, pp. 35-40 (6 pages)
Keywords
Non-small cell lung cancer (NSCLC)
Bioinformatics
Gene expression microarray
Recurrence risk