Abstract
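Objective To use bioinformatics techniques to analyze and process relevant high-throughput data, to find or establish the genes, protein regulatory networks and pathway functional annotation information associated with the development of liver cancer caused by hepatitis C, and to build a liver cancer protein network, thereby elucidating the dynamic mechanism by which hepatitis C-related liver cancer develops. Methods Data were collected and organized by text mining. The fold change (FC) method was applied to 3 groups of liver cancer-related gene expression profile chips from the GEO database to compute differentially expressed genes and obtain liver cancer-related data. Two groups of gene expression profile chips were obtained from the GEO database, the FC method was used to obtain the set of abnormally expressed genes at the liver cancer stage, and this gene set was then projected onto protein interaction relations to obtain the corresponding protein network. Results From the GEO database, 19 normal-group chips, 29 hepatitis C cirrhosis chips and 16 chips of hepatitis C cirrhosis with liver cancer were collected. The FC method yielded 1404 differential genes at the cirrhosis stage and 1129 at the liver cancer stage. GO and pathway functional annotation enrichment analyses were performed on the gene sets to explain their underlying meaning and to verify the collected disease-related genes. The collected data were used to build a dynamic protein interaction network of the occurrence and development of liver cancer. Conclusion Text mining and bioinformatics techniques were used to collect liver cancer-related gene and protein data and to establish a dynamic protein network of liver cancer development.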
Objective To elucidate the mechanism underlying the dynamic development of hepatitis C virus (HCV)-related hepatocellular carcinoma (HCC) by using bioinformatics to analyze and process high-throughput data, to find or establish the relevant genes, protein regulatory networks and pathway functional annotation information, and to build an HCC protein network. Methods Data were collected and sorted by text mining. The fold change (FC) method was applied to three groups of liver cancer-related gene expression profile chips from the GEO database to compute differentially expressed genes and obtain liver cancer-related data. Two groups of gene expression profile chips were obtained from the GEO database, the FC method was used to derive the set of abnormally expressed genes at the liver cancer stage, and this gene set was then projected onto protein interaction relations to obtain the corresponding protein network. Results From the GEO database, 19 normal-group chips, 29 HCV cirrhosis chips and 16 chips of HCV cirrhosis with liver cancer were collected. The FC method yielded 1404 differentially expressed genes at the cirrhosis stage and 1129 at the liver cancer stage. GO and pathway functional annotation enrichment analyses were performed on the gene sets to show their underlying meaning and to verify the collected disease-related genes. A dynamic protein interaction network of liver cancer development was built from the collected data. Conclusion The study used text mining and bioinformatics techniques to collect liver cancer-related gene and protein data and established a dynamic protein network of liver cancer development. The network can provide clues and support for follow-up studies of liver cancer, and in particular can be used to search for diagnostic and therapeutic targets.
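The two computational steps named in the Methods (fold change filtering, then projection of the gene set onto protein interaction relations) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not state the FC cutoff, the normalization, or the interaction database, so the 2-fold threshold, the gene names, the sample identifiers and the edge list below are all hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def differential_genes(expr, control_ids, case_ids, threshold=2.0):
    """Return {gene: log2 fold change} for genes passing the FC cutoff.

    expr maps gene -> {sample_id: expression value}. The 2-fold default
    threshold is an assumption; the abstract does not give the cutoff.
    """
    cutoff = math.log2(threshold)
    selected = {}
    for gene, samples in expr.items():
        control = mean([samples[s] for s in control_ids])
        case = mean([samples[s] for s in case_ids])
        if control <= 0 or case <= 0:
            continue  # a ratio is undefined for non-positive means
        log_fc = math.log2(case / control)
        if abs(log_fc) >= cutoff:
            selected[gene] = log_fc
    return selected


def project_onto_ppi(genes, ppi_edges):
    """Keep only interactions whose two endpoints are both in the gene set."""
    gene_set = set(genes)
    return [(a, b) for a, b in ppi_edges if a in gene_set and b in gene_set]


# Hypothetical toy data: two normal samples (n1, n2), two disease samples (c1, c2).
expr = {
    "GENE_A": {"n1": 10.0, "n2": 12.0, "c1": 40.0, "c2": 48.0},
    "GENE_B": {"n1": 20.0, "n2": 20.0, "c1": 22.0, "c2": 18.0},
    "GENE_C": {"n1": 32.0, "n2": 32.0, "c1": 8.0, "c2": 8.0},
}
# Hypothetical interaction list standing in for a protein interaction database.
ppi_edges = [("GENE_A", "GENE_C"), ("GENE_A", "GENE_B")]

deg = differential_genes(expr, ["n1", "n2"], ["c1", "c2"])
network = project_onto_ppi(deg, ppi_edges)
```

Running the filter once per stage (cirrhosis versus normal, cirrhosis with liver cancer versus normal) and projecting each resulting gene set separately would give one network per stage, which is one plausible reading of the "dynamic" network described above.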
Source
《中华实验外科杂志》
CAS
CSCD
PKU Core
2015, Issue 10, pp. 2333-2337 (5 pages)
Chinese Journal of Experimental Surgery
Funding
National Natural Science Foundation of China (81172326)
Shanghai Charity Cancer Research Fund
Keywords
Bioinformatics
Database
Genomics
Network
Proteomics
Hepatocellular carcinoma