Abstract
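Objective To explore the significance of abnormally expressed genes in histologically normal breast epithelial tissue for the early diagnosis of breast cancer. Methods Microarray data from histologically normal epithelial cells of breast cancer patients and from epithelial cells of healthy individuals were analyzed bioinformatically to identify abnormally expressed genes. The differentially expressed genes were used to build an early diagnosis model for breast cancer, and signaling pathway enrichment was used to filter the differentially expressed genes. Accuracy (Ac), sensitivity (Sn) and specificity (Sp) were used to measure the predictive performance of the different methods. Results The differentially expressed genes were mainly enriched in transcription and the MAPK signaling pathway. Models built with genes enriched in KEGG signaling pathways as features predicted better than models built with genes enriched in BioCarta signaling pathways. When the genes enriched in KEGG and BioCarta were combined as features, the predictive performance matched that of the model built with all differentially expressed genes as features (Ac: 96.3%; Sn: 92.3%; Sp: 100%), while the number of features fell from 22 to 7, from 14 to 3, and from 18 to 4, respectively. The genes enriched in KEGG and BioCarta included JUN, DUSP1, BTG2, FOSB, JUND, E1F1 and FOS. Conclusion Filtering differentially expressed genes by pathway enrichment simplifies the prediction model while preserving its predictive performance; the expression levels of the genes enriched in KEGG and BioCarta may serve as criteria for the early diagnosis of breast cancer.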
Objective To investigate the significance of abnormal gene expression in histologically normal mammary epithelial tissue for the early diagnosis of breast cancer. Methods Microarray technology and bioinformatic analysis were used to identify abnormally expressed genes in histologically normal mammary epithelial tissue from breast cancer patients and from healthy controls, and a model for the early diagnosis of breast cancer was established. The differentially expressed genes were screened by signaling pathway enrichment analysis. Accuracy (Ac), sensitivity (Sn) and specificity (Sp) were used to measure the prediction accuracy of the different methods. Results The best prediction model was derived from the combination of differentially expressed genes enriched in the KEGG and BioCarta databases. The number of differentially expressed genes in three randomly created prediction models was reduced from 22 to 7, from 14 to 3, and from 18 to 4. Nevertheless, the prediction accuracy was consistent with that of the model established from all of the differentially expressed genes, and the average accuracy of all models was 96.3%. Conclusion The prediction model can be simplified with the prediction accuracy unchanged, which facilitates applying the model to the early diagnosis and prevention of breast cancer.
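The three measures named in the Methods have standard definitions in terms of true/false positives and negatives, and a minimal sketch of the arithmetic may help. The abstract does not report sample sizes or a confusion matrix, so the counts below (12 true positives, 1 false negative, 14 true negatives, 0 false positives) are hypothetical: they are simply one set of counts that is consistent with the reported percentages. The function name `diagnostic_metrics` is likewise an assumption made for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as fractions.

    tp/fn: cancer samples classified correctly/incorrectly.
    tn/fp: healthy samples classified correctly/incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    ac = (tp + tn) / (positives + negatives)  # Ac: fraction of all samples classified correctly
    sn = tp / positives                       # Sn: fraction of cancer samples detected
    sp = tn / negatives                       # Sp: fraction of healthy samples cleared
    return ac, sn, sp


# Hypothetical counts, NOT taken from the paper.
ac, sn, sp = diagnostic_metrics(tp=12, fn=1, tn=14, fp=0)
print(f"Ac={ac:.1%} Sn={sn:.1%} Sp={sp:.1%}")
# prints: Ac=96.3% Sn=92.3% Sp=100.0%
```

With these illustrative counts, a single missed cancer sample lowers Sn and Ac while Sp stays at 100%, which is the pattern in the reported figures.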
Source
Tianjin Medical Journal (《天津医药》)
CAS
Peking University Core Journal (北大核心)
2014, Issue 5, pp. 414-416 (3 pages)
Funding
National Natural Science Foundation of China, Young Scientists Fund (Grant No. 81001020)
Keywords
breast neoplasms
gene expression
microchip analytical procedures
gene expression profiling
forecasting
early diagnosis
pathway