Abstract
Objective: To compare next-generation sequencing (NGS) with mainstream microarrays in their detection rates for known disease-related SNPs, especially cancer-related SNPs. Methods: The SNP data sets of the GWAS Catalog and of mainstream microarrays (23andMe, Human Omni5, Genome Wide SNP6, and Human Exome12) were downloaded, together with the corresponding data sets after genotype imputation. The rsid lists were extracted on Linux with commands and scripts, converted to a uniform format, and cleaned of erroneous and redundant entries. The rsids of SNPs related to the onset, progression, and drug resistance of cancer were then extracted from each microarray and from the GWAS Catalog. Finally, the rsid set detected by each microarray was compared with the GWAS Catalog rsid set, and the detection rate was calculated. Results: For every microarray, the detection rate was low, both for all disease-related SNPs and for cancer-related SNPs alone. Detection rates rose markedly after genotype imputation. Conclusion: Although genotype imputation on microarray data can raise the detection rate of known SNPs, next-generation sequencing is the better means for higher-quality personal disease analysis, health risk management, and medical services. Microarrays are better suited to custom designs that target a subset of SNPs strongly associated with particular diseases, for targeted disease risk prediction.
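The comparison described in the Methods can be sketched as a set operation. The abstract states only that Linux commands and scripts were used, so the following Python version is an illustrative assumption, not the authors' pipeline. The function names `clean_rsids` and `detection_rate`, the rule that a valid rsid is "rs" followed by digits, and the toy identifiers are all hypothetical. The detection rate is taken here as the fraction of catalog rsids that also appear on the microarray.

```python
import re

# Assumed format rule: a well-formed rsid is "rs" followed by digits.
RSID_PATTERN = re.compile(r"^rs\d+$")


def clean_rsids(raw_ids):
    """Unify the format and drop malformed or duplicate identifiers."""
    cleaned = set()  # a set removes redundant entries automatically
    for raw in raw_ids:
        candidate = raw.strip().lower()
        if RSID_PATTERN.match(candidate):
            cleaned.add(candidate)
    return cleaned


def detection_rate(chip_rsids, catalog_rsids):
    """Fraction of catalog rsids that are also present on the chip."""
    if not catalog_rsids:
        raise ValueError("catalog rsid set is empty")
    return len(chip_rsids & catalog_rsids) / len(catalog_rsids)


# Toy identifiers, made up for illustration only (not real study data).
catalog_raw = ["rs1", "rs2 ", "RS3", "rs4", "rs4", "chr1:12345"]
chip_raw = ["rs1", "rs3", "rs9"]
imputed_raw = chip_raw + ["rs2"]  # hypothetical SNP gained by imputation

catalog = clean_rsids(catalog_raw)
chip = clean_rsids(chip_raw)
imputed = clean_rsids(imputed_raw)

print(detection_rate(chip, catalog))     # 0.5
print(detection_rate(imputed, catalog))  # 0.75
```

The same two functions would apply to the cancer-only comparison by first restricting both rsid sets to the SNPs annotated as cancer-related.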
Source
Oncology Progress (《癌症进展》)
2017, No. 10, pp. 1139-1141, 1149 (4 pages in total)
Funding
National Science and Technology Support Program of China, project topic (2015BAH09F03)
Keywords
next-generation sequencing
microarray
molecular diagnosis
bioinformatics
cancer