Abstract
Objective To explore how association rules mining can be used to screen gene SNP loci associated with a disease phenotype. Methods Association rules mining was applied to SNP loci of nucleotide excision repair (NER) pathway genes from a Chinese lung cancer case-control study, to screen for loci and combinations of loci associated with lung cancer. These were then used as candidate explanatory variables in a logistic regression model for confirmatory analysis. We discuss how to choose a non-redundant rule set and how to use the loci appearing in these rules to build the logistic model. Results Gene variant loci possibly associated with lung cancer susceptibility were found, and a possible interaction between ERCC1-rs3212951 and ERCC1-rs3212955 was identified. Conclusion Association rules mining can serve as an initial screening tool for disease-related SNP loci and their interactions in high-throughput SNP datasets, and can effectively identify polymorphic loci and interaction terms related to complex-trait diseases such as lung cancer.
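The screening step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it enumerates candidate antecedents by brute force rather than with an Apriori-style algorithm, the function name `mine_rules` and the thresholds are hypothetical, and the genotype records are made-up toy data, not data from the study. Each rule has the form {genotype items} → case and is scored by support, confidence, and lift.

```python
from itertools import combinations

def mine_rules(records, target, min_support=0.2, min_confidence=0.7, max_len=2):
    """Brute-force search for rules {items} -> target (illustrative only).

    records: list of sets of items, e.g. {"snp1=AG", "snp2=CT", "case"}.
    Returns (antecedent, support, confidence, lift) tuples, highest lift first.
    """
    n = len(records)
    items = sorted({i for r in records for i in r if i != target})
    p_target = sum(target in r for r in records) / n
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            a = set(antecedent)
            n_a = sum(a <= r for r in records)          # records matching the antecedent
            if n_a == 0:
                continue
            n_both = sum(a <= r and target in r for r in records)
            support = n_both / n                        # P(antecedent and target)
            confidence = n_both / n_a                   # P(target | antecedent)
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, support, confidence, confidence / p_target))
    return sorted(rules, key=lambda r: -r[3])

# Made-up toy data: hypothetical SNP names and genotypes; "case" marks a case.
records = [
    {"snp1=AG", "snp2=CT", "case"},
    {"snp1=AG", "snp2=CT", "case"},
    {"snp1=AG", "snp2=CT", "case"},
    {"snp1=AG", "snp2=CC"},
    {"snp1=AA", "snp2=CT"},
    {"snp1=AA", "snp2=CC"},
    {"snp1=AA", "snp2=CC", "case"},
    {"snp1=AG", "snp2=CC"},
]

rules = mine_rules(records, target="case")
for antecedent, support, confidence, lift in rules:
    print(antecedent, round(support, 3), round(confidence, 3), round(lift, 3))
```

In this toy example the two-locus rule has a higher lift than the single-locus rule, which is the kind of pattern that might suggest adding an interaction term for those loci to a logistic regression model for confirmation.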
Source
《中国卫生统计》
CSCD
Peking University Core Journal
2009, No. 3, pp. 226-228, 233 (4 pages)
Chinese Journal of Health Statistics
Funding
Supported by the 973 Program (grant No. 2002CB512902)
Keywords
Association rules
Gene variation
SNP locus
Lung cancer