Abstract
Rapid advances in human genome sequencing technology have substantially reduced sequencing costs, and the resulting genome data now support a wide range of applications. In genome-wide association studies between single nucleotide polymorphisms (SNPs) and diseases, SNPs are linked to sensitive information such as a patient's identity, phenotype, and kinship, and linkage disequilibrium among SNPs makes it easier for this private information to leak. To address this, a matrix differential privacy framework is proposed based on the correlation coefficient of SNP linkage disequilibrium. The framework preserves the privacy of both the genome data and the SNP linkage disequilibrium while ensuring that the genome data retain a certain level of utility. It achieves a trade-off between genome data privacy and utility in genome-wide association studies under SNP linkage disequilibrium, and it helps advance genomic privacy preservation in that setting.
Authors
LIU Hai (刘海)
WU Zhen-Qiang (吴振强)
PENG Chang-Gen (彭长根)
LEI Xiu-Juan (雷秀娟)
Affiliations: School of Computer Science, Shaanxi Normal University, Xi’an 710119, China; Guizhou Provincial Key Laboratory of Public Big Data (Guizhou University), Guiyang 550025, China
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core Journal
2019, Issue 4, pp. 1094-1105 (12 pages)
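Funding
National Natural Science Foundation of China (61173190, 61602290, 61672334, 61662009)
Fundamental Research Funds for the Central Universities (2016CBY004, GK201704016, GK201501008)
Shaanxi Province Key Science and Technology Innovation Team (2014KTC-18)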
Keywords
single nucleotide polymorphism linkage disequilibrium
differential privacy
genomic privacy
genome data utility