Abstract
Efficiently extracting the gene loci closely associated with hereditary diseases from massive genetic data is the core problem of genome-wide association studies (GWAS). However, the univariate analysis methods commonly used to select disease-causing loci cannot detect pathogenic mechanisms that arise from complex interactions among multiple genes. This paper therefore takes a multivariate view and uses a sparse optimization model to select disease-causing genes. To go further and select disease-causing loci, it adopts a group sparse least angle regression algorithm based on the L12 norm, adjusting the regularization coefficient to control the between-group sparsity of the model, and thereby selects both disease-causing genes and disease-causing loci effectively. Finally, the validity of the method is verified on real genetic data for a hereditary disease.
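To make the role of the regularization coefficient concrete, here is a minimal sketch of group-sparse regression in which each group stands for one gene and each column for one locus. It is not the authors' algorithm: the abstract names a group sparse least angle regression method, whereas this sketch substitutes a plain proximal-gradient solver for a group-lasso objective, and it assumes that the "L12 norm" means a mixed penalty that sums the Euclidean norms of the groups. All function names, the toy data, and the values of `lam` are hypothetical.

```python
import numpy as np

def group_soft_threshold(w, groups, t):
    """Block soft-thresholding: shrink each group's norm by t, zeroing small groups."""
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        if norm > t:
            out[g] = (1.0 - t / norm) * w[g]
    return out

def group_lasso(X, y, groups, lam, n_iter=500):
    """Minimise 0.5/n * ||y - Xw||^2 + lam * sum_g ||w_g||_2 by proximal gradient."""
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the gradient of the smooth term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w

def selected_groups(w, groups, tol=1e-8):
    """Indices of groups (genes) whose coefficient block is non-zero."""
    return [i for i, g in enumerate(groups) if np.linalg.norm(w[g]) > tol]

def make_toy_data(seed=0, n=200):
    """Synthetic data: 4 hypothetical genes with 3 loci each; only gene 1 has an effect."""
    rng = np.random.default_rng(seed)
    groups = [np.arange(3 * i, 3 * i + 3) for i in range(4)]
    X = rng.standard_normal((n, 12))
    w_true = np.zeros(12)
    w_true[groups[1]] = [1.5, -2.0, 1.0]
    y = X @ w_true + 0.1 * rng.standard_normal(n)
    return X, y, groups

if __name__ == "__main__":
    X, y, groups = make_toy_data()
    for lam in (0.0, 0.5, 1e3):
        w = group_lasso(X, y, groups, lam)
        print(lam, selected_groups(w, groups))
```

Raising `lam` removes whole groups from the model at once, which is the sense in which the regularization coefficient controls between-group sparsity. Selecting individual loci inside a retained gene would need an additional within-group penalty, which this sketch does not include.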
Authors
张跻
杨勃
ZHANG Ji; YANG Bo (College of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang 414006, China)
Source
《湖南理工学院学报(自然科学版)》
CAS
2018, No. 1, pp. 12-15, 63 (5 pages in total)
Journal of Hunan Institute of Science and Technology (Natural Sciences)