Genetic Locus Analysis Based on Fusion Model
Abstract: To address the association between human hereditary diseases and traits and loci on the genome, this paper proposes a fusion model, within the framework of genome-wide association analysis, for analyzing the association between single nucleotide polymorphisms (SNPs) and disease. First, the 16-dimensional data are re-encoded to reduce their dimensionality. Next, an SNP association analysis model based on a two-stage ant colony algorithm is built, using the chi-squared statistic between a locus set and the class labels as the evaluation function. Then, the loci most similar to the pathogenic loci are selected to form a new locus set, and a binary logistic regression model is fitted to analyze the association between the hereditary disease and this new set. Finally, a random forest algorithm is used to verify the model's accuracy. Tests on data show that the fusion model reaches a recognition rate of 85.8%, a clear improvement over traditional methods, and that it can effectively support multi-level association analysis across hereditary diseases, genes, and loci.
Authors: ZHANG Jirong (张继荣); KOU Lei (寇磊) (School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2019, No. 9, pp. 2165-2169, 2175 (6 pages)
Keywords: genetic locus; two-stage ant colony algorithm; random forest; logistic regression analysis; chi-square test
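
The abstract names the chi-squared statistic between a locus set and the class labels as the evaluation function for the search. The following is a minimal sketch of how such a score could be computed, not the authors' implementation: the function names `chi_squared` and `rank_loci`, the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration, and the two-stage ant colony search itself is not shown.

```python
from collections import Counter


def chi_squared(genotypes, labels):
    """Pearson chi-squared statistic of a genotype-by-label contingency table.

    `genotypes` may hold single-locus codes (e.g. 0/1/2) or, for a locus
    set, tuples of codes; any hashable value works. (Hypothetical helper.)
    """
    if len(genotypes) != len(labels):
        raise ValueError("genotypes and labels must have the same length")
    n = len(labels)
    observed = Counter(zip(genotypes, labels))  # joint counts per cell
    row_totals = Counter(genotypes)             # counts per genotype
    col_totals = Counter(labels)                # counts per class label
    stat = 0.0
    for g, row in row_totals.items():
        for c, col in col_totals.items():
            expected = row * col / n            # count expected under independence
            stat += (observed[(g, c)] - expected) ** 2 / expected
    return stat


def rank_loci(snp_matrix, labels):
    """Return locus indices sorted by decreasing chi-squared score.

    `snp_matrix` is a list of samples, each a list of genotype codes.
    """
    n_loci = len(snp_matrix[0])
    scores = [
        chi_squared([sample[j] for sample in snp_matrix], labels)
        for j in range(n_loci)
    ]
    return sorted(range(n_loci), key=lambda j: scores[j], reverse=True)


# Toy data (assumed): 8 samples, 2 loci; labels 0 = control, 1 = case.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
snp_matrix = [
    [0, 0], [0, 2], [0, 0], [0, 2],
    [2, 0], [2, 2], [2, 0], [2, 2],
]
print(rank_loci(snp_matrix, labels))
```

In this toy data the first locus separates cases from controls perfectly and the second is independent of the label, so the first locus ranks higher. To score a locus set rather than a single locus, each sample's genotypes at the chosen loci can be passed to `chi_squared` as a tuple.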