Abstract
The risk of developing a complex disease is likely to be determined by single nucleotide polymorphisms (SNPs), the most common form of DNA variation. Rapidly developing genotyping technologies have made it possible to assess the influence of SNPs on a particular disease. This paper aims to identify the risk/protective factors of a disease, which are modeled as the subset of SNPs (with specified alleles) that has the maximum odds ratio. Building on this definition and on the relationship between nucleotides and amino acids, we propose two novel kinds of factor, called k-relaxed risk/protective factors and weighted-relaxed risk/protective factors, to account for more complex disease-associated SNPs. However, the enormous number of possible SNP interactions presents a mathematical and computational challenge. We use the Bayesian Optimization Algorithm (BOA) to search for the risk/protective factors of a particular disease. Because determining the Bayesian network (BN) structure is NP-hard, binary particle swarm optimization is used to determine the BN structure. The proposed algorithm was tested on four datasets. The experimental results suggest that it is a promising method for discovering SNP interactions that cause or prevent diseases.
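To make the objective concrete, the following is a minimal sketch of how the odds ratio of one candidate factor (a set of SNPs with specified alleles) could be computed from case/control data. It is an illustration only, not the paper's implementation: the function name `odds_ratio`, the integer allele encoding, the toy data, and the 0.5 continuity correction for empty cells (the Haldane-Anscombe correction) are all assumptions that do not come from the abstract.

```python
from typing import Dict, Sequence


def odds_ratio(
    genotypes: Sequence[Sequence[int]],
    is_case: Sequence[bool],
    factor: Dict[int, int],
) -> float:
    """Odds ratio of carrying `factor` among cases versus controls.

    `factor` maps a SNP column index to the allele value an individual
    must have at that SNP (hypothetical encoding, assumed for this sketch).
    """
    a = b = c = d = 0.0
    for row, case in zip(genotypes, is_case):
        carries = all(row[snp] == allele for snp, allele in factor.items())
        if carries and case:
            a += 1  # cases carrying the factor
        elif carries:
            b += 1  # controls carrying the factor
        elif case:
            c += 1  # cases not carrying the factor
        else:
            d += 1  # controls not carrying the factor
    # Assumption: add 0.5 to every cell when any cell is empty, so the
    # ratio stays finite (Haldane-Anscombe correction).
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


# Toy data: 8 individuals, 3 SNPs; the first four individuals are cases.
toy_genotypes = [
    [1, 1, 0], [1, 1, 1], [1, 1, 0], [0, 1, 0],
    [1, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 1],
]
toy_is_case = [True, True, True, True, False, False, False, False]

# Candidate factor: allele 1 at SNP 0 together with allele 1 at SNP 1.
print(odds_ratio(toy_genotypes, toy_is_case, {0: 1, 1: 1}))  # prints 9.0
```

A search procedure such as the BOA mentioned above would evaluate a score of this kind for many candidate subsets; with m SNPs and several alleles each, the number of candidates grows exponentially, which is why exhaustive enumeration is impractical.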
Funding
Supported by the National Natural Science Foundation of China (60774086 and 61173111)
Ph.D. Program Foundation of the Ministry of Education of China (20090201110027)