This paper first applies the sequential cluster method to set up the classification standard of infectious disease incidence state based on the fact that there are many uncertainty characteristics in the incidence cou...This paper first applies the sequential cluster method to set up the classification standard of infectious disease incidence state based on the fact that there are many uncertainty characteristics in the incidence course.Then the paper presents a weighted Markov chain,a method which is used to predict the future incidence state.This method assumes the standardized self-coefficients as weights based on the special characteristics of infectious disease incidence being a dependent stochastic variable.It also analyzes the characteristics of infectious diseases incidence via the Markov chain Monte Carlo method to make the long-term benefit of decision optimal.Our method is successfully validated using existing incidents data of infectious diseases in Jiangsu Province.In summation,this paper proposes ways to improve the accuracy of the weighted Markov chain,specifically in the field of infection epidemiology.展开更多
With recent advances in biotechnology, genome-wide association study (GWAS) has been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, typical statistica...With recent advances in biotechnology, genome-wide association study (GWAS) has been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, typical statistical strategy is traditional logistical regression (LR) based on single-locus analysis. However, such a single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR), partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high dimensional genomic data. However, the perfor- mance of these methods is still not clear, especially in GWAS. We conducted simulations and real data application to compare the type I error and power of PC-LR, PLS-LR and LR applicable to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS can reasonably control type I error under null hypothesis. On contrast, LR, which is corrected by Bonferroni method, was more conserved in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and they both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and with a small effective size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.展开更多
基金supported in part by"National S&T Major Project Foundation of China"(2009ZX10004-904)Universities Natural Science Foundation of Jiangsu Province(09KJB330004),National Science Foundation Grant DMS-9971405National Institutes of Health Contract N01-HV-28183
文摘This paper first applies the sequential cluster method to set up the classification standard of infectious disease incidence state based on the fact that there are many uncertainty characteristics in the incidence course.Then the paper presents a weighted Markov chain,a method which is used to predict the future incidence state.This method assumes the standardized self-coefficients as weights based on the special characteristics of infectious disease incidence being a dependent stochastic variable.It also analyzes the characteristics of infectious diseases incidence via the Markov chain Monte Carlo method to make the long-term benefit of decision optimal.Our method is successfully validated using existing incidents data of infectious diseases in Jiangsu Province.In summation,this paper proposes ways to improve the accuracy of the weighted Markov chain,specifically in the field of infection epidemiology.
基金founded by the National Natural Science Foundation of China(81202283,81473070,81373102 and81202267)Key Grant of Natural Science Foundation of the Jiangsu Higher Education Institutions of China(10KJA330034 and11KJA330001)+1 种基金the Research Fund for the Doctoral Program of Higher Education of China(20113234110002)the Priority Academic Program for the Development of Jiangsu Higher Education Institutions(Public Health and Preventive Medicine)
文摘With recent advances in biotechnology, genome-wide association study (GWAS) has been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, typical statistical strategy is traditional logistical regression (LR) based on single-locus analysis. However, such a single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR), partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high dimensional genomic data. However, the perfor- mance of these methods is still not clear, especially in GWAS. We conducted simulations and real data application to compare the type I error and power of PC-LR, PLS-LR and LR applicable to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS can reasonably control type I error under null hypothesis. On contrast, LR, which is corrected by Bonferroni method, was more conserved in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and they both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and with a small effective size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.