Funding: The work of Jiayan Zhu is partially supported by seeding project funding (2019ZZX026), scientific research project funding for talent recruitment, and start-up funding for scientific research from Hubei University of Chinese Medicine. The work of Zhengbang Li is partially supported by self-determined research funds of Central China Normal University from the colleges' basic research of MOE (CCNU18QN031).
Abstract: This article proposes a maximum test over a sequence of quadratic-form score test statistics in the logistic regression model, which can be applied in genetics and medicine. Theoretical properties of the maximum test are derived. Extensive simulation studies are conducted to examine the power robustness of the maximum test compared with two other existing tests. We also apply the maximum test to a real dataset for association analysis of multiple gene variables.
Abstract: The traditional method for creating a gene score to predict a given outcome is to use the most statistically significant single nucleotide polymorphisms (SNPs) among all SNPs that were tested. This approach has several disadvantages, such as excluding SNPs that do not have strong single effects when tested on their own but do have strong joint effects when tested together with other SNPs. The results from the traditional gene score may also lack biological insight, since the functional unit of interest is often the gene, not the single SNP. In this paper we present a new gene scoring method that overcomes these problems by generating a gene score for each gene and a total gene score for all the genes available. First, we calculate a gene score for each gene; second, we test the association between this gene score and the outcome of interest (i.e., the trait). Only the gene scores that are significantly associated with the outcome after multiple testing correction for the number of gene tests (not SNPs) are included in the total gene score calculation. This method controls false positive results caused by multiple tests within genes and between genes separately, and has the advantage of identifying multi-locus genetic effects, compared with the Bonferroni correction, false discovery rate (FDR), and permutation tests for all SNPs. Another main feature of this method is that we select SNPs that have different effects within a gene by using adjustment in multiple regressions, and then combine the information from the selected SNPs within a gene to create a gene score. A simulation study has been conducted to evaluate the finite sample performance of the proposed method.
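The two-step procedure described in this abstract can be illustrated with a minimal sketch. It assumes that the within-gene SNP weights (for example, coefficients from a multiple regression) and the gene-level association p-values have already been computed, and it uses a Bonferroni threshold over the number of genes as one possible multiple testing correction. The function names and toy inputs are hypothetical and are not taken from the paper.

```python
def gene_score(genotypes, weights):
    """Weighted allele count for one individual over the SNPs selected in one gene.

    genotypes: allele counts (0, 1, or 2) for the selected SNPs.
    weights:   per-SNP weights, assumed here to come from a multiple regression.
    """
    return sum(w * g for g, w in zip(genotypes, weights))


def total_gene_score(gene_scores, gene_pvalues, alpha=0.05):
    """Sum the scores of genes that pass a gene-level Bonferroni threshold.

    gene_scores:  dict gene -> gene score for one individual.
    gene_pvalues: dict gene -> p-value of the gene score vs. outcome association.
    The threshold divides alpha by the number of genes tested, not by the
    number of SNPs (Bonferroni is an assumption of this sketch).
    """
    threshold = alpha / len(gene_pvalues)
    kept = [gene for gene, p in gene_pvalues.items() if p < threshold]
    return sum(gene_scores[gene] for gene in kept), kept


# Toy usage with made-up numbers.
scores = {"GENE_A": 1.5, "GENE_B": -0.5, "GENE_C": 2.0}
pvalues = {"GENE_A": 0.001, "GENE_B": 0.04, "GENE_C": 0.01}
total, retained = total_gene_score(scores, pvalues)
```

Correcting at the gene level keeps the number of tests equal to the number of genes, which is the property the abstract emphasizes.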
Abstract: To improve the fitting accuracy of college students’ test scores, this paper proposes a two-component mixed generalized normal distribution, uses maximum likelihood estimation and the Expectation Conditional Maximization (ECM) algorithm to estimate the parameters and conduct numerical simulation, and performs a fitting analysis on the test scores of Linear Algebra and Advanced Mathematics at F University. The empirical results show that the two-component mixed generalized normal distribution fits college students’ test data better than the commonly used two-component mixed normal distribution, and has good application value.
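A minimal sketch of the quantities such a mixture fit works with is given below. It assumes the common parameterization of the generalized normal density, f(x) = β / (2αΓ(1/β)) · exp(−(|x − μ|/α)^β), which may differ from the one used in the paper, and it shows only the mixture log-likelihood and the E-step responsibilities, not the full ECM updates. All function names and numbers are hypothetical.

```python
import math


def gennorm_pdf(x, mu, alpha, beta):
    """Generalized normal density with location mu, scale alpha, shape beta."""
    norm_const = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm_const * math.exp(-((abs(x - mu) / alpha) ** beta))


def mixture_loglik(data, weight, comp1, comp2):
    """Log-likelihood of a two-component mixture.

    weight is the mixing proportion of comp1; each component is a
    (mu, alpha, beta) tuple.
    """
    return sum(
        math.log(weight * gennorm_pdf(x, *comp1)
                 + (1.0 - weight) * gennorm_pdf(x, *comp2))
        for x in data
    )


def responsibilities(data, weight, comp1, comp2):
    """E-step: posterior probability that each observation came from comp1."""
    result = []
    for x in data:
        a = weight * gennorm_pdf(x, *comp1)
        b = (1.0 - weight) * gennorm_pdf(x, *comp2)
        result.append(a / (a + b))
    return result


# Toy usage: two made-up test scores and two made-up components.
resp = responsibilities([55.0, 85.0], 0.4, (60.0, 10.0, 1.5), (85.0, 8.0, 2.5))
```

With β = 2 and α = √2·σ the density reduces to the normal density, so the two-component normal mixture is a special case of this model.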
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 11271193 and 11571073 and the Natural Science Foundation of Jiangsu Province under Grant No. BK20141326.
Abstract: Count data with excess zeros, encountered in many applications, often exhibit extra variation. Therefore, the zero-inflated Poisson (ZIP) model may fail to fit such data. In this paper, a zero-inflated double Poisson (ZIDP) model, which is a generalization of the ZIP model, is studied, and score tests for the significance of dispersion and zero-inflation in the ZIDP model are developed. This work also develops homogeneity tests for the dispersion and/or zero-inflation parameters, and the corresponding score test statistics are obtained. A numerical example is given to illustrate the methodology, and the properties of the score test statistics are investigated through Monte Carlo simulations.
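As a simpler illustration of what a score test for zero-inflation looks like, the sketch below implements the classical score statistic for the plain Poisson model without covariates (the test usually attributed to van den Broek, 1995). It is not the ZIDP statistic developed in the paper. The function name and the toy data are hypothetical.

```python
import math


def zero_inflation_score_test(counts):
    """Score test of H0: no zero-inflation in an i.i.d. Poisson sample.

    Returns (statistic, p_value); the statistic is compared with a
    chi-square distribution with 1 degree of freedom.
    """
    n = len(counts)
    mean = sum(counts) / n                  # Poisson MLE under H0
    if mean == 0:
        raise ValueError("all counts are zero; the statistic is undefined")
    p0 = math.exp(-mean)                    # P(Y = 0) under H0
    n0 = sum(1 for y in counts if y == 0)   # observed number of zeros
    statistic = (n0 - n * p0) ** 2 / (n * p0 * (1.0 - p0) - n * mean * p0 ** 2)
    # Survival function of chi-square(1): P(Z^2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Toy usage: six zeros among ten counts with sample mean 1.
stat, pval = zero_inflation_score_test([0, 0, 0, 0, 0, 0, 1, 2, 3, 4])
```

A score test needs only the fit under the null model, which is why this version uses nothing beyond the sample mean and the number of zeros.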
Abstract: Cardiovascular disease (CVD) is the leading cause of morbidity and mortality among patients with diabetes mellitus, who have a risk of cardiovascular mortality two to four times that of people without diabetes. An individualised approach to cardiovascular risk estimation and management is needed. Over the past decades, many risk scores have been developed to predict CVD. However, few have been externally validated in a diabetic population, and limited studies have examined the impact of applying a prediction model in clinical practice. Currently, guidelines are focused on testing for CVD in symptomatic patients. Atypical symptoms or silent ischemia are more common in the diabetic population, and with additional markers of vascular disease such as erectile dysfunction and autonomic neuropathy, these guidelines can be difficult to interpret. We propose an algorithm incorporating cardiovascular risk scores in combination with typical and atypical signs and symptoms to alert clinicians to consider further investigation with provocative testing. The modalities for investigation of CVD are discussed.
Abstract: Objective: To improve the accuracy of detecting fetal chromosomal aneuploidy by non-invasive prenatal testing (NIPT) using next-generation sequencing data of pregnant women’s cell-free DNA. Methods: We propose the multi-Z method, which uses 21 z-scores for each autosomal chromosome to detect aneuploidy of that chromosome, whereas the conventional NIPT method uses only one z-score. To do this, the mapped read numbers of a given chromosome were normalized by those of the other 21 chromosomes. The average and standard deviation (SD) used for calculating the z-scores of each sample were obtained from the normalized values between all autosomal chromosomes of control samples. In this way, 21 z-scores can be calculated for each autosomal chromosome, one against every autosomal chromosome other than itself. Results: The multi-Z method showed 100% sensitivity and specificity for 187 samples sequenced to 3 M reads, whereas the conventional NIPT method showed 95.1% specificity. Similarly, for 216 samples sequenced to 1 M reads, the multi-Z method showed 100% sensitivity and 95.6% specificity, and the conventional NIPT method showed 75.1% specificity. Conclusion: The multi-Z method showed higher accuracy and more robust results than the conventional method, even at low read coverage.
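The normalization and z-score steps described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions: read counts are given as plain dictionaries, the reference mean and SD come from a small set of control samples, and the rule for combining the resulting z-scores into an aneuploidy call is left out because the abstract does not specify it. The function name and the toy counts are hypothetical.

```python
from statistics import mean, stdev


def multi_z_scores(sample_counts, control_counts, target):
    """Return {reference chromosome: z-score} for one target chromosome.

    sample_counts:  dict chromosome -> mapped read count for the test sample.
    control_counts: list of such dicts, one per control sample.
    For each reference chromosome other than the target, the target read
    count is divided by the reference read count, and the ratio is
    standardized with the mean and SD of the same ratio in the controls.
    Assumes the control ratios are not all identical (SD > 0).
    """
    scores = {}
    for ref in sample_counts:
        if ref == target:
            continue
        control_ratios = [c[target] / c[ref] for c in control_counts]
        mu = mean(control_ratios)
        sd = stdev(control_ratios)
        scores[ref] = (sample_counts[target] / sample_counts[ref] - mu) / sd
    return scores


# Toy usage with three chromosomes instead of 22 autosomes.
controls = [
    {"chr1": 100, "chr2": 50, "chr21": 10},
    {"chr1": 100, "chr2": 50, "chr21": 12},
    {"chr1": 100, "chr2": 50, "chr21": 8},
]
sample = {"chr1": 100, "chr2": 50, "chr21": 15}
z = multi_z_scores(sample, controls, "chr21")
```

With 22 autosomes in the input, the returned dictionary holds the 21 z-scores per target chromosome that the method relies on.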
Funding: The project was supported by NNSFC (19631040), NSSFC (04BTJ002), and the grant for postdoctoral fellows in SELF.
Abstract: This paper discusses two tests for varying dispersion of binomial data in the framework of nonlinear logistic models with random effects, which are widely used in analyzing longitudinal binomial data. The first is an individual test, with power calculation, for varying dispersion through testing the randomness of cluster effects; it extends Dean (1992) and Commenges et al. (1994). The second is a composite test for varying dispersion through simultaneously testing the randomness of cluster effects and the equality of random-effect means. The score test statistics are constructed and expressed in simple, easy-to-use matrix formulas. The authors illustrate their test methods using the insecticide data of Giltinan, Capizzi & Malani (1988).