This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty-seven research articles published between 2000 and 2018 that employed logistic regression as the main statistical tool, as well as six textbooks on logistic regression, were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumptions, selection of dependent and independent variables, model fitting, reporting and interpretation were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was sufficiently small to call into question the accuracy of the regression model. Also, most studies did not report validation analysis, regression diagnostics or goodness-of-fit measures, which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.
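The events-per-variable (EPV) ratio mentioned in this abstract is simple arithmetic, and a minimal sketch may make it concrete. The function name, the example counts, and the convention of counting the rarer outcome as the "event" are assumptions for illustration, not taken from the reviewed study. The threshold of 10 is a commonly cited rule of thumb and is likewise not stated in the abstract.

```python
def events_per_variable(n_events: int, n_nonevents: int, n_predictors: int) -> float:
    """Return the events-per-variable ratio for a binary-outcome model.

    Assumption: the limiting sample size is the count of the rarer outcome.
    """
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return min(n_events, n_nonevents) / n_predictors


# Hypothetical study: 60 events, 240 non-events, 12 candidate predictors.
epv = events_per_variable(60, 240, 12)
RULE_OF_THUMB = 10  # commonly cited guideline; an assumption here, not from the source
print(f"EPV = {epv:.1f}; below rule of thumb: {epv < RULE_OF_THUMB}")
```

With these hypothetical counts the model has 5 events per candidate predictor, which is the kind of low ratio the abstract flags as a threat to the accuracy of the fitted model.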
Multinomial logistic regression (MNL) is an attractive statistical approach for modeling vehicle crash severity because it does not require the assumptions of normality, linearity, or homoscedasticity, unlike other approaches such as discriminant analysis, which requires these assumptions to be met. Moreover, it produces sound estimates by transforming the dependent variable into a continuous one: probabilities ranging between 0.0 and 1.0 are converted to log odds ranging from negative infinity to positive infinity. The estimates are asymptotically consistent with the requirements of the nonlinear regression process. The results of MNL can be interpreted through the regression coefficient estimates and/or the odds ratios (the exponentiated coefficients). In addition, MNL can be used to improve the fitted model by comparing the full model, which includes all predictors, to a restricted model that excludes the non-significant predictors. As such, this paper presents a detailed step-by-step overview of incorporating MNL in crash severity modeling, using vehicle crash data for Interstate I70 in the State of Missouri, USA, for the years 2013-2015.
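The transformation from probabilities to log odds described in this abstract, and the reading of exponentiated coefficients as odds ratios, can be sketched in a few lines. This is an illustrative sketch only: the function names and the example coefficient are hypothetical and are not drawn from the crash-severity paper.

```python
import math


def logit(p: float) -> float:
    """Map a probability in (0, 1) to log odds in (-inf, +inf)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map log odds back to a probability (the logistic curve)."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio_from_coefficient(beta: float) -> float:
    """Exponentiate a regression coefficient to obtain an odds ratio."""
    return math.exp(beta)


# Probabilities near 0 or 1 map to large negative or positive log odds.
for p in (0.01, 0.5, 0.99):
    print(f"p = {p:.2f} -> log odds = {logit(p):+.3f}")

# Hypothetical coefficient: beta = 0.7 multiplies the odds by exp(0.7).
print(f"odds ratio = {odds_ratio_from_coefficient(0.7):.3f}")
```

A coefficient of zero corresponds to an odds ratio of 1, meaning the predictor leaves the odds unchanged.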
Background: Binary as well as polytomous logistic models are widely used for estimating odds ratios when the exposure of prime interest assumes unordered multiple levels under a matched-pairs case-control design. In our previous studies, we have shown the use of a polytomous logistic model for estimating cumulative odds ratios when the outcome (response) variable is ordinal (in addition to being polytomous) under a matched-pairs case-control design. The cumulative odds ratios were estimated by fitting the model separately at each cutpoint level, comparing exposure above that level with exposure less than or equal to that level. In this paper we propose an alternative method of estimating the cumulative odds ratios and reanalyze the Los Angeles Endometrial Cancer data in the context of dose levels of conjugated oestrogen exposure and development of endometrial cancer under the matched-pairs case-control design. Methods: In the present study, the cumulative logit model is fitted using a single multinomial logit model for the data. For this, the full maximum likelihood estimation procedure is adopted. A test for equality of the cumulative odds ratios across the exposure levels is proposed. Results: The analysis revealed strong evidence of risk of developing endometrial cancer due to oestrogen exposure above each of the three dose levels as compared to less than or equal to that level. The estimated values at the three cutpoint levels were found to be 6.17, 3.60 and 5.16 respectively. Conclusions: The odds of developing endometrial cancer are very high for users of any amount of oestrogen, even the lowest dose, as compared to non-users.
Caesarean section is a major surgical procedure undertaken in obstetrics, and its rate is increasing in Sri Lanka as well as in the rest of the world. In Sri Lanka, health statistics show an increase in caesarean rates from 13.3% in 1998 to 30.6% by 2007. Due to its potentially serious risks and the burden on the health system, many authorities have for many years recommended reducing the caesarean rate. This motivated us to study the most influential variables on the type of birth in Sri Lanka. In this study, based on the Anuradhapura Teaching Hospital records, the birth information of all 805 newborn babies during the month of May 2015 was considered. The variable “Type of Birth” (Normal/Caesarean) was treated as a binary response variable, and the age, height and weight of the mother and the sex, weight, length, shoulder length and head circumference of the baby were treated as explanatory variables. Logistic regression was used to model the data and, using stepwise regression, the mother’s age, height and weight were identified as the most influential variables on the type of birth. Further, it was observed that the odds of having a normal birth are 80% higher for women aged 30 years or younger than for women older than 30 years. Moreover, the Hosmer-Lemeshow goodness-of-fit test was used to check the adequacy of the fitted model. Results from this study revealed that, in future, the type of birth may be predicted by considering these identified influential variables.
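As a worked illustration of the "80% higher odds" statement in this abstract: it corresponds to an odds ratio of 1.8, which is not the same as an 80% higher probability. The baseline odds of 1 used in the last line are a hypothetical value chosen for illustration and do not come from the study.

```latex
\begin{align*}
\mathrm{OR} &= \frac{\text{odds}(\text{normal birth} \mid \text{age} \le 30)}
                   {\text{odds}(\text{normal birth} \mid \text{age} > 30)} = 1.8, \\
\beta &= \ln(\mathrm{OR}) = \ln(1.8) \approx 0.588, \\
\text{if } \text{odds}_{>30} &= 1 \ (p = 0.5): \quad
\text{odds}_{\le 30} = 1.8, \quad
p_{\le 30} = \frac{1.8}{1 + 1.8} \approx 0.643 .
\end{align*}
```

Under this hypothetical baseline, the probability rises from 0.50 to about 0.64, a relative increase of roughly 29% rather than 80%.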
The adjacent-categories, continuation-ratio and proportional odds logit-link regression models provide useful extensions of the multinomial logistic model to ordinal response data. We propose fitting these models with a logarithmic link to allow estimation of different forms of the risk ratio. Each of the resulting ordinal response log-link models is a constrained version of the log multinomial model, the log-link counterpart of the multinomial logistic model. These models can be estimated using software that allows the user to specify the log likelihood as the objective function to be maximized and to impose constraints on the parameter estimates. In example data with a dichotomous covariate, the unconstrained models produced valid coefficient estimates and standard errors, and the constrained models produced plausible results. Models with a single continuous covariate performed well in data simulations, with low bias and mean squared error on average and appropriate confidence interval coverage in admissible solutions. In an application to real data, practical aspects of the fitting of the models are investigated. We conclude that it is feasible to obtain adjusted estimates of the risk ratio for ordinal outcome data.
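The distinction this abstract draws between logit-link models (odds ratios) and log-link models (risk ratios) can be seen in the simplest binary case with one dichotomous covariate, where both quantities follow directly from a 2x2 table. The sketch below covers only that special case, not the ordinal models proposed in the paper, and the counts are hypothetical.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk ratio from a 2x2 table.

    a, b: events and non-events among the exposed
    c, d: events and non-events among the unexposed
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio from the same 2x2 table (cross-product ratio)."""
    return (a * d) / (b * c)


# Hypothetical counts: 40/100 events among exposed, 20/100 among unexposed.
a, b, c, d = 40, 60, 20, 80
print(f"risk ratio = {risk_ratio(a, b, c, d):.3f}")  # what exp(coef) estimates under a log link
print(f"odds ratio = {odds_ratio(a, b, c, d):.3f}")  # what exp(coef) estimates under a logit link
```

When the outcome is common, as in these hypothetical counts, the odds ratio is noticeably further from 1 than the risk ratio, which is one motivation for estimating risk ratios directly.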
This paper focuses on the quantitative expression of bacterial regrowth in water distribution systems. Considering the public health risks of bacterial regrowth, the experiment was performed on the distribution system of a selected area. Physical, chemical, and microbiological parameters such as turbidity, temperature, residual chlorine and pH were measured over a three-month period, and correlation analysis was carried out. Combined with principal components analysis (PCA), a logistic regression model was developed to predict and diagnose bacterial regrowth and to locate the zones with high microbiological risk in the distribution system. The model gives the probability of bacterial regrowth with the number of heterotrophic plate counts as the binary response variable and three new principal component variables as the explanatory variables. The accuracy of the logistic regression model was 90%, which meets the precision requirement of the model.
Funding: Supported by the National Natural Science Foundation of China (No. 50878140) and the Project of Water Pollution Control and Repair (No. 2008ZX07317-005).