In this paper, sixty-eight research articles published between 2000 and 2017, as well as textbooks, which employed four classification algorithms, K-Nearest-Neighbor (KNN), Support Vector Machines (SVM), Random Forest (RF) and Neural Network (NN), as the main statistical tools were reviewed. The aim was to examine and compare these nonparametric classification methods on the following attributes: robustness to training data, sensitivity to changes, data fitting, stability, ability to handle large data sizes, sensitivity to noise, time invested in parameter tuning, and accuracy. The performance, strengths and shortcomings of each algorithm were examined, and a conclusion was reached on which one has the higher performance. It was evident from the literature reviewed that RF is too sensitive to small changes in the training dataset, is occasionally unstable and tends to overfit. KNN is easy to implement and understand but becomes significantly slower as the size of the data grows, and the ideal value of K for the KNN classifier is difficult to set. SVM and RF are insensitive to noise or overtraining, which shows their ability to deal with unbalanced data. Larger input datasets lengthen classification times for NN and KNN more than for SVM and RF. Among these nonparametric classification methods, NN has the potential to become a more widely used classification algorithm, but because of its time-consuming parameter tuning, its high computational complexity, the numerous NN architectures to choose from and the large number of training algorithms, most researchers recommend SVM and RF as easier, widely used methods which repeatedly achieve high accuracy and are often faster to implement.
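The two KNN drawbacks noted above (prediction cost that grows with the data, and the need to choose K) can be seen in a minimal brute-force sketch. This is an illustration only, not code from any of the reviewed articles; the function name `knn_predict` and the toy points are assumptions introduced here.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Brute-force search: one distance is computed per training point, so the
    # cost of a single prediction grows linearly with the training-set size.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # k is a free parameter; it has to be chosen by the analyst.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy data, invented for illustration only.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(train_points, train_labels, (1.1, 1.0), k=3)
prediction_near_b = knn_predict(train_points, train_labels, (3.9, 4.1), k=3)
```

In practice K is usually picked by cross-validation, and tree-based indexes can reduce the per-query cost; neither is shown here.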
This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty-seven research articles published between 2000 and 2018 which employed logistic regression as the main statistical tool, as well as six textbooks on logistic regression, were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumptions, selection of dependent and independent variables, model fitting, reporting and interpretation were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was small enough to call into question the accuracy of the regression model. Also, most studies did not report validation analysis, regression diagnostics or goodness-of-fit measures, which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.
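The quantities named in this abstract (odds, odds ratio, logit transformation, logistic curve, events per variable) can be written out as a short sketch. The function names and the example numbers are assumptions made for illustration; they are not taken from the reviewed studies.

```python
import math

def odds(p):
    """Odds corresponding to a probability p, with 0 < p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Logit transformation: the natural log of the odds."""
    return math.log(odds(p))

def logistic(z):
    """Logistic curve: maps a linear predictor z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(exposed_events, exposed_nonevents,
               unexposed_events, unexposed_nonevents):
    """Odds ratio from the four cells of a 2x2 table."""
    return (exposed_events / exposed_nonevents) / (
        unexposed_events / unexposed_nonevents)

def events_per_variable(n_events, n_predictors):
    """Ratio of outcome events to predictor variables."""
    return n_events / n_predictors

# Hypothetical numbers, for illustration only.
example_odds = odds(0.8)                      # probability 0.8 -> odds of about 4
round_trip = logistic(logit(0.3))             # logistic undoes logit
example_or = odds_ratio(20, 80, 10, 90)       # (20/80) / (10/90)
example_epv = events_per_variable(40, 8)      # 40 events, 8 predictors
```

A low events-per-variable value is the kind of deficiency the abstract describes: the model has too few outcome events to support the number of predictors it estimates.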
This study examined the non-medical factors that influence expectant mothers to opt for caesarean deliveries in Ghana. Data on 395 expectant mothers across the ten regions of Ghana, located in urban, semi-rural and rural areas and spanning a period of five years (2012 to 2016), were obtained from the Ghana Health Service. In fitting the logistic regression model, data on 355 expectant mothers (89.9% of the data) were assigned to the analysis sample while data on 40 (10.1%) were assigned to the hold-out sample. The hold-out sample, together with statistical measures of overall model fit, pseudo-R² measures and classification accuracy, was used to validate the results obtained from the analysis sample. Significance was tested at the 0.05 level. Determinants including educational level, parity, baby’s birth weight, previous caesarean delivery, location and age of the expectant mother, and the period within the year of childbirth had a significant effect on caesarean delivery. The study recommended that health practitioners should be able to identify in advance expectant mothers who are likely to undergo caesarean delivery, so that these mothers can prepare financially and psychologically and avoid further complications. Because women showed a significantly positive attitude towards caesarean delivery rather than normal delivery, it is necessary to inform them about the advantages of normal delivery and the health hazards that caesarean delivery poses to mother and child.
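The analysis/hold-out partition and the classification-accuracy check described above can be sketched as follows. The abstract does not say how records were assigned to the two samples, so the seeded random shuffle here is an assumption, as are the helper names `split_holdout` and `classification_accuracy`.

```python
import random

def split_holdout(n_records, n_holdout, seed=0):
    """Return (analysis_indices, holdout_indices) for n_records records.

    Assumption: assignment is random; the source does not state the method.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return indices[n_holdout:], indices[:n_holdout]

def classification_accuracy(actual, predicted):
    """Share of cases whose predicted class equals the observed class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Sample sizes reported in the abstract: 395 records, 40 held out.
analysis_idx, holdout_idx = split_holdout(395, 40)
analysis_share = round(100 * len(analysis_idx) / 395, 1)
holdout_share = round(100 * len(holdout_idx) / 395, 1)
```

A model fitted on the analysis indices would then be scored with `classification_accuracy` on the hold-out indices only, so that the accuracy estimate does not reuse the data the model was fitted on.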
Hepatitis B virus (HBV) infection remains a global health problem. With about 380 million chronic carriers of HBV, there are over two million deaths globally each year. Ghana is among the highly endemic countries in Africa, with HBV prevalence ranging from 4.8% to 12.3% in the general population, 10.8% to 12.7% in blood donors and about 10.6% in antenatal clinic (ANC) attendees. The main objectives of this study were to test how socioeconomic factors, risky behaviors, and knowledge and awareness of HBV infection correlate with actual HBV status among ANC attendees, and to determine the predictors of HBV testing among ANC attendees. The study used random sampling to select 500 pregnant women at the mothers’ clinic of the Volta Regional Hospital, Ho, Ghana. A structured questionnaire was used to collect information on socio-demographic characteristics, hepatitis B status, possible risk factors, and awareness and knowledge levels of HBV infection. Cross tabulation and the chi-square (χ²) statistic were used to determine statistical independence or association of study variables. The Kruskal-Wallis test was applied to test for differences in HBV knowledge scores across HBV status and levels of HBV awareness, and the binomial regression model was used to determine the predictors of HBV testing among ANC attendees. Age, religion, ethnicity, educational level, blood transfusion, number of blood transfusions, gravidity, parity, awareness of HBV and monthly income were associated with HBV status. Results of the binomial logistic regression model indicate that age (p = 0.03), education level (p = 0.04), religion (p = 0.04), ethnicity (p = 0.00) and blood transfusion (p = 0.04) were significant (p < 0.05) predictors of HBV testing. Knowledge of HBV status enables patients to seek early treatment and facilitates referral for social support and counseling. We recommend that the Ministry of Health carry out effective education on HBV and its prevention for women of child-bearing age.
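The cross-tabulation step can be illustrated with a minimal chi-square test of independence. The counts below are hypothetical and are not the study's data; the function names are assumptions, no continuity correction is applied, and the p-value helper covers only the one-degree-of-freedom case of a 2×2 table.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

def p_value_one_dof(statistic):
    """Upper-tail p-value for 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(statistic / 2.0))

# Hypothetical 2x2 cross tabulation (rows: two groups, columns: tested / not tested).
hypothetical_table = [[30, 70], [20, 80]]
statistic, dof = chi_square_statistic(hypothetical_table)
p_value = p_value_one_dof(statistic)
```

With these invented counts the statistic is 8/3 on one degree of freedom, which would not be significant at the 0.05 level; larger tables need the general chi-square distribution rather than this one-degree-of-freedom shortcut.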