Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the users emphasize sensitivity against specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data. Thus, more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
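To make the idea of BIC-based model choice concrete, the following is a minimal sketch of choosing covariates for a logistic regression by exhaustive search over subsets, scoring each with BIC = k ln(n) - 2 ln(L). It is not the authors' simulation design: the abstract does not specify their scenarios, their BICc variant, or their Lasso settings. The function names, the simulated data, and the effect sizes below are all hypothetical.

```python
import itertools
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Maximum-likelihood fit by Newton-Raphson; X must already contain an intercept column.
    Returns (coefficients, maximized log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)                  # score vector
        hess = X.T @ (X * w[:, None])         # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return beta, loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return n_params * np.log(n_obs) - 2.0 * loglik

def best_subset_by_bic(X, y):
    """Score every subset of the columns of X (intercept always included).
    Returns (best BIC, tuple of selected column indices)."""
    n, p = X.shape
    intercept = np.ones((n, 1))
    best = None
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            design = np.hstack([intercept, X[:, list(subset)]])
            _, ll = fit_logistic(design, y)
            score = bic(ll, design.shape[1], n)
            if best is None or score < best[0]:
                best = (score, subset)
    return best

# Hypothetical data-generating process: 4 candidate covariates,
# only the first two (indices 0 and 1) have a nonzero effect.
rng = np.random.default_rng(0)
n_obs = 500
X_sim = rng.normal(size=(n_obs, 4))
true_eta = 1.0 * X_sim[:, 0] - 1.0 * X_sim[:, 1]
y_sim = (rng.random(n_obs) < 1.0 / (1.0 + np.exp(-true_eta))).astype(float)

best_score, selected = best_subset_by_bic(X_sim, y_sim)
print("selected covariates:", selected, "BIC:", round(best_score, 2))
```

Exhaustive search is only feasible for a handful of candidate covariates, since the number of subsets doubles with each one. Repeating such a run over many simulated data sets, and counting how often true covariates are selected and noise covariates are excluded, would give sensitivity- and specificity-type measures of the kind the abstract describes.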
The use of non-timber forest products is a valuable alternative for the conservation of tropical forests. Juçara (Euterpe edulis Mart.) is considered one of the main alternatives in the Atlantic Forest for the production of açaí pulp. However, few studies have evaluated its production. The present study aimed to construct a probabilistic model to predict the production of Euterpe edulis bunches, using dendrometric variables and a competition index. Twenty plots of 10 × 50 m were sampled in an area where the species occurs, recording the arboreal individuals with diameter at breast height > 4.8 cm and the Euterpe edulis phenomena. The main variables influencing bunch production were assessed using a logistic regression model. The logistic regression showed diameter at breast height (DBH) and total height (h) to be significant in explaining the variation between productive and non-productive individuals. The competition index tested was not significant (p-value = 0.221). The prediction model for bunch production in Juçara can be written as: Zi = -6.878594 + 0.2522454 × DBH + 0.1951574 × h. The use of a logistic regression model showed potential for the prediction of non-timber forest products.
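The reported equation can be turned into a predicted probability, assuming that Zi is the usual logistic linear predictor (log-odds), so that p = 1 / (1 + exp(-Zi)). The sketch below uses the coefficients from the abstract. The function names are hypothetical, DBH is taken in cm as in the sampling threshold, and the unit of h (metres here) is an assumption because the abstract does not state it.

```python
import math

# Coefficients as reported in the abstract.
INTERCEPT = -6.878594
COEF_DBH = 0.2522454
COEF_H = 0.1951574

def linear_predictor(dbh_cm, height_m):
    """Zi from the abstract; the height unit (m) is an assumption."""
    return INTERCEPT + COEF_DBH * dbh_cm + COEF_H * height_m

def bunch_probability(dbh_cm, height_m):
    """Probability that a palm is productive, assuming Zi is the log-odds."""
    z = linear_predictor(dbh_cm, height_m)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical individuals, for illustration only.
for dbh, h in [(8.0, 6.0), (12.0, 10.0), (16.0, 14.0)]:
    print(f"DBH={dbh} cm, h={h} m -> p={bunch_probability(dbh, h):.3f}")
```

Because both coefficients are positive, the predicted probability rises with either diameter or height, which matches the abstract's finding that both variables separate productive from non-productive individuals.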
Modeling human blood components and disorders is a complicated task. Few researchers have attempted to automate the process of detecting anemia in human blood. These attempts have produced satisfactory but not highly accurate results. In this paper, we present an efficient method to estimate the hemoglobin value in human blood and detect anemia using microscopic color image data. We have developed a logit regression model using one thousand (1000) blood samples that were collected from the Prince George Hospital laboratory. The output of our model was compared with the results for the same sample set obtained using the CELL-DYN 3200 System in the Prince George Hospital laboratory, and was found to be nearly identical. These results exceed those reported in the literature. Moreover, the proposed method can be implemented in hardware with minimal circuitry and nominal cost.