The before-after study with the empirical Bayes (EB) method is the state-of-the-art approach for estimating crash modification factors (CMFs). The EB method not only addresses the regression-to-the-mean bias, but also improves accuracy. However, the performance of the CMFs derived from the EB method has never been fully investigated. This study aims to examine the accuracy of CMFs estimated with the EB method. Artificial realistic data (ARD) and real crash data are used to evaluate the CMFs. The results indicate that: 1) The CMFs derived from the EB before-after method are nearly the same as the true values. 2) The estimated CMF standard errors do not reflect the true values. The estimation remains at the same level regardless of the pre-assumed CMF standard error. The EB before-after study is not sensitive to the variation of CMF among sites. 3) The analyses with real-world traffic and crash data with a dummy treatment indicate that the EB method tends to underestimate the standard error of the CMF. Safety researchers should recognize that the CMF variance may be biased when evaluating safety effectiveness by the EB method. It is necessary to revisit the algorithm for estimating CMF variance with the EB method.
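The abstract does not spell out the estimator, so the following is only a minimal sketch of the EB before-after calculation as it is commonly formulated (Hauer-style weighting with a negative binomial overdispersion parameter). The `Site` record, the function name `eb_cmf`, the parameter `k`, and the use of safety performance function (SPF) predictions are all assumptions made for illustration, not details taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass
class Site:
    obs_before: float   # observed crashes in the before period
    obs_after: float    # observed crashes in the after period
    spf_before: float   # SPF-predicted crashes, before period (assumed input)
    spf_after: float    # SPF-predicted crashes, after period (assumed input)


def eb_cmf(sites, k):
    """Return (CMF, standard error) for a group of treated sites.

    k is the negative binomial overdispersion parameter of the SPF
    (hypothetical name; variance = mean + k * mean**2).
    Assumes at least one crash is observed in the after period.
    """
    lam = 0.0      # observed after-period crashes, summed over sites
    pi = 0.0       # expected after-period crashes had no treatment been applied
    var_pi = 0.0   # variance of pi
    for s in sites:
        w = 1.0 / (1.0 + k * s.spf_before)            # EB weight on the SPF prediction
        eb_before = w * s.spf_before + (1.0 - w) * s.obs_before
        r = s.spf_after / s.spf_before                # adjusts for traffic and duration changes
        pi += r * eb_before
        var_pi += r * r * eb_before * (1.0 - w)
        lam += s.obs_after
    bias = 1.0 + var_pi / pi ** 2                     # correction for the ratio estimator
    cmf = (lam / pi) / bias
    # Var(lam) is taken as lam (Poisson assumption), so Var(lam) / lam**2 = 1 / lam.
    var_cmf = cmf ** 2 * (1.0 / lam + var_pi / pi ** 2) / bias ** 2
    return cmf, math.sqrt(var_cmf)
```

The second return value is the quantity the abstract reports as tending to be underestimated; comparing it with the spread of CMF estimates across repeated simulated datasets is one way such a check could be set up.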
A literature review indicates that sample size, attribute variance and within-sample choice distribution of alternatives are important considerations in the estimation of multinomial logit (MNL) models, but their impacts on the estimation accuracy have not been systematically studied. Therefore, the objective of this paper is to provide an empirical examination of these issues through a set of simulated discrete choice preference and rank-ordered preference datasets. In this paper, the utility coefficients, alternative specific constants (ASCs), and the mean and standard deviation of the four attributes for a set of seven hypothetical alternatives are specified a priori. Then, synthetic datasets with varying sample size, attribute variance and within-sample choice distribution are simulated. Based on these datasets, the utility coefficients and ASCs of the specified MNL models are re-estimated and compared with the original values specified a priori. It is found that (1) the estimation accuracy of utility parameters increases as the sample size increases; (2) the utility coefficients can be re-estimated with reasonable accuracy, but the estimates of the ASCs are subject to much larger errors; (3) as the variances of the alternative attributes increase, the estimation accuracy improves significantly; and (4) as the distribution of chosen alternatives becomes more balanced across alternatives within the sample datasets, the hit-ratio decreases.
The results indicate that (a) under a setting similar to the one presented in this paper, a large sample consisting of a few thousand observations (3000 - 4000) may be needed in order to provide reasonable estimates for utility coefficients, particularly for ASCs; (b) a larger, but realistic attribute space is preferred in the stated preference survey design; and (c) choice datasets with an unbalanced "chosen" choice frequency distribution are preferred, in order to better capture the elasticity between the "perceived utility" associated with the alternatives' attributes.
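The simulate-then-re-estimate design described above can be sketched in a much reduced form. The example below assumes a single generic attribute, one utility coefficient and no ASCs, which is far simpler than the paper's four attributes and seven alternatives; the function names and all parameter values are hypothetical.

```python
import math
import random


def mnl_probs(beta, xs):
    """MNL choice probabilities for one observation (one attribute per alternative)."""
    v = [beta * x for x in xs]
    m = max(v)                                  # subtract the max for numerical stability
    e = [math.exp(u - m) for u in v]
    total = sum(e)
    return [p / total for p in e]


def simulate(beta, n_obs, n_alt, attr_sd, rng):
    """Draw attributes, then sample one chosen alternative per observation."""
    data = []
    for _ in range(n_obs):
        xs = [rng.gauss(0.0, attr_sd) for _ in range(n_alt)]
        choice = rng.choices(range(n_alt), weights=mnl_probs(beta, xs))[0]
        data.append((xs, choice))
    return data


def estimate_beta(data, iters=50):
    """Maximum likelihood estimate of beta via Newton-Raphson (log-likelihood is concave)."""
    beta = 0.0
    for _ in range(iters):
        grad = 0.0
        hess = 0.0
        for xs, choice in data:
            p = mnl_probs(beta, xs)
            mean_x = sum(pj * xj for pj, xj in zip(p, xs))
            grad += xs[choice] - mean_x
            hess -= sum(pj * (xj - mean_x) ** 2 for pj, xj in zip(p, xs))
        if hess == 0.0:                         # no attribute variation, beta not identified
            break
        step = grad / hess
        beta -= step
        if abs(step) < 1e-10:
            break
    return beta


def recovery_error(true_beta, n_obs, attr_sd, seed=0, n_alt=3):
    """Absolute gap between the specified and the re-estimated coefficient."""
    rng = random.Random(seed)
    data = simulate(true_beta, n_obs, n_alt, attr_sd, rng)
    return abs(estimate_beta(data) - true_beta)
```

Calling `recovery_error` over a grid of `n_obs` and `attr_sd` values, and averaging over several seeds, would give a rough analogue of the paper's comparison of estimation accuracy across sample sizes and attribute variances.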
Funding: Project (51978082) supported by the National Natural Science Foundation of China; Project (19B022) supported by the Outstanding Youth Foundation of Hunan Education Department, China; Project (2019QJCZ056) supported by the Young Teacher Development Foundation of Changsha University of Science & Technology, China.