Abstract: Crash occurrence is a complex phenomenon, and crashes associated with pedestrians and bicyclists are even more complex. Furthermore, pedestrian- and bicyclist-involved crashes are typically not reported in detail in state or national crash databases. To address this issue, developers created the Pedestrian and Bicycle Crash Analysis Tool (PBCAT). However, it is labour-intensive to manually identify pedestrian and bicycle crash types from crash-narrative reports and to classify crash attributes from the textual content of police reports. There is therefore a need for a supporting tool that helps practitioners use PBCAT more efficiently and accurately. The objective of this study is to develop a framework for applying machine-learning models to classify crash types from unstructured textual content. The research team collected pedestrian crash-typing data from two locations in Texas. The XGBoost model was found to be the best classifier, classifying pedestrian crash types with the highest accuracy rate (up to 77% for training data and 72% for test data). The findings demonstrate that advanced machine-learning models can extract underlying patterns and trends of crash mechanisms. This provides a basis for applying machine-learning techniques to the crash-typing issues associated with non-motorist crashes.
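The general idea of classifying crash types from narrative text can be illustrated with a minimal sketch. The abstract does not say how the narratives were turned into features, so TF-IDF weighting is an assumption here. The nearest-centroid classifier is a deliberately simple stand-in, not XGBoost, chosen so the example runs with the Python standard library alone. The narratives and the two labels (`crossing`, `walking_along_roadway`) are invented for illustration and are not PBCAT codes or data from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a narrative and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training narratives."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """Map one narrative to an L2-normalised sparse TF-IDF vector (term -> weight)."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def fit_centroids(vectors, labels):
    """Sum the TF-IDF vectors of each class (stand-in classifier, NOT XGBoost)."""
    centroids = {}
    for vec, label in zip(vectors, labels):
        centroid = centroids.setdefault(label, Counter())
        for term, weight in vec.items():
            centroid[term] += weight
    return centroids


def predict(vec, centroids):
    """Return the label whose centroid has the highest cosine similarity to vec."""
    def score(centroid):
        norm = math.sqrt(sum(w * w for w in centroid.values())) or 1.0
        return sum(w * centroid.get(t, 0.0) for t, w in vec.items()) / norm
    return max(sorted(centroids), key=lambda label: score(centroids[label]))


def accuracy(y_true, y_pred):
    """Fraction of narratives whose predicted type matches the labelled type."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical narratives and labels, invented for this sketch.
train_texts = [
    "Pedestrian was crossing the street at the intersection when the vehicle turned left",
    "Vehicle struck pedestrian crossing in the crosswalk at the signal",
    "Pedestrian was walking along the shoulder of the roadway in the dark",
    "Driver hit a pedestrian walking along the edge of the road with traffic",
]
train_labels = ["crossing", "crossing", "walking_along_roadway", "walking_along_roadway"]
test_texts = [
    "Pedestrian crossing at the intersection crosswalk was hit",
    "Pedestrian walking along the road shoulder was struck",
]
test_labels = ["crossing", "walking_along_roadway"]

idf = fit_idf(train_texts)
centroids = fit_centroids([tfidf(t, idf) for t in train_texts], train_labels)
predictions = [predict(tfidf(t, idf), centroids) for t in test_texts]
test_accuracy = accuracy(test_labels, predictions)
print(predictions, test_accuracy)
```

Reporting accuracy separately for training and held-out narratives, as the abstract does, follows the same pattern: call `accuracy` once on predictions for `train_texts` and once for `test_texts`. In a fuller pipeline the same feature vectors could be passed to a gradient-boosted tree classifier such as XGBoost in place of the centroid step.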