Abstract: In this paper, we present a framework to validate and correct the location information in crash reports. The proposed system is composed of several modules: creation of the intersection network and pre-processing, validation, snapping, updating, and manual spotting. The system uses GPS coordinates (latitude/longitude), primary and reference road names, the primary direction, and the distance to the reference road. The algorithm starts at the initial GPS coordinate provided in the crash report. If this coordinate does not satisfy certain criteria, the additional location information is used. After the road names are verified, the correct snapping location is determined from the primary direction and primary distance information. At the final stage, the attributes in the crash data are retrieved from the base map at the snapping point. We tested the proposed system on three different test data sets, using the number of correctly identified matches and the average snapping error for assessment. The experiments show that the proposed framework geocodes crashes with up to 98% accuracy and an average snapping error of 37 feet, and that 64% of the matched records have a snapping error of less than 1 foot. However, we observed that the snapping error drops significantly when a few highly specific cases are treated as outliers. We discuss these cases in detail and propose a remedy as future work.
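To make the direction-and-distance step and the snapping-error metric concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the restriction to four cardinal directions, the straight-line (flat-earth) offset that ignores road geometry, and the sample coordinates are all assumptions made for illustration. The framework itself snaps to a base map, which this sketch does not model.

```python
import math

EARTH_RADIUS_FT = 20_902_231  # mean Earth radius (about 6,371 km) expressed in feet

# (north, east) unit components for each cardinal primary direction (assumed encoding).
DIRECTIONS = {"N": (1, 0), "S": (-1, 0), "E": (0, 1), "W": (0, -1)}


def offset_from_reference(lat, lon, direction, distance_ft):
    """Estimate a point distance_ft from (lat, lon) along a cardinal direction.

    Uses a local flat-earth approximation, which is adequate for the short
    distances typical of crash reports. Road geometry is ignored.
    """
    north, east = DIRECTIONS[direction]
    dlat = math.degrees(north * distance_ft / EARTH_RADIUS_FT)
    dlon = math.degrees(
        east * distance_ft / (EARTH_RADIUS_FT * math.cos(math.radians(lat)))
    )
    return lat + dlat, lon + dlon


def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))


# Hypothetical record: crash reported 250 ft north of the reference intersection.
intersection = (30.4500, -91.1500)
estimate = offset_from_reference(*intersection, "N", 250.0)
reported_gps = (30.4507, -91.1500)
print(f"snapping error: {haversine_ft(*reported_gps, *estimate):.1f} ft")
```

In a fuller pipeline, the distance between the reported GPS coordinate and the estimated point could serve as one of the validation criteria, with the threshold chosen by the analyst.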