Abstract
Traffic data currently suffer from information silos: basic data are not publicly available, so researchers generally obtain them through field measurement. To make data collection easier and to enlarge sample sizes, this paper presents a Web data acquisition method. Studying the correlation between traffic accidents and road alignment requires obtaining the textual information of accident sites and the spatial data of the related road alignment separately, and then integrating them. Textual descriptions of accident sites are obtained with a Deep Web data collection method. Domestic maps recognize Chinese semantics well but encrypt their coordinates, whereas foreign open-source platforms recognize Chinese semantics poorly but publish their data openly. The paper gives a method that combines the strengths of both: a coordinate mapping is established between the domestic map and the foreign open-source platform, and the map and platform interfaces are called to obtain the road data related to each accident site. Based on automata theory, an automaton model with optional states is built to extract road alignment data from the heterogeneous data sources related to the accident sites. The correctness and practicality of the method are verified on Web data for traffic accidents in Beijing. The fitted road alignment is basically consistent with the actual alignment and meets the basic requirements of alignment research.
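The idea of an automaton with optional states can be illustrated with a minimal sketch. It is not the paper's model: the state names (`name`, `grade`, `lanes`, `points`), their order, and the function `extract` are all hypothetical, chosen only to show how optional states let one automaton accept records from sources that report different subsets of fields.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical state sequence (an assumption, not taken from the paper).
# The boolean marks a state as optional: a source may omit that field.
STATES: List[Tuple[str, bool]] = [
    ("name", False),    # road name, required
    ("grade", True),    # road class, optional
    ("lanes", True),    # lane count, optional
    ("points", False),  # alignment coordinates, required
]


def extract(tokens: Iterable[Tuple[str, Any]]) -> Dict[str, Any]:
    """Run (field, value) tokens through the automaton and collect a record.

    Optional states are skipped when the next token belongs to a later
    state. Skipping a required state, or meeting an unknown or out-of-order
    field, raises ValueError.
    """
    record: Dict[str, Any] = {}
    current = 0  # index of the next state the automaton may enter
    for field, value in tokens:
        target = current
        # Advance over optional states until one matches the token.
        while target < len(STATES) and STATES[target][0] != field:
            if not STATES[target][1]:
                raise ValueError(f"required field {STATES[target][0]!r} missing before {field!r}")
            target += 1
        if target == len(STATES):
            raise ValueError(f"unexpected field {field!r}")
        record[field] = value
        current = target + 1
    # Any state not yet visited must be optional for the input to be accepted.
    for state, optional in STATES[current:]:
        if not optional:
            raise ValueError(f"required field {state!r} missing")
    return record


if __name__ == "__main__":
    # Two made-up records from sources with different field coverage.
    rich = [("name", "Road A"), ("lanes", 4), ("points", [(0.0, 0.0), (1.0, 0.5)])]
    sparse = [("name", "Road B"), ("points", [(2.0, 2.0)])]
    print(extract(rich))
    print(extract(sparse))
```

Both made-up records are accepted by the same automaton even though the second one omits the optional fields, which is the property that makes such a model convenient for heterogeneous sources.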
Source
《应用科技》
CAS
2017, No. 6, pp. 36-40 (5 pages)
Applied Science and Technology
Funding
National Natural Science Foundation of China (51278058)
Keywords
data processing
Web data collection
traffic accident
road alignment
semantic recognition
heterogeneous data
open-source platform
automaton model