Abstract
Mining the travel characteristics of taxi passengers from massive taxi GPS trajectory data can give urban traffic managers and taxi industry regulators a basis for decisions on urban transportation planning and management, traffic flow balancing, and vehicle dispatching. This paper mines taxi GPS trajectory data on the Spark big data processing platform, with YARN as the resource management and scheduling system and HDFS as the distributed storage system. It presents Spark-based methods for mining taxi passenger travel characteristics, covering the distribution of passenger trip distances, the temporal distribution of taxi usage, and taxi travel demand. Experimental results show that the Spark-based approach analyzes taxi passenger travel characteristics quickly and accurately.
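The three analyses named in the abstract (trip distance distribution, usage time distribution, and travel demand) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record layout `(taxi_id, epoch_seconds, lat, lon, occupied)`, the function names, the 1 km bin width, and the use of UTC hours are all assumptions made for illustration. The sketch is written in plain Python so that it runs without a cluster; on Spark, the per-taxi grouping and the counting steps would correspond roughly to RDD operations such as `groupByKey` and `reduceByKey`.

```python
import math
from collections import Counter, defaultdict
from itertools import groupby

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def extract_trips(records):
    """Split GPS records into passenger trips.

    records: iterable of (taxi_id, epoch_seconds, lat, lon, occupied),
    a hypothetical schema. A trip is a maximal run of consecutive
    occupied records for one taxi.
    """
    by_taxi = defaultdict(list)
    for rec in records:
        by_taxi[rec[0]].append(rec)

    trips = []
    for taxi_id, recs in by_taxi.items():
        recs.sort(key=lambda r: r[1])  # order each taxi's points by time
        for occupied, run in groupby(recs, key=lambda r: bool(r[4])):
            run = list(run)
            if not occupied or len(run) < 2:
                continue  # skip vacant runs and single-point runs
            distance = sum(
                haversine_km(a[2], a[3], b[2], b[3]) for a, b in zip(run, run[1:])
            )
            trips.append(
                {"taxi_id": taxi_id, "start": run[0][1], "end": run[-1][1],
                 "distance_km": distance}
            )
    return trips


def distance_histogram(trips, bin_km=1.0):
    """Count trips per distance bin; bin k covers [k*bin_km, (k+1)*bin_km)."""
    return Counter(int(t["distance_km"] // bin_km) for t in trips)


def pickups_per_hour(trips):
    """Count trip starts per hour of day (UTC), a simple proxy for demand."""
    return Counter((t["start"] // 3600) % 24 for t in trips)
```

Feeding the output of `extract_trips` into `distance_histogram` and `pickups_per_hour` gives the distance and hourly distributions; local time zones, GPS noise filtering, and map matching are deliberately left out of this sketch.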
Authors
DUAN Zong-Tao (段宗涛)
CHEN Zhi-Ming (陈志明)
CHEN Zhe (陈柘)
KANG Jun (康军)
School of Information Engineering, Chang'an University, Xi'an 710064, China
Shaanxi Road and Traffic Detection and Technical Research Center, Xi'an 710064, China
Source
Computer Systems & Applications (《计算机系统应用》)
2017, Issue 3, pp. 37-43 (7 pages)
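Funding
National Natural Science Foundation of China (61303041)
Basic Research Project of the Ministry of Transport (2014319812150)
Shaanxi Province Industrial Science and Technology Research Project (2014K05-28, 2015GY002, 2016GY-078)
Central Universities Innovation Team Project (310824153405)
Central Universities Basic Research Project (310824161012)
Ministry of Human Resources and Social Security Merit-Based Funding for Science and Technology Activities of Overseas-Educated Scholars (2015192)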