Abstract
Traffic flow data is an important input variable of a traffic system, and the quality of the data gathered by traffic flow collectors directly affects the stability of traffic system operation. To address anomalies in sampled traffic flow data, this paper proposes a method for screening abnormal traffic flow data that combines random-forest-based missing-value imputation with the isolation forest algorithm. On this basis, a linear regression model is used to fill in the missing values and outliers, forming an overall framework for processing the validity of traffic flow data. The results show that the data repaired by the outlier-processing model meets, on the whole, the requirements for valid traffic flow data, and can provide data support for traffic condition prediction and traffic system operation.
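As a rough illustration of the screen-then-fill idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a single series of flow counts indexed by time step, uses a small pure-Python isolation forest on the flow value alone, and fits the linear regression on the time index; the abstract does not say which features or regressors the paper uses. The random-forest imputation stage is omitted, and all function names, the score threshold of 0.7, and the sample data are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Recursively split 1-D values at random cut points (one isolation tree)."""
    if depth >= max_depth or len(values) <= 1 or min(values) == max(values):
        return ("leaf", len(values))
    split = rng.uniform(min(values), max(values))
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return ("node", split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, adjusted for the unsplit leaf size."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, split, left, right = tree
    return path_length(left if x < split else right, x, depth + 1)


def anomaly_scores(values, n_trees=100, sample_size=64, seed=0):
    """Isolation-forest score in (0, 1); values near 1 are easy to isolate."""
    rng = random.Random(seed)
    psi = min(sample_size, len(values))  # assumes at least 2 observed values
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(values, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(psi)
    return [2 ** (-sum(path_length(t, v) for t in trees) / (n_trees * norm))
            for v in values]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def repair(series, threshold=0.7):
    """Flag outliers among observed values, then fill gaps and outliers by regression."""
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    scores = anomaly_scores([v for _, v in observed])
    valid = [(i, v) for (i, v), s in zip(observed, scores) if s <= threshold]
    slope, intercept = fit_line([i for i, _ in valid], [v for _, v in valid])
    valid_idx = {i for i, _ in valid}
    return [v if i in valid_idx else slope * i + intercept
            for i, v in enumerate(series)]


# Hypothetical flow counts: a linear trend with one gap and one spike.
series = [100 + 2 * i for i in range(20)]
series[5] = None   # missing sample
series[12] = 900   # anomalous sample
repaired = repair(series)
print(repaired[5], repaired[12])
```

In this toy series the valid points lie on a straight line, so the regression recovers the missing and anomalous samples closely. Real traffic flow is periodic and noisy, so a practical version would likely need richer regressors (for example time of day or neighbouring detectors) and a tuned threshold.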
Authors
周博
贾树林
胡江宇
马双宝
罗维平
ZHOU Bo; JIA Shu-lin; HU Jiang-yu; MA Shuang-bao; LUO Wei-ping (School of Mechanical Engineering and Automation, Wuhan Textile University, Wuhan, Hubei 430200, China; Hubei Provincial Key Laboratory of Digital Equipment, Wuhan, Hubei 430200, China)
Source
《武汉纺织大学学报》
2021, No. 2, pp. 9-14 (6 pages)
Journal of Wuhan Textile University
Keywords
traffic flow data
data cleaning
isolation forest
linear regression
outlier processing