Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61402485 and 61303061, and by the Open Fund from HPCL under No. 201513-01.
Abstract: Machine learning (ML) techniques have been widely applied to traffic classification in recent years. However, discriminator bias and class imbalance both reduce the accuracy of ML-based traffic classifiers. In this paper, we propose an accurate and extensible traffic classifier. Specifically, to address discriminator bias, the classifier is built as an optimal cascade of binary sub-classifiers, each trained independently with the discriminators used to identify application-specific traffic. Moreover, to balance the training dataset, we apply the SMOTE algorithm to generate artificial training samples for minority classes. We evaluate the classifier on two datasets collected from different network border routers. Compared with previous multi-class traffic classifiers built in a one-time training process, our classifier achieves much higher F-measure and AUC for each application.
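The oversampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the core SMOTE idea of creating a synthetic minority sample by interpolating between an existing minority sample and one of its nearest minority neighbours. The function name `smote_like_oversample`, the feature tuples, and the parameter values are assumptions made for illustration.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical flow features: (mean packet size in bytes, flow duration in s).
minority_flows = [(60.0, 1.0), (64.0, 1.2), (70.0, 0.9), (66.0, 1.1)]
new_flows = smote_like_oversample(minority_flows, n_new=6)
```

Each synthetic point lies on the line segment between two real minority samples, so the minority class grows without simply duplicating existing flows.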
Funding: Supported by the National Basic Research Program of China (2007CB307100, 2007CB307106).
Abstract: Cost-sensitive learning has been applied to the multi-class imbalance problem in Internet traffic classification and has achieved considerable results. However, classification performance on minority classes that carry few bytes remains poor, because existing research focuses only on classes with a large number of bytes. Therefore, the class-dependent misclassification cost is studied. First, the flow rate based cost matrix (FCM) is investigated. Second, a new cost matrix named the weighted cost matrix (WCM) is proposed, which calculates a reasonable weight for each cost in FCM according to the data imbalance degree and classification accuracy of each class. WCM can further improve classification performance on the difficult minority class (the class with more flows but worse classification accuracy). Experimental results on twelve real traffic datasets show that FCM and WCM obtain more than 92% flow g-mean and 80% byte g-mean on average; on the test set collected one year later, WCM outperforms FCM in terms of stability.
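To make the two ideas in this abstract concrete, here is a small sketch of a cost-matrix decision rule and of the g-mean metric. The abstract does not give the FCM or WCM formulas, so the cost values below are hypothetical, and the g-mean shown is the generic per-sample (flow-level) geometric mean of per-class recall, not the paper's byte-weighted variant. The names `min_cost_class` and `g_mean` are illustrative.

```python
import math

def min_cost_class(probs, cost):
    """Return the class index with the lowest expected misclassification cost.
    cost[i][j] is the cost of predicting class j when the true class is i."""
    n = len(probs)
    expected = [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]
    return min(range(n), key=expected.__getitem__)

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1.0 / len(recalls))

# Hypothetical cost matrix: missing the minority class (1) costs five times
# as much as a false alarm on the majority class (0).
cost = [[0.0, 1.0],
        [5.0, 0.0]]
probs = [0.7, 0.3]                    # assumed classifier output for one flow
choice = min_cost_class(probs, cost)  # expected costs are 1.5 vs 0.7

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = g_mean(y_true, y_pred)        # recalls are 3/4 and 1/2
```

With these assumed costs the rule picks the minority class even though its probability is only 0.3, which is the effect a class-dependent cost matrix is meant to produce.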
Funding: Supported by the National Natural Science Foundation of China (No. 50808025), the Ministry of Communications of China Application Foundation (No. 2006319815080), the Key Project of Hunan Education Department (No. 08A003), and the Project of Hunan Science and Technology Department (No. 2008GK3114).
Abstract: This paper considers the problem of designing integrated traffic strategies for traffic corridors using ramp metering, speed limits, and route guidance. As an improvement on previous work, the presented approach has the following five features: 1) modeling traffic flow to analyze traffic characteristics under the influence of variable speed limits, on-ramp metering, and guidance information; 2) building a hierarchical model to realize the integrated design of traffic control and route guidance in traffic corridors; 3) devising a multi-class analytical dynamic traffic assignment (DTA) model for traffic corridors, in which not only the route choice process differs for each user class, but the traffic flow operations are also user-class specific because the travel time characteristic of each user class is considered; 4) predicting route choice probabilities adaptively from real-time traffic conditions and the route choice behaviors of different users, rather than assuming them to be predetermined; and 5) proposing a numerical solution algorithm for the hierarchical model based on a modified iterative optimization assignment (IOA) algorithm. A preliminary numerical test demonstrates the potential of the developed model and algorithm for integrated corridor control.
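Abstract: The effectiveness of green travel guidance is affected by external information and by travelers' choice preferences, so heterogeneity in travelers' latent attribute classes needs to be considered. Self-presentation refers to people influencing others' impressions of them by controlling information about themselves, and it reflects the interaction between information and choice preference. To quantitatively analyze how travelers' self-presentation awareness and environmental awareness affect travel mode choice behavior, this paper collected 1382 valid samples through a questionnaire survey and used a latent class model (LCM) to divide travelers into a high self-presentation group (18.23%), a medium self-presentation group (20.26%), and a low self-presentation group (61.51%). Results from the discrete choice model show that travelers pay more attention to travel time and to the characteristics of the travel mode itself when choosing a mode; if mode characteristics are not considered, the effect of travel cost on the high self-presentation group is overestimated. The high self-presentation group shows a strong preference for public transport only for short-distance trips; for trips of 6~10 km, it shows a clear tendency to choose private cars. For short- and medium-distance trips, the preference value that the low self-presentation group attaches to cycling can offset, to some extent, the disutility of excessively long travel time. A mode choice model that accounts for traveler heterogeneity can provide a theoretical basis for governments and relevant departments to formulate more coordinated, targeted traffic regulation policies and public transport operation strategies.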
Funding: Supported by the National High-Tech Research and Development Plan of China under Grant No. 2011AA010702.
Abstract: Detecting traffic anomalies is essential for diagnosing attacks. High-Speed Backbone Networks (HSBN) require Traffic Anomaly Detection Systems (TADS) that are accurate (high detection rates and low false positive rates) and efficient. The proposed approach uses entropy as a metric of traffic distributions over several traffic dimensions. An efficient algorithm with low computational and space complexity is used to estimate entropy. Entropy values over all dimensions are collected to form a detection vector for every sliding window. A one-class support vector machine classifies each detection vector into one of two groups: abnormal vectors and normal vectors. A Multi-Windows Correlation Algorithm (MWCA) calculates comprehensive anomaly scores observed over a sequence of windows in order to reduce false positive rates and obtain high detection rates. Real-world traffic traces are used to validate and evaluate the efficiency and accuracy of the system through three experiments. Experiment 1 shows that the entropy estimation algorithm, which uses less memory and runs faster than traditional algorithms, is more suitable for detecting anomalies. Experiment 2 shows that the classification and correlation algorithms significantly improve detection accuracy. Experiment 3 compares the proposed system with three well-known systems, and the proposed system is the most accurate. These results indicate that the proposed system significantly improves accuracy and efficiency.
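The per-window entropy feature described in this abstract can be sketched as follows. This computes exact Shannon entropy from full counts, not the paper's low-complexity estimation algorithm, and it covers a single traffic dimension only. The function names, the destination-port data, and the window parameters are assumptions for illustration.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Exact Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h if h > 0 else 0.0  # normalise -0.0 for single-valued windows

def window_entropies(values, size, step):
    """Entropy of each sliding window of the given size, advanced by step."""
    return [shannon_entropy(values[i:i + size])
            for i in range(0, len(values) - size + 1, step)]

# Hypothetical destination ports observed in consecutive packets.
ports = [80, 443, 22, 53, 80, 80, 80, 80]
scores = window_entropies(ports, size=4, step=4)
```

In the system the abstract describes, one such value per traffic dimension would be stacked into a detection vector for each window; a sharp drop or rise in entropy between windows is the kind of change a downstream classifier could flag.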