Funding: Supported by the Fundamental Research Funds for University of Science and Technology Beijing (FRF-BR-12-021).
Abstract: A semi-supervised vector machine is a relatively new learning method that uses both labeled and unlabeled data in classification. Because the objective function of the unconstrained semi-supervised vector machine model is not smooth, many fast optimization algorithms cannot be applied to solve it. To overcome the difficulty of dealing with non-smooth objective functions, new methods that can solve the semi-supervised vector machine with the desired classification accuracy are in great demand. A quintic spline function that is three times differentiable at the origin is constructed by a general three-moment method and can be used to approximate the symmetric hinge loss function. The approximation accuracy of the quintic spline function is estimated. Moreover, a quintic spline smooth semi-supervised vector machine is obtained, and the convergence accuracy of the smooth model to the non-smooth one is analyzed. Three experiments are performed to test the efficiency of the model. The experimental results show that the new model outperforms other smooth models in terms of classification performance. Furthermore, the new model is not sensitive to an increasing number of labeled samples, which means that it is more efficient.
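As a rough illustration of why smoothing is needed, the sketch below evaluates the symmetric hinge loss max(0, 1 - |t|), which has kinks at t = 0 and |t| = 1, next to a Gaussian-shaped smooth surrogate exp(-s t^2). The surrogate is a stand-in chosen for brevity and is not the quintic spline constructed in the paper; the function names and the default s = 3 are assumptions made for this example.

```python
import math


def symmetric_hinge(t):
    """Symmetric hinge loss max(0, 1 - |t|); not differentiable at t = 0 and |t| = 1."""
    return max(0.0, 1.0 - abs(t))


def gaussian_surrogate(t, s=3.0):
    """A smooth, everywhere-differentiable stand-in (assumption, not the paper's spline)."""
    return math.exp(-s * t * t)


if __name__ == "__main__":
    # Compare both losses on a small grid of decision values t.
    for t in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5):
        print(f"t={t:+.1f}  hinge={symmetric_hinge(t):.3f}  smooth={gaussian_surrogate(t):.3f}")
```

Both functions peak at t = 0 and decay as |t| grows, so unlabeled points are pushed away from the decision boundary in either case; only the surrogate admits gradient-based solvers directly.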
Funding: Supported financially by the Ministerio de Ciencia e Innovación (Spain) and the European Regional Development Fund under the Research Grant WindSound Project (Ref.: PID2021-125278OB-I00).
Abstract: Maintenance operations have a critical influence on power generation by wind turbines (WT). Advanced algorithms must analyze large volumes of data from condition monitoring systems (CMS) to determine the actual working conditions and avoid false alarms. This paper proposes different support vector machine (SVM) algorithms for the prediction and detection of false alarms. K-fold cross-validation (CV) is applied to evaluate the classification reliability of these algorithms. Supervisory Control and Data Acquisition (SCADA) data from an operating WT are used to test the proposed approach. The quadratic SVM achieved an accuracy of 98.6%. Misclassifications from the confusion matrix, the alarm log, and maintenance records are analyzed to obtain quantitative information and to determine whether an alarm is false. The classifier reduces the number of false alarms, referred to as misclassifications, by 25%. These results demonstrate that the proposed approach is highly reliable and accurate in false alarm identification.
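The K-fold cross-validation step mentioned in the abstract can be sketched as follows. This is a minimal, library-free illustration of how the sample indices are partitioned; the function name, the shuffling seed, and the fold layout are assumptions and do not reflect the authors' tooling.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so folds are not ordered by time
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold_id, (train, test) in enumerate(k_fold_indices(10, 5)):
        print(fold_id, sorted(train), sorted(test))
```

Each sample is used for testing exactly once, so the averaged score reflects every record rather than a single holdout split.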
Funding: Supported by University of Macao Research Grant, China (Grant No. RG057/08-09S/VCM/FST, Grant No. UL011/09-Y1/EME/WPK01/FST).
Abstract: Engine spark ignition is an important source of information for diagnosing engine faults. Based on the waveform of the ignition pattern, a mechanic can guess which engine parts may be malfunctioning, drawing on experience and handbooks. However, this manual diagnostic method is imprecise because many spark ignition patterns are very similar. A diagnosis therefore needs many trials to identify the malfunctioning parts, and the mechanic must disassemble and reassemble engine parts for verification. To tackle this problem, an intelligent diagnosis system was established based on ignition patterns. First, the captured patterns were normalized and compressed. Then the wavelet packet transform (WPT) was employed to extract representative features of the ignition patterns. Finally, a classification system was constructed using multi-class support vector machines (SVM) and the extracted features. The classification system can identify the most likely engine fault and thereby reduce the number of diagnosis trials. Experimental results show that SVM produces higher diagnosis accuracy than the traditional multilayer feedforward neural network. This is the first attempt to combine WPT and SVM to analyze ignition patterns and diagnose automotive engines.
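The feature-extraction step can be illustrated with a small wavelet packet decomposition. The sketch below uses the Haar wavelet and sub-band energies as features; both choices, along with the function names, are assumptions made for illustration, since the abstract does not state which wavelet or which features the authors used.

```python
import math


def haar_step(signal):
    """One Haar analysis step: return the (approximation, detail) halves."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Split every node at every level (a full packet tree); return leaf energies."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])  # unlike the plain DWT, details are split too
        nodes = next_nodes
    return [sum(v * v for v in node) for node in nodes]


if __name__ == "__main__":
    pattern = [0.0, 0.2, 0.9, 1.0, 0.4, 0.1, 0.0, 0.0]  # toy ignition-like waveform
    print(wavelet_packet_energies(pattern, levels=2))
```

The resulting energy vector has 2**levels entries and could serve as the input to a multi-class classifier.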
Abstract: This paper concerns the error analysis of Multicategory Support Vector Machine (MSVM) classifiers based on reproducing kernel Hilbert spaces. We choose the polynomial kernel as the Mercer kernel and give the error estimate using de la Vallée Poussin means. We also introduce the standard estimation of the sample error and derive an explicit learning rate.
Funding: National Key Research and Development Program of China (No. 2016YFF0103604); National Natural Science Foundations of China (Nos. 61171165, 11431015, 61571230); National Scientific Equipment Developing Project of China (No. 2012YQ050250); Natural Science Foundation of Jiangsu Province, China (No. BK20161500).
Abstract: A comprehensive assessment of spatial-aware supervised learning algorithms for hyper-spectral image (HSI) classification is presented. For this purpose, standard support vector machines (SVMs), multinomial logistic regression (MLR), and sparse representation (SR) based supervised learning algorithms were compared both theoretically and experimentally. The performance of the discussed techniques was evaluated in terms of overall accuracy, average accuracy, kappa statistic coefficient, and sparsity of the solutions. Execution time, computational burden, and the capability of the methods were investigated using probabilistic analysis. The classical AVIRIS Indian Pines benchmark data set was used to validate the accuracy. Experiments show that integrating spectral-spatial context can further improve accuracy and reduce the misclassification error, although at the cost of increased computation time.
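The three accuracy measures named in the abstract can all be computed from a confusion matrix. The sketch below uses the standard definitions of overall accuracy, average (per-class) accuracy, and Cohen's kappa; the function name and the toy matrix are assumptions for illustration, not data from the paper.

```python
def classification_scores(confusion):
    """Return (overall accuracy, average accuracy, Cohen's kappa).

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    diagonal = sum(confusion[i][i] for i in range(n_classes))
    overall = diagonal / total
    # Average accuracy: mean of per-class recalls, so small classes count equally.
    average = sum(confusion[i][i] / sum(confusion[i]) for i in range(n_classes)) / n_classes
    # Expected agreement by chance, from row (true) and column (predicted) totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (overall - expected) / (1.0 - expected)
    return overall, average, kappa


if __name__ == "__main__":
    toy = [[50, 10], [5, 35]]  # hypothetical two-class confusion matrix
    print(classification_scores(toy))
```

Kappa discounts the agreement that would occur by chance, which is why it is reported alongside the raw accuracies for imbalanced land-cover classes.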
Funding: Supported by the National Science and Technology Major Project of the Ministry of Science and Technology of China (2014 ZX03001027).
Abstract: Terminals can use various heterogeneous networks to deliver a better quality of service, and signal system recognition and classification contribute substantially to this process. However, in low signal-to-noise ratio (SNR) conditions or under time-varying multipath channels, most existing signal recognition algorithms face limitations. In this series, we present a robust signal recognition method based on the original and the latest updated versions of the extreme learning machine (ELM) to help users switch between networks. The ELM uses signal characteristics to distinguish systems. The advantage of this algorithm lies in the random choice of hidden nodes and in the fact that it determines the output weights analytically, which results in lower complexity. Theoretically, the algorithm tends to offer good generalization performance at an extremely fast learning speed. Moreover, we implement the GSM/WCDMA/LTE models in the Matlab environment using the Simulink tools. The simulations show that the signals can be recognized with 95% accuracy in a low-SNR (0 dB) environment over a time-varying multipath Rayleigh fading channel.
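The two ELM properties highlighted in the abstract, randomly chosen hidden nodes and analytically determined output weights, can be sketched in a few lines. This is a minimal single-output illustration under stated assumptions: a tanh activation, a small ridge term for numerical stability, and a hand-written linear solver so that the block needs only the standard library. None of these choices, nor the function names, come from the paper.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def train_elm(xs, ys, n_hidden=20, ridge=1e-4, seed=0):
    """Fit a single-output ELM and return a predict(x) function."""
    rng = random.Random(seed)
    dim = len(xs[0])
    # Hidden-layer weights and biases are drawn at random and never trained.
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]

    def hidden(x):
        return [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
                for ws, b in zip(weights, biases)]

    h = [hidden(x) for x in xs]
    # Output weights in closed form: (H^T H + ridge * I) beta = H^T y.
    gram = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
             for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
    beta = solve_linear(gram, rhs)

    def predict(x):
        return sum(b * v for b, v in zip(beta, hidden(x)))

    return predict


if __name__ == "__main__":
    xs = [[i / 10.0] for i in range(11)]
    ys = [2.0 * x[0] - 1.0 for x in xs]
    model = train_elm(xs, ys)
    print([round(model(x), 3) for x in xs])
```

Because training reduces to one linear solve, there is no iterative weight update, which is the source of the speed advantage the abstract describes.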
Abstract: Machine learning and artificial intelligence approaches have rapidly gained popularity in many subsurface energy applications. They are seen as novel methods that may enhance existing capabilities and improve efficiency in exploration and production operations. Furthermore, their integration into reservoir management workflows may shape the future landscape of the energy industry. This study implements a framework that generates predictive models using multiple classification-based machine learning algorithms that can identify specific stratigraphic units (i.e., formations) as a function of total vertical depth and spatial positioning. The framework is applied in a case study to 13 specific formations of interest (Upper Spraberry through Atoka/Morrow reservoirs) in the Midland Basin, West Texas, United States, a prominent hydrocarbon-producing sub-basin of the larger Permian Basin. The study dataset consists of over 275,000 records and includes data fields such as formation identifier, true vertical depth (in feet) of the formations observed, and latitude and longitude coordinates (in decimal degrees). A subset of 134,374 records relevant to the 13 distinct formations of interest was extracted and used for machine learning model training, validation, and testing. Four supervised learning approaches, namely random forest (RF), gradient boosting (GB), support vector machine (SVM), and multilayer perceptron neural network (MLP), were evaluated and their prediction accuracy compared. The best-performing model was ultimately built on the RF algorithm and achieves an overall prediction accuracy of 93 percent on holdout data. The RF-based model demonstrated high prediction accuracy for major oil- and gas-producing zones, including the San Andres, Upper Spraberry, Lower Spraberry, Clearfork, and Wolfcamp, at 98, 94, 89, 94, and 94 percent, respectively. Overall, the resulting data-driven model provides a robust, cost-effective approach that can complement contemporary reservoir management approaches for multiple subsurface energy applications.
Funding: Supported by the Czech Science Foundation (No. GACR13-08195S); the project Central Register of Research Intentions CEZMSM0021630528 Security-oriented Research in Information Technology; the specific research (No. FIT-S-11-2); the project RVO:67985815; the Technological agency of the Czech Republic (TACR) project V3C (No. TE01020415); and the Grant Agency of the Czech Republic-GACR P103/13/08195S.
Abstract: Advances in the technology of astronomical spectra acquisition have resulted in an enormous amount of data available in world-wide telescope archives. It is no longer feasible to analyze them using classical approaches, so a new astronomical discipline, astroinformatics, has emerged. We describe initial experiments in the investigation of spectral line profiles of emission line stars using machine learning, in an attempt to automatically identify Be and B[e] star spectra in large archives and classify their types. Because of the size of the spectra collections, dimension reduction techniques based on the wavelet transform are studied as well. The results show that machine learning is able to distinguish different shapes of line profiles even after drastic dimension reduction.
Abstract: Support vector machines and a Kalman-like observer are used for fault detection and isolation in a variable-speed horizontal-axis wind turbine composed of three blades and a full converter. The support vector approach is data-based and is therefore robust to limited process knowledge. It is based on structural risk minimization, which enhances generalization even with a small training data set, and it allows for process nonlinearity through flexible kernels. In this work, a radial basis function is used as the kernel. Different parts of the process are investigated, including actuator and sensor faults. With duplicated sensors, sensor faults in blade pitch positions and in generator and rotor speeds can be detected. Stuck-measurement faults can be detected in 2 sampling periods. The detection time for offset or scaled measurements depends on the severity of the fault and on the process dynamics when the fault occurs. The converter torque actuator fault can be detected within 2 sampling periods. Faults in the actuators of the pitch systems are more difficult to detect because they affect only the transitory state (which is very fast) and not the final stationary state. Therefore, two methods are considered and compared for detection and isolation of this fault: support vector machines and a Kalman-like observer. The advantages and disadvantages of each method are discussed. On the one hand, training support vector machines on transitory states would require a large amount of data from different situations, but the fault detection and isolation results are robust to variations in the input and operating point. On the other hand, the observer is model-based and therefore does not require training, and it allows identification of the fault level, which is useful for fault reconfiguration. However, the observability of the system is ensured only under specific conditions related to the dynamics of the inputs and outputs. The whole fault detection and isolation scheme is evaluated using a wind turbine benchmark with a real wind speed sequence.
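The radial basis function kernel mentioned in the abstract has the standard form k(x, y) = exp(-gamma * ||x - y||^2). The sketch below computes it for two feature vectors; the function name, the value of gamma, and the example vectors are assumptions, since the abstract does not give the kernel width or the features used.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    healthy = [0.10, 12.0]  # hypothetical (pitch residual, rotor speed) features
    faulty = [0.90, 12.1]
    print(rbf_kernel(healthy, healthy), rbf_kernel(healthy, faulty))
```

The kernel equals 1 for identical inputs and decays toward 0 as they move apart, which lets the SVM draw nonlinear boundaries between normal and faulty operating points.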
Funding: Supported by the National Natural Science Foundation of China (Nos. 61174103 and 61603032); the National Key Technologies R&D Program of China (No. 2015BAK38B01); the National Key Research and Development Program of China (No. 2017YFB0702300); the China Postdoctoral Science Foundation (No. 2016M590048); and the University of Science and Technology Beijing–Taipei University of Technology Joint Research Program (TW201705).
Abstract: The Extreme Learning Machine (ELM) is an effective learning algorithm for a Single-Layer Feedforward Network (SLFN). It performs well on some problems because of its fast learning speed. However, in practical applications, its performance might be affected by noise in the training data. To tackle the noise issue, we propose a novel heterogeneous ensemble of ELMs in this article. Specifically, correntropy is used to make performance insensitive to outliers, while Negative Correlation Learning (NCL) is implemented to enhance diversity within the ensemble. The proposed Heterogeneous Ensemble of ELMs (HE2LM) for classification comprises different ELM algorithms, including the Regularized ELM (RELM), the Kernel ELM (KELM), and the L2-norm-optimized ELM (ELML2). The ensemble is constructed by training a randomly selected ELM classifier on a subset of the training data chosen through random resampling. The class label of unseen data is then predicted using a maximum weighted sum approach. After splitting the training data into subsets, the proposed HE2LM is tested on classification and regression tasks using real-world benchmark datasets and synthetic datasets. The simulation results show that, compared with other algorithms, the proposed method achieves higher prediction accuracy, better generalization, and less sensitivity to outliers.
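The maximum weighted sum rule used to combine the ensemble's outputs can be sketched as follows. Each member casts a vote for a class, votes are scaled by a per-member weight, and the class with the largest total wins. How the weights are obtained is not stated in the abstract, so the weights, labels, and function name below are purely illustrative assumptions.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting members have the largest total weight."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


if __name__ == "__main__":
    member_labels = ["LTE", "GSM", "GSM"]  # hypothetical outputs of three ensemble members
    member_weights = [0.6, 0.2, 0.3]       # hypothetical reliability weights
    print(weighted_vote(member_labels, member_weights))
```

With these toy weights the single high-weight member outvotes the two low-weight ones, which a plain majority vote would not allow.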
Funding: Supported by the National Basic Research 973 Program of China under Grant No. 2009CB320505; the National Science and Technology Supporting Plan of China under Grant No. 2008BAH37B05; the National Natural Science Foundation of China under Grant No. 61170211; the Ph.D. Programs Foundation of Ministry of Education of China under Grant No. 20110002110056; and the National High Technology Research and Development 863 Program of China under Grant Nos. 2008AA01A303 and 2009AA01Z251.
Abstract: Network traffic anomalies are unusual changes in a network, so diagnosing anomalies is important for network management. Feature-based anomaly detection models (ab)normal network traffic behavior by analyzing packet header features. The PCA-subspace method (Principal Component Analysis) has been verified as an efficient feature-based approach to network-wide anomaly detection. Despite its power for network-wide traffic detection, the PCA-subspace method cannot be used effectively for detection on a single link. In this paper, unlike most works, which focus on flow-level traffic, we propose a new approach, B6-SVM, to detect anomalies in packet-level traffic on a single link, based on observations of six packet-level traffic features. The basic idea of B6-SVM is to diagnose anomalies in a multi-dimensional view of traffic features using a Support Vector Machine (SVM). Through two-phase classification, B6-SVM can detect anomalies with a high detection rate and a low false alarm rate. The test results demonstrate the effectiveness and potential of our technique in diagnosing anomalies. Further, compared with previous feature-based anomaly detection approaches, B6-SVM provides a framework for automatically identifying possible anomaly types. The framework of B6-SVM is generic, and we therefore expect the derived insights to be helpful for similar future research efforts.
Funding: Project of Technical Aspects of Monitoring the Acoustic Quality of Infrastructure in Road Transport (3714541000), commissioned by the German Federal Environment Agency and funded by the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety, Germany, within the Environmental Research Plan 2014.
Abstract: The condition of the road infrastructure has a severe impact on road safety, driving comfort, and rolling resistance. Therefore, the road infrastructure must be monitored comprehensively and at regular intervals to identify damaged road segments and road hazards. Methods based on vehicle sensors and supervised machine learning classification have been developed to digitize the road infrastructure comprehensively and automatically and to estimate the road quality. Since different types of vehicles have different suspension systems with different response functions, a classifier trained for one vehicle cannot be transferred directly to other vehicles. Usually, a large amount of time is needed to acquire training data for each individual vehicle and classifier. To address this problem, methods have been developed to collect training data automatically for new vehicles by comparing the trajectories of untrained and trained vehicles. The results show that the method based on a k-dimensional tree and Euclidean distance performs best and is robust in transferring road surface information from one vehicle to another. Furthermore, this method makes it possible to merge the output and road infrastructure information from multiple vehicles to enable a more robust and precise prediction of the ground truth.
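The matching step based on a k-dimensional tree and Euclidean distance can be sketched as a nearest-neighbor lookup: positions recorded by an already trained vehicle are indexed in a k-d tree, and each position of a new vehicle is matched to the closest indexed point so that its road-surface label can be reused. The data layout, function names, and coordinates below are assumptions for illustration and do not reproduce the authors' pipeline.

```python
def build_kdtree(points, depth=0):
    """Recursively build a k-d tree (as nested dicts) from a list of coordinate tuples."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the coordinate axes
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build_kdtree(ordered[:mid], depth + 1),
        "right": build_kdtree(ordered[mid + 1:], depth + 1),
    }


def squared_distance(a, b):
    """Squared Euclidean distance; the ordering matches the true distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or squared_distance(node["point"], target) < squared_distance(best, target):
        best = node["point"]
    diff = target[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Search the far side only if the splitting plane is closer than the current best.
    if diff ** 2 < squared_distance(best, target):
        best = nearest(far, target, best)
    return best


if __name__ == "__main__":
    trained_track = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]  # hypothetical positions
    tree = build_kdtree(trained_track)
    print(nearest(tree, (9, 2)))
```

The tree avoids comparing every new position against every stored one, which matters when trajectories contain many samples.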