Funding: Support from the National Program on Key Basic Research Project of China (973 Program, No. 2015CB058100) and China Railway Engineering Equipment Group Corporation; supported by the Key Research Project of China Institute of Water Resources and Hydropower Research Limited (Grant Nos. HTGE0203A03201900000, HTGE0203A20202000000) and the Natural Science Foundation of Shaanxi Province (Grant Nos. 2019JLZ-13, 2019JLP-23).
Abstract: This paper addresses the significance of preprocessing big data collected during tunnel boring machine (TBM) excavation before it is used for machine learning on various TBM performance predictions. The research work is based on two water diversion tunneling projects that cover 29.52 km and 17,051 boring cycles. It has been found that the penetration rate calculated from the raw measured penetration distances exhibits more random behavior owing to the percussive and vibratory behavior of the cutterhead. A moving average method to process the negative instantaneous velocities and a noise reduction filter to deal with signals with abnormal frequencies are recommended. An index called the drilling efficiency index is introduced to assess the relationships between the mechanical parameters in a boring cycle; its linear-regression coefficient of determination R^2 is used for a preliminary investigation of possible problems requiring preprocessing. The research work defines irrelevant data as data whose errors are caused by human or mechanical mistakes and which should therefore be cleaned or amended. These irrelevant data can be divided into five categories: (1) premature cycles, (2) sensor defects, (3) mechanical defects, (4) human interruption, and (5) missing files. A program, TBM-Processing, has been coded for the recognition and classification of these categories. PDF books generated by the program have been uploaded to GitHub to encourage discussion, collaboration, and upgrading of the data processing work with our peers.
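The two preprocessing ideas named in the abstract, a moving average over instantaneous velocities and an R^2 check on a linear fit, can be illustrated with a minimal sketch. The abstract does not give the window length, the sampling interval, or the definition of the drilling efficiency index, so the function names, the window of 3 samples, and the sample data below are all hypothetical and are not taken from TBM-Processing.

```python
from statistics import mean


def penetration_rates(distances, dt):
    """Instantaneous rate from successive penetration distances (finite difference)."""
    return [(b - a) / dt for a, b in zip(distances, distances[1:])]


def moving_average(values, window):
    """Centered moving average; the window shrinks near both ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def r_squared(x, y):
    """R^2 of an ordinary least-squares straight-line fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate series: no linear relationship can be measured
    return (sxy * sxy) / (sxx * syy)


# Hypothetical penetration distances (mm) sampled once per second (assumed).
distances = [0.0, 2.0, 1.5, 4.0, 3.8, 6.0, 8.1, 7.9, 10.0]
raw = penetration_rates(distances, dt=1.0)   # contains negative values
smooth = moving_average(raw, window=3)       # assumed window length
print("negative raw samples:", sum(r < 0 for r in raw))
print("negative smoothed samples:", sum(s < 0 for s in smooth))
```

In this sketch a low R^2 between two parameter series of a boring cycle would only flag the cycle for closer inspection; which parameters to pair and what threshold to use are left open because the abstract does not specify them.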
Funding: Supported by the National Basic Research 973 Program of China under Grant No. 2009CB320505; the National Science and Technology Supporting Plan of China under Grant No. 2008BAH37B05; the National Natural Science Foundation of China under Grant No. 61170211; the Ph.D. Programs Foundation of Ministry of Education of China under Grant No. 20110002110056; and the National High Technology Research and Development 863 Program of China under Grant Nos. 2008AA01A303 and 2009AA01Z251.
Abstract: Network traffic anomalies are unusual changes in a network, so diagnosing anomalies is important for network management. Feature-based anomaly detection models (ab)normal network traffic behavior by analyzing packet header features. The PCA-subspace method (Principal Component Analysis) has been verified as an efficient feature-based approach to network-wide anomaly detection. Despite its powerful ability in network-wide traffic detection, the PCA-subspace method cannot be used effectively for detection on a single link. In this paper, unlike most works, which focus on detection in flow-level traffic, we propose a new approach, B6-SVM, based on observations of six traffic features of packet-level traffic, to detect anomalies in packet-level traffic on a single link. The basic idea of B6-SVM is to diagnose anomalies in a multi-dimensional view of traffic features using a Support Vector Machine (SVM). Through two-phase classification, B6-SVM can detect anomalies with a high detection rate and a low false alarm rate. The test results demonstrate the effectiveness and potential of our technique in diagnosing anomalies. Further, compared to previous feature-based anomaly detection approaches, B6-SVM provides a framework to automatically identify possible anomaly types. The framework of B6-SVM is generic, and we therefore expect the derived insights to be helpful for similar future research efforts.