Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52130805 and 51978516) and the Scientific Program of Shanghai Science and Technology Committee (Grant No. 20dz1202200).
Abstract: For a tunnel driven by a shield machine, the posture of the machine is essential to construction quality and environmental impact. However, the posture is controlled by an experienced shield driver who sets hundreds of tunneling parameters empirically. Machine learning (ML) is an alternative that lets the computer learn from the driver's operation and model the relationship between parameters automatically. Thus, in this paper, three ML algorithms, i.e., multi-layer perceptron (MLP), support vector machine (SVM), and gradient boosting regression (GBR), are improved by genetic algorithm (GA) and principal component analysis (PCA) to predict the tunneling posture of the shield machine. A set of shield tunneling parameters is extracted from a Shanghai metro construction site. In total, 53,785 pairwise data points are collected over about 373 d, and the ratio between the training, validation, and test sets is 3:1:1. Each pairwise data point includes 83 types of parameters covering the shield posture, construction parameters, and soil stratum properties at the same time. The test results show that the averaged R² values of the MLP-, SVM-, and GBR-based models are 0.942, 0.935, and 0.6, respectively. Automatic control of the shield posture is then illustrated with an application example of the proposed models. The proposed method is shown to be helpful in controlling construction quality with optimized construction parameters.
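The 3:1:1 split and the R² metric mentioned in the abstract can be illustrated with a minimal sketch. The function names `split_sizes` and `r2_score` are hypothetical helpers written for this illustration; the abstract does not say how the authors partitioned the data or computed the score, so integer division with the remainder assigned to the test set is an assumption.

```python
def split_sizes(n, ratio=(3, 1, 1)):
    """Return (train, validation, test) sizes for n samples under the given ratio.

    Assumption: integer division, with any remainder going to the test set.
    """
    total = sum(ratio)
    train = n * ratio[0] // total
    val = n * ratio[1] // total
    test = n - train - val
    return train, val, test


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Partition sizes for the 53,785 data points reported in the abstract.
sizes = split_sizes(53785)
print(sizes)

# R² on a small made-up example (not data from the paper).
print(r2_score([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
```

An R² close to 1 means the model explains most of the variance in the target, which is how the reported values of 0.942, 0.935, and 0.6 should be read.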
Funding: Supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (Grant No. RGPIN-2019-06471) and the McMaster University Engineering Life Event Fund.
Abstract: This study integrates different machine learning (ML) methods with 5-fold cross-validation (CV) to estimate the maximal ground surface settlement (MSS) induced by tunneling. We further investigate the applicability of artificial intelligence (AI) based prediction through a comparative study of two tunneling datasets with different sizes and features. Four ML approaches are used: support vector machine (SVM), random forest (RF), back-propagation neural network (BPNN), and deep neural network (DNN). Two techniques, i.e., particle swarm optimization (PSO) and grid search (GS), are adopted for hyperparameter optimization. To assess the reliability and efficiency of the predictions, three performance indicators are calculated: the mean absolute error (MAE), root mean square error (RMSE), and Pearson correlation coefficient (R). Our results indicate that the proposed models can predict the settlement accurately and efficiently, and that the RF model outperforms the other three methods on both datasets. The difference in model performance between the two datasets (Datasets A and B) reveals the importance of data quality and quantity. Sensitivity analysis indicates that Dataset A is more strongly affected by geological conditions, while geometric characteristics play a more dominant role in Dataset B.
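As an illustration of the evaluation setup named in the abstract, the sketch below generates 5-fold cross-validation index splits and computes MAE, RMSE, and Pearson R in plain Python. All function names are hypothetical, and the contiguous, unshuffled fold assignment is an assumption; the abstract does not describe how the folds were formed.

```python
import math


def k_fold_indices(n, k=5):
    """Yield (train_idx, val_idx) pairs so each sample is validated exactly once.

    Assumption: contiguous folds without shuffling.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Each of the 5 folds serves once as the validation set.
for train_idx, val_idx in k_fold_indices(12, k=5):
    print(len(train_idx), val_idx)
```

In a full workflow, a model would be fitted on `train_idx` and scored on `val_idx` for each fold, and the three metrics would be averaged across folds.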
Funding: Supported by three grants from the National Natural Science Foundation of China (Nos. 41821001, 41902315, 41930322).
Abstract: Within any scientific discipline, a large amount of data is buried in various literature repositories and archives, making it difficult to extract useful information manually from these data swamps. Machine-learning extraction of data is therefore necessary for big-data-based studies. Here, we develop a new text-mining technique to reconstruct the global database of Precambrian to Recent stromatolites, providing a better understanding of secular changes in stromatolites through geological time. The step-by-step data extraction process is as follows. First, the PDF documents of stromatolite-containing literature were collected and converted into text format. Second, a glossary and tag-labeling system using NLP (Natural Language Processing) software was employed to search for all possible candidate pairs in each sentence of the collected papers. Third, each candidate pair and its features were represented as a factor graph model, using a series of heuristic procedures to score the weight of each pair feature. Occurrence data of stromatolites versus stratigraphical units (abbreviated as Strata), facies types, locations, and ages worldwide were extracted from the literature; the extraction accuracies are 92%/464, 87%/778, 92%/846, and 93%/405, respectively, from 3,750 scientific abstracts, and 90%/1,734, 86%/2,869, 90%/2,055, and 91%/857, respectively, from 11,932 papers. A total of 10,072 unique data items were identified. The newly obtained stromatolite dataset demonstrates that stratigraphical occurrences reached a pronounced peak during the Proterozoic (2,500–541 Ma), followed by a distinct fall during the Early Phanerozoic and overall fluctuations through the Phanerozoic (541–0 Ma). Globally, seven stromatolite hotspots were identified from the new dataset: western United States, eastern United States, western Europe, India, South Africa, northern China, and southern China. The proportional occurrences of inland aquatic stromatolites remain rather low (~20%) in comparison to marine stromatolites from the Precambrian to the Jurassic, and then display a significant increase (30%–70%) from the Cretaceous to the present.
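The second extraction step, glossary-based tagging followed by candidate-pair generation within each sentence, can be sketched roughly as below. The glossary contents, tag names, and function names are all hypothetical stand-ins, and simple regular-expression matching replaces the NLP software the abstract refers to. The factor-graph scoring of the third step is not implemented here.

```python
import re

# Hypothetical mini-glossaries; a real system would use far larger term lists.
GLOSSARY = {
    "fossil": ["stromatolite", "stromatolites"],
    "age": ["Proterozoic", "Cretaceous", "Jurassic"],
    "location": ["India", "South Africa", "western Europe"],
}


def tag_sentence(sentence):
    """Return {tag: [glossary terms found]} for one sentence (whole-word match)."""
    tags = {}
    for tag, terms in GLOSSARY.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, sentence, flags=re.IGNORECASE):
                tags.setdefault(tag, []).append(term)
    return tags


def candidate_pairs(text):
    """Pair each fossil mention with each age/location mention in the same sentence."""
    pairs = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tags = tag_sentence(sentence)
        for fossil in tags.get("fossil", []):
            for tag in ("age", "location"):
                for term in tags.get(tag, []):
                    pairs.append((fossil, tag, term))
    return pairs


text = ("Stromatolites occur in the Proterozoic of India. "
        "The Jurassic section is barren.")
print(candidate_pairs(text))
```

Restricting pairs to a single sentence keeps the candidate set small; the second sentence yields no pairs because it contains no fossil term.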
Funding: Supported by the National Key R&D Program of China (Grant No. 2018YFB1702504), the National Natural Science Foundation of China (Grant Nos. 52179121, 51879284), the State Key Laboratory of Simulations and Regulation of Water Cycle in River Basin, China (Grant No. SKL2022ZD05), the IWHR Research & Development Support Program, China (Grant No. GE0145B012021), the Natural Science Foundation of Shaanxi Province, China (Grant No. 2021JLM-50), and the National Key R&D Program of China (Grant No. 2022YFE0200400).
Abstract: This review summarizes the research outcomes and findings documented in 45 journal papers that used a shared tunnel boring machine (TBM) dataset for performance prediction and boring efficiency optimization with machine learning methods. The big dataset was collected during construction of the Yinsong water diversion project in China, covering the excavation of a 20 km tunnel section with 199 monitoring metrics recorded at an interval of one second. The research papers resulted from a call for contributions during a TBM machine learning contest in 2019 and cover a variety of topics related to the intelligent construction of TBM tunnels. This review comprises two parts. Part I is concerned with the data processing, feature extraction, and machine learning methods applied by the contributors. The review finds that the data-driven and knowledge-driven approaches used by various authors to extract important features are diverse, and that further studies are required to reach commonly accepted criteria. The techniques adopted by the contributors for cleaning and amending the raw data are summarized, with highlights including the importance of a sufficiently high data acquisition frequency (more frequent than once per second), classification and standardization in data preprocessing, and the appropriate selection of features in a boring cycle. The review finds that both supervised and unsupervised machine learning methods have been used by various researchers, and that ensemble and deep learning methods have found wide application. Part I highlights the important features of the individual methods applied by the contributors, including the structure of the algorithms, the selection of hyperparameters, and the model validation approaches.
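Two of the preprocessing ideas mentioned in the abstract, isolating boring cycles in a one-second time series and standardizing a feature, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the contributors delimited cycles, so the thresholding rule, the `min_length` filter, and both function names are hypothetical.

```python
import statistics


def split_boring_cycles(signal, threshold=0.0, min_length=3):
    """Return (start, end) index pairs of runs where signal > threshold.

    Assumption: a boring cycle is a run of at least min_length consecutive
    samples above the threshold; shorter runs are treated as noise.
    """
    cycles, current = [], []
    for i, value in enumerate(signal):
        if value > threshold:
            current.append(i)
        else:
            if len(current) >= min_length:
                cycles.append((current[0], current[-1]))
            current = []
    if len(current) >= min_length:
        cycles.append((current[0], current[-1]))
    return cycles


def standardize(values):
    """Z-score standardization: zero mean, unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Made-up one-second samples of a hypothetical advance-rate signal.
signal = [0, 0, 5, 6, 7, 0, 0, 4, 0, 8, 9, 9, 9]
print(split_boring_cycles(signal))
print(standardize([1.0, 2.0, 3.0]))
```

Features for a learning model would then be computed per cycle, for example as averages over each `(start, end)` window.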