Funding: Supported by the National Natural Science Foundation of China (41977215).
Abstract: Long runout landslides involve a massive amount of energy and can be extremely hazardous owing to their long movement distance, high mobility and strong destructive power. Numerical methods have been widely used to predict landslide runout, but a fundamental problem remains: how to determine reliable numerical parameters. This study proposes a framework to predict the runout of potential landslides through multi-source data collaboration and numerical analysis of historical landslide events. Specifically, for historical landslide cases, the landslide-induced seismic signal, geophysical surveys, and possible in-situ drone/phone videos (multi-source data collaboration) can validate the numerical results in terms of landslide dynamics and deposit features and help calibrate the numerical (rheological) parameters. Subsequently, the calibrated numerical parameters can be used to numerically predict the runout of potential landslides in a region with a geological setting similar to the recorded events. Application of the runout prediction approach to the 2020 Jiashanying landslide in Guizhou, China gives reasonable results in comparison to the field observations. The numerical parameters were determined from the multi-source data collaboration analysis of a historical case in the region (the 2019 Shuicheng landslide). The proposed framework for landslide runout prediction can be of great utility for landslide risk assessment and disaster reduction in mountainous regions worldwide.
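To illustrate how calibrated rheological parameters feed a runout estimate, here is a minimal single-block sketch of a Voellmy-type friction model in Python; the two-segment slope geometry and all parameter values are illustrative assumptions, not the numerical model used in the study.

```python
import numpy as np

def voellmy_runout(mu, xi, h=2.0, slope_len=500.0,
                   theta_up=25.0, theta_out=5.0, dt=0.01, g=9.81):
    """Single sliding block on a steep source slope followed by a gentler
    runout zone. mu: Coulomb friction (calibrated on a historical case);
    xi: turbulent friction coefficient (m/s^2); h: assumed flow depth (m)."""
    x, v = 0.0, 0.0
    while True:
        theta = np.radians(theta_up if x < slope_len else theta_out)
        # gravity driving term minus Coulomb and velocity-squared resistance
        a = g * (np.sin(theta) - mu * np.cos(theta)) - g * v * v / (xi * h)
        v += a * dt
        if v <= 0.0 and x > 0.0:
            return x                      # block has stopped: total runout
        x += max(v, 0.0) * dt

# parameters as if calibrated by back-analysis of a historical event
print(voellmy_runout(mu=0.2, xi=500.0))
```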
Funding: This work is funded by the National Natural Science Foundation of China (Grant Nos. 42377164 and 52079062) and the National Science Fund for Distinguished Young Scholars of China (Grant No. 52222905).
Abstract: In existing landslide susceptibility prediction (LSP) models, the influence of random errors in landslide conditioning factors on LSP is not considered; instead, the original conditioning factors are directly taken as the model inputs, which brings uncertainties to LSP results. This study aims to reveal how different proportions of random error in conditioning factors affect LSP uncertainties, and further to explore a method that can effectively reduce the random errors in conditioning factors. The original conditioning factors are first used to construct original factors-based LSP models, and then random errors of 5%, 10%, 15% and 20% are added to these original factors to construct the corresponding errors-based LSP models. Secondly, low-pass filter-based LSP models are constructed by eliminating the random errors using the low-pass filter method. Thirdly, Ruijin County, China, with 370 landslides and 16 conditioning factors, is used as the study case. Three typical machine learning models, i.e. multilayer perceptron (MLP), support vector machine (SVM) and random forest (RF), are selected as LSP models. Finally, the LSP uncertainties are discussed, and the results show that: (1) The low-pass filter can effectively reduce the random errors in conditioning factors and thereby decrease the LSP uncertainties. (2) As the proportion of random errors increases from 5% to 20%, the LSP uncertainty increases continuously. (3) The original factors-based models are feasible for LSP in the absence of more accurate conditioning factors. (4) The influence of the two uncertainty sources, the choice of machine learning model and the proportion of random errors, on LSP modeling is large and roughly equal. (5) Shapley values effectively explain the internal mechanism by which machine learning models predict landslide susceptibility. In conclusion, a greater proportion of random errors in conditioning factors results in higher LSP uncertainty, and the low-pass filter can effectively reduce these random errors.
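A minimal sketch of the two data operations the study describes: adding proportional random errors to a conditioning factor and then removing them with a low-pass filter. The Butterworth design, cutoff and toy factor series are assumptions, since the paper's exact filter is not specified here.

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(0)

def add_proportional_noise(factor, proportion):
    """Add zero-mean random errors scaled to a proportion of each value."""
    return factor * (1.0 + proportion * rng.standard_normal(factor.shape))

def lowpass(factor, cutoff=0.1, order=4):
    """Attenuate high-frequency random errors with a Butterworth filter."""
    b, a = butter(order, cutoff)      # cutoff as a fraction of Nyquist
    return filtfilt(b, a, factor)     # zero-phase filtering

elevation = np.cumsum(rng.standard_normal(1000))   # toy conditioning factor
noisy = add_proportional_noise(elevation, 0.10)    # 10% random error
denoised = lowpass(noisy)
print(np.mean((noisy - elevation) ** 2), np.mean((denoised - elevation) ** 2))
```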
Funding: Financially supported by the National Key Research and Development Program of China (2022YFB3706800, 2020YFB1710100) and the National Natural Science Foundation of China (51821001, 52090042, 52074183).
Abstract: The complex sand-casting process, combined with interactions between process parameters, makes it difficult to control casting quality, resulting in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency, which includes a random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, covering four types of defects and the corresponding process parameters, were used to construct the RF model. Classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of the various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, the model serves as the evaluator within the Monte Carlo method to estimate a better temperature distribution. The prediction model, when applied in the factory, greatly improved the efficiency of defect detection. Results show that the scrap rate decreased from 10.16% to 6.68%.
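The classification-plus-optimization loop can be sketched with scikit-learn as follows: an RF classifier supplies Gini-based feature importances, and a simple Monte Carlo search scores candidate parameter settings by predicted defect probability. The toy data, the parameter count and the class index for gas porosity are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
# Toy stand-ins: 500 castings, 6 process parameters, 4 defect classes
X = rng.normal(size=(500, 6))
y = rng.integers(0, 4, size=500)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Gini-based importance of each process parameter, as in the paper
for name, imp in zip([f"param_{i}" for i in range(6)], rf.feature_importances_):
    print(name, round(imp, 3))

# Monte Carlo search: sample candidate parameter settings and keep the one
# with the lowest predicted probability of gas porosity (class 0 here)
candidates = rng.normal(size=(10_000, 6))
p_defect = rf.predict_proba(candidates)[:, 0]
best = candidates[np.argmin(p_defect)]
```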
Funding: Financed by the Basic Scientific Research Youth Program of the Education Department of Liaoning Province (No. LJKQZ2021185) and the Yingkou Enterprise and Doctor Innovation Program (QB-2021-05).
Abstract: For milling tool life prediction and health management, accurate extraction and dimensionality reduction of tool wear features are the key to reducing prediction errors. In this paper, we adopt multi-source information fusion technology to extract and fuse the features of the cutting vibration signal, cutting force signal and acoustic emission signal in the time domain, frequency domain and time-frequency domain, and reduce the dimensionality of the sample features with the Pearson correlation coefficient to construct a sample data set. We then propose a tool life prediction model based on a CNN-SVM architecture optimized by a genetic algorithm (GA), which uses a convolutional neural network (CNN) as the feature learner and a support vector machine (SVM) as the regressor. The results show that the improved model can effectively predict tool life, with better generalization ability, faster network fitting, and 99.85% prediction accuracy. Compared with the BP, CNN, SVM and CNN-SVM models, the coefficient of determination (R²) improved by 4.88%, 2.96%, 2.53% and 1.34%, respectively.
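The Pearson-based dimensionality reduction step might look like the following sketch; the threshold of 0.5 and the toy fused-feature matrix are assumptions for illustration.

```python
import numpy as np

def select_by_pearson(features, target, threshold=0.5):
    """Keep features whose |Pearson r| with the target exceeds a threshold.

    features : (n_samples, n_features) fused time/frequency/time-frequency
               domain statistics;  target : (n_samples,) tool wear values.
    """
    r = np.array([np.corrcoef(features[:, j], target)[0, 1]
                  for j in range(features.shape[1])])
    keep = np.abs(r) >= threshold
    return features[:, keep], keep

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 48))                      # toy fused feature matrix
wear = X[:, 0] * 2 + rng.normal(size=200) * 0.5     # toy wear signal
X_red, mask = select_by_pearson(X, wear)
print(X.shape, "->", X_red.shape)
```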
Funding: Supported by the National Natural Science Foundation of China (22178190).
Abstract: In response to the lack of reliable physical parameters in process simulation of butadiene extraction, a large amount of phase equilibrium data was collected in the context of the actual process of butadiene production via acetonitrile. The accuracy of five prediction methods applied to the butadiene extraction process, UNIFAC (UNIQUAC Functional-group Activity Coefficients), UNIFAC-LL, UNIFAC-LBY, UNIFAC-DMD and COSMO-RS, was verified using partial phase equilibrium data. The results showed that the UNIFAC-DMD method had the highest accuracy in predicting phase equilibrium data for the missing systems. COSMO-RS predictions for multiple systems also showed good accuracy, and a large number of missing phase equilibrium data were estimated using the UNIFAC-DMD and COSMO-RS methods. The predicted phase equilibrium data were checked for consistency. The NRTL-RK (Non-Random Two-Liquid/Redlich-Kwong equation of state) and UNIQUAC thermodynamic models were used to correlate the phase equilibrium data. Industrial device simulations were used to verify the accuracy of the thermodynamic model applied to the butadiene extraction process. The simulation results showed that the average deviation of the results obtained with the correlated thermodynamic model from the actual plant values was less than 2%, much smaller than the deviation (>10%) of simulations using the built-in database of the commercial simulation software Aspen Plus, indicating that the obtained phase equilibrium data are highly accurate and reliable. The best phase equilibrium data and thermodynamic model parameters for butadiene extraction are provided. This improves the accuracy and reliability of the design, optimization and control of the process, and provides a basis and guarantee for developing a more environmentally friendly and economical butadiene extraction process.
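As a flavor of the correlation step, the sketch below evaluates the liquid-phase activity coefficients of a binary pair with the standard NRTL equations; the interaction parameters shown are hypothetical, not the fitted values reported for the butadiene/acetonitrile systems.

```python
import numpy as np

def nrtl_binary(x1, tau12, tau21, alpha=0.3):
    """Activity coefficients of a binary mixture from the NRTL model.

    tau12, tau21 : dimensionless interaction parameters (fitted to
    phase-equilibrium data); alpha is the non-randomness factor.
    """
    x2 = 1.0 - x1
    G12 = np.exp(-alpha * tau12)
    G21 = np.exp(-alpha * tau21)
    ln_g1 = x2**2 * (tau21 * (G21 / (x1 + x2 * G21))**2
                     + tau12 * G12 / (x2 + x1 * G12)**2)
    ln_g2 = x1**2 * (tau12 * (G12 / (x2 + x1 * G12))**2
                     + tau21 * G21 / (x1 + x2 * G21)**2)
    return np.exp(ln_g1), np.exp(ln_g2)

# hypothetical parameters for a 1,3-butadiene/acetonitrile pair
print(nrtl_binary(x1=0.4, tau12=0.8, tau21=1.2))
```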
Abstract: Reservoir identification and production prediction are two of the most important tasks in petroleum exploration and development. Machine learning (ML) methods have been used in petroleum-related studies, but have not previously been applied to production prediction built on reservoir identification. Production forecasting studies are typically based on overall reservoir thickness and lack accuracy when reservoirs contain a water or dry layer without oil production. In this paper, a systematic ML workflow was developed using classification models for reservoir identification and regression models for production prediction, with the production models built on the reservoir identification results. For reservoir identification, seven optimized ML methods were used: four typical single ML methods and three ensemble ML methods. These methods classify the reservoir into five types of layers: water, dry and three levels of oil (I, II and III oil layers). The validation and test results of these seven optimized ML methods suggest that the three ensemble methods perform better than the four single ML methods in reservoir identification, with XGBoost producing the model with the highest accuracy, up to 99%. The effective thickness of the I and II oil layers determined during reservoir identification was fed into the models for predicting production. Effective thickness accounts for the distribution of water and oil, resulting in more reasonable production predictions than those based on overall reservoir thickness. To validate the superiority of the ML methods, reference models using overall reservoir thickness were built for comparison. The models based on effective thickness outperformed the reference models in every evaluation metric, with prediction accuracy about 10% higher than that of the reference models. Free of the personal error and data distortion present in traditional methods, this system enables rapid analysis of data while reducing the time required to resolve reservoir classification and production prediction challenges. The ML models using the effective thickness obtained from reservoir identification were more accurate in predicting oil production than previous studies that use overall reservoir thickness.
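The two-stage workflow, classify layers first and then regress production on the effective thickness of the I and II oil layers, can be sketched as below; the log features, interval thicknesses and class encoding are toy assumptions.

```python
import numpy as np
from xgboost import XGBClassifier, XGBRegressor

rng = np.random.default_rng(3)
# Stage 1: classify each logged interval into one of five layer types
# 0: water, 1: dry, 2: oil I, 3: oil II, 4: oil III
logs = rng.normal(size=(2000, 8))              # toy well-log features
layer = rng.integers(0, 5, size=2000)
clf = XGBClassifier(n_estimators=200).fit(logs, layer)

# Stage 2: effective thickness = summed thickness of intervals predicted
# as oil layers I or II, used as the production-model input
thick = rng.uniform(0.2, 2.0, size=2000)       # toy interval thicknesses
well_id = rng.integers(0, 50, size=2000)
pred = clf.predict(logs)
eff = np.zeros(50)
for w in range(50):
    m = (well_id == w) & np.isin(pred, [2, 3])
    eff[w] = thick[m].sum()

prod = eff * 3.0 + rng.normal(size=50)         # toy production target
reg = XGBRegressor(n_estimators=200).fit(eff.reshape(-1, 1), prod)
```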
Abstract: Defect prediction plays a significant role in improving software quality. Such predictions are used to identify defective modules before testing and to minimize time and cost. Software with defects negatively impacts operational costs and ultimately affects customer satisfaction. Numerous approaches exist to predict software defects, but making predictions that are both timely and accurate remains a major challenge. To improve timely and accurate software defect prediction, a novel technique called nonparametric statistical feature scaled quadratic regressive convolution deep neural network (SQADEN) is introduced. The proposed SQADEN technique comprises two major processes, namely metric (feature) selection and classification. First, SQADEN uses the nonparametric statistical Torgerson-Gower scaling technique to identify the relevant software metrics, measuring their similarity with the Dice coefficient; this feature selection step minimizes the time complexity of software fault prediction. With the selected metrics, software fault prediction is performed using quadratic censored regressive convolution deep neural network-based classification. The deep learning classifier analyzes the training and testing samples using the contingency correlation coefficient, and a softstep activation function provides the final fault prediction results. To minimize the error, the Nelder-Mead method is applied to solve the non-linear least-squares problem, so that accurate classification results with minimum error are obtained at the output layer. Experimental evaluation is carried out with different quantitative metrics such as accuracy, precision, recall, F-measure, and time complexity. The results demonstrate the superior performance of the proposed SQADEN technique, improving accuracy, sensitivity and specificity by 3%, 3%, 2% and 3% and reducing time and space by 13% and 15% compared with two state-of-the-art methods.
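The Dice coefficient used in the metric-selection step is straightforward; a minimal sketch on binarized metric vectors:

```python
import numpy as np

def dice_coefficient(a, b):
    """Dice similarity between two binarized software-metric vectors."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

m1 = np.array([1, 0, 1, 1, 0, 1])
m2 = np.array([1, 0, 1, 0, 0, 1])
print(dice_coefficient(m1, m2))   # 2*3 / (4+3) ≈ 0.857
```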
Abstract: This article explores the comparison between the probability method and the least squares method in the design of linear predictive models. It points out that these two approaches have distinct theoretical foundations and can lead to varied or similar results in terms of precision and performance under certain assumptions. The article underlines the importance of comparing the two approaches in order to choose the one best suited to the context, the available data and the modeling objectives.
Abstract: The development of prediction supports is a critical step in information systems engineering in this era defined by the knowledge economy, the hub of which is big data. Currently, the lack of a predictive model, whether qualitative or quantitative, suited to a company's areas of intervention can handicap or weaken its competitive capacities, endangering its survival. For quantitative prediction, a variety of methods and tools are available depending on the efficacy criteria; multiple linear regression is one of them. A linear regression model regresses an explained variable on one or more explanatory variables through a function that is linear in its parameters. The purpose of this work is to demonstrate how to use multiple linear regression, one aspect of decisional mathematics. Applying multiple linear regression to random data, which can be replaced by real data collected by or from organizations, provides decision makers with reliable knowledge; in this way, machine learning methods can supply decision makers with relevant and trustworthy information. The main goal of this article is therefore to define the objective function whose influencing factors will be determined for optimization using the linear regression method.
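A minimal sketch of the approach on random stand-in data (as the article suggests, real organizational data can be substituted): fit the linear objective function by ordinary least squares and report R².

```python
import numpy as np

rng = np.random.default_rng(4)
# Random stand-in data: replace with factors collected by the organization
X = rng.normal(size=(120, 3))                  # three explanatory variables
beta_true = np.array([2.0, -1.0, 0.5])
y = 4.0 + X @ beta_true + rng.normal(scale=0.3, size=120)

# Fit y = b0 + b1*x1 + b2*x2 + b3*x3 by ordinary least squares
A = np.column_stack([np.ones(len(X)), X])      # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print("intercept and coefficients:", coef.round(3))

pred = A @ coef                                # the fitted objective function
print("R^2:", 1 - np.sum((y - pred)**2) / np.sum((y - y.mean())**2))
```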
Funding: The Major Scientific and Technological Special Project of the Jiangsu Provincial Communications Department (No. 2011Y/02-G1).
Abstract: In order to support scientific pavement maintenance decisions, a grey-theory-based prediction framework is proposed to predict pavement performance. Based on field pavement rutting data, analysis of variance (ANOVA) was first used to study the influence of different factors on pavement rutting. Cluster analysis was then employed to investigate the rutting development trend. Based on the clustering results, grey theory was applied to build a pavement rutting model for each cluster, which effectively reduces the complexity of the predictive model. The results show that axle load and asphalt binder type play important roles in rutting development. The prediction model is capable of capturing the uncertainty in the pavement performance prediction process, meets the requirements of highway pavement maintenance and therefore has wide application prospects.
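The grey model at the core of such a framework is typically GM(1,1); below is a minimal sketch fitted to a toy rutting series (the data are illustrative, not the field measurements used in the study).

```python
import numpy as np

def gm11(x0, horizon=3):
    """Fit a GM(1,1) grey model to a short series and forecast ahead."""
    n = len(x0)
    x1 = np.cumsum(x0)                                  # accumulated series
    z1 = 0.5 * (x1[:-1] + x1[1:])                       # background values
    B = np.column_stack([-z1, np.ones(n - 1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]    # develop/grey coeffs
    k = np.arange(n + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a   # time response
    x0_hat = np.diff(x1_hat, prepend=x1_hat[0])         # de-accumulate
    x0_hat[0] = x0[0]
    return x0_hat[n:]                                   # forecasts only

rut_depth = np.array([3.1, 4.0, 4.8, 5.5, 6.1])         # toy rut depths, mm
print(gm11(rut_depth))
```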
Funding: Project funded by Chongqing Changjiang Electrical Appliances Industries Group Co., Ltd.
Abstract: In order to forecast projectile impact points quickly and accurately, a projectile impact point prediction method based on a generalized regression neural network (GRNN) is presented. First, the GRNN model for forecasting the impact point is established; second, the particle swarm optimization (PSO) algorithm is used to optimize the smoothing factor in the prediction model, yielding the optimal GRNN impact point prediction model. Finally, a numerical simulation of the prediction model is carried out. Simulation results show that the maximum range error is no more than 40 m and the lateral deviation error is less than 0.2 m. The average time for impact point prediction is 6.645 ms, which is 1300.623 ms less than that of the numerical integration method. The proposed method is therefore feasible and effective for forecasting projectile impact points, and can provide a theoretical reference for practical engineering applications.
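A GRNN is a kernel-weighted average of training targets with a single smoothing factor to tune, the quantity the paper optimizes with PSO. A minimal sketch on toy launch-condition data:

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma):
    """General regression neural network: Gaussian-kernel-weighted average
    of training targets. sigma is the smoothing factor; the paper tunes it
    with particle swarm optimization, here it is simply passed in."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(5)
# Toy stand-in: launch conditions -> impact range (m)
X = rng.uniform(size=(300, 4))          # e.g. muzzle velocity, angles, wind
y = 9000 + 800 * X[:, 0] - 300 * X[:, 2] + rng.normal(scale=5, size=300)
print(grnn_predict(X, y, X[:3], sigma=0.15), y[:3])
```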
Funding: Supported by the Ningxia Natural Science Foundation (NZ1024) and the Scientific Research Project of Ningxia Universities (201027).
Abstract: [Objective] To discuss the effects of the major mapping methods for DNA sequences on the accuracy of protein coding region prediction, and to identify the effective mapping methods. [Method] Taking the Approximate Correlation (AC) as the overall measure of prediction accuracy at the nucleotide level, the windowed narrow pass-band filter (WNPBF)-based prediction algorithm was applied to study the effects of different mapping methods on prediction accuracy. [Result] On the DNA data sets ALLSEQ and HMR195, the Voss and Z-curve methods proved to be more effective mapping methods than the paired numeric (PN), electron-ion interaction potential (EIIP) and complex number methods. [Conclusion] This study lays a foundation for verifying the effectiveness of new mapping methods using the predicted AC value, and it is meaningful for revealing DNA structure using bioinformatics methods.
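The two mappings the study found most effective are easy to reproduce; a minimal sketch follows (the windowed pass-band filtering of the period-3 component is omitted).

```python
import numpy as np

def voss(seq):
    """Voss mapping: four binary indicator sequences, one per nucleotide."""
    return np.array([[1 if s == b else 0 for s in seq] for b in "ACGT"])

def z_curve(seq):
    """Z-curve mapping: three cumulative components
    x=(A+G)-(C+T), y=(A+C)-(G+T), z=(A+T)-(C+G)."""
    delta = {"A": (1, 1, 1), "G": (1, -1, -1),
             "C": (-1, 1, -1), "T": (-1, -1, 1)}
    return np.cumsum([delta[s] for s in seq], axis=0).T

seq = "ATGGCAT"
print(voss(seq))      # 4 x N indicator matrix, input to the band-pass filter
print(z_curve(seq))   # 3 x N numeric representation
```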
Funding: Under the auspices of the Natural Science Foundation of China (No. 41971166).
Abstract: The urban functional area (UFA) is a core scientific issue affecting urban sustainability. The current knowledge gap is mainly the lack of multi-scale quantitative interpretation methods from the perspective of human-land interaction. In this paper, based on multi-source big data, including 250 m × 250 m resolution cell phone data, 1.81 × 10⁵ Points of Interest (POI) records and administrative boundary data, we built a UFA identification method and demonstrated it empirically in Shenyang City, China. We argue that this method can effectively identify multi-scale, multi-type UFAs based on human activity and further reveal the spatial correlation between urban facilities and human activity. The empirical study suggests that the employment functional zones in Shenyang City are more concentrated in the central city than other single functional zones. There are more mixed functional areas in the central city, while the planned industrial new cities in Shenyang still need to develop comprehensive functions. UFAs exhibit scale effects and human-land interaction patterns. We suggest that city decision makers apply multi-source big data to measure urban functional services in a more refined manner from a supply-demand perspective.
Funding: Funded by the Natural Science Foundation of China (Grant Nos. 41807285, 41972280 and 52179103).
Abstract: To perform landslide susceptibility prediction (LSP), it is important to select an appropriate mapping unit and landslide-related conditioning factors. The efficient and automatic multi-scale segmentation (MSS) method proposed by the authors promotes the application of slope units. However, LSP modeling based on these slope units has not yet been performed. Moreover, the heterogeneity of conditioning factors within slope units is neglected, leaving the input variables of LSP modeling incomplete. In this study, the slope units extracted by the MSS method are used for LSP modeling, and the heterogeneity of conditioning factors is represented by their internal variations within each slope unit using the descriptive statistics of mean, standard deviation and range. Thus, slope unit-based machine learning models considering the internal variations of conditioning factors (variant Slope-machine learning models) are proposed. Chongyi County is selected as the case study and is divided into 53,055 slope units. Fifteen original slope unit-based conditioning factors are expanded to 38 by considering their internal variations. Random forest (RF) and multi-layer perceptron (MLP) models are used to construct the variant Slope-RF and Slope-MLP models. Meanwhile, Slope-RF and Slope-MLP models without the internal variations of conditioning factors, as well as conventional grid unit-based machine learning models (Grid-RF and Grid-MLP), are built for comparison through LSP performance assessments. Results show that the variant Slope-machine learning models achieve higher LSP performance than the plain Slope-machine learning models, and their LSP results show stronger spatial directivity and practical applicability than those of the Grid-machine learning models. It is concluded that slope units extracted by the MSS method are appropriate for LSP modeling, and that the heterogeneity of conditioning factors within slope units more comprehensively reflects the relationships between conditioning factors and landslides. The results provide an important reference for land use planning and landslide prevention.
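The factor-expansion step can be sketched with a groupby aggregation: each factor's per-unit mean, standard deviation and range become separate model inputs. The toy cell table below is an assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
# Toy grid cells: each belongs to one slope unit and carries factor values
cells = pd.DataFrame({
    "slope_unit": rng.integers(0, 100, size=5000),
    "elevation": rng.normal(500, 120, size=5000),
    "slope_deg": rng.uniform(0, 45, size=5000),
})

def unit_stats(df, factor):
    """Mean / std / range of one conditioning factor within each slope unit."""
    g = df.groupby("slope_unit")[factor]
    return pd.DataFrame({
        f"{factor}_mean": g.mean(),
        f"{factor}_std": g.std(),
        f"{factor}_range": g.max() - g.min(),
    })

features = pd.concat([unit_stats(cells, "elevation"),
                      unit_stats(cells, "slope_deg")], axis=1)
print(features.head())   # 2 original factors expanded to 6 model inputs
```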
Funding: Supported by the Key Projects of the Natural Science Foundation of China (No. 41931284) and the Scientific Research Start-Up Fund for High-Level Introduced Talents of Anhui University of Science and Technology (No. 2022yjrc21).
Abstract: When the original key stratum theory is used to predict the height of the water-flowing fractured zone (WFZ), the influence of rock strata outside the calculation range on the strata within it, as well as the fact that the shape of the overburden deformation area changes with the excavation length, are ignored. In this paper, an improved key stratum theory (IKS theory) is proposed by fixing these two shortcomings, and a WFZ height prediction method based on IKS theory is established and applied. First, the range of overburden involved in the analysis is determined according to the tensile stress distribution range above the goaf. Second, the key stratum within this range is identified through IKS theory. Finally, the tendency of the WFZ to develop upward is determined by judging whether or not the identified key stratum will break. The proposed method was applied and verified in a mining case study, and it also fully explains the differences in WFZ development patterns between coalfields in Northwest and East China.
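For orientation, the sketch below implements the classical key stratum load criterion that IKS theory builds on, flagging a new hard (potential key) stratum where the computed load on the base stratum decreases; the improved tensile-stress-bounded analysis range and the breaking judgment of IKS theory are not reproduced here, and the stratum properties are illustrative.

```python
import numpy as np

def hard_strata(E, h, gamma):
    """Classical key stratum discrimination.

    E, h, gamma : Young's modulus (GPa), thickness (m) and unit weight
    (kN/m^3) of each stratum, ordered upward from the seam roof. The load
    of strata base..n-1 on the base stratum is
        q = E_b h_b^3 * sum(gamma_i h_i) / sum(E_i h_i^3);
    stratum n-1 starts a new hard stratum when q drops below its previous value.
    """
    E, h, gamma = map(np.asarray, (E, h, gamma))
    hard, base = [0], 0
    while base < len(E) - 1:
        q_prev = None
        for n in range(base + 1, len(E) + 1):
            s = slice(base, n)
            q = E[base] * h[base]**3 * np.sum(gamma[s] * h[s]) / np.sum(E[s] * h[s]**3)
            if q_prev is not None and q < q_prev:
                hard.append(n - 1)     # the newly added stratum is hard
                base = n - 1
                break
            q_prev = q
        else:
            break
    return hard

print(hard_strata(E=[20, 8, 35, 10], h=[6, 4, 10, 5], gamma=[25, 24, 26, 24]))
```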
Abstract: Based on the hindcast results of summer rainfall anomalies over China for the period 1981-2000 from the Dynamical Climate Prediction System (IAP-DCP) developed by the Institute of Atmospheric Physics, a correction method that accounts for the dependence of the model's systematic biases on SST anomalies is proposed. It is shown that this correction method can improve the hindcast skill of the IAP-DCP for summer rainfall anomalies over China, especially in western and southeast China, which implies its potential application to real-time seasonal prediction.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11502114) and the Fundamental Research Funds for the Central Universities (Grant No. 30918011323).
Abstract: One of the greatest challenges in gun design is balancing muzzle velocity against recoil, especially for guns on aircraft and deployable vehicles. To resolve the conflict between gun power and recoil force, the rarefaction wave gun (RAVEN) concept was proposed to significantly reduce weapon recoil and in-barrel heat while only minimally reducing muzzle velocity. The main principle of the RAVEN is that, by delaying the venting time of an expansion nozzle at the breech, the rarefaction wave does not reach the projectile base until the projectile reaches the muzzle. Building on the RAVEN principle, this paper provides an engineering method for predicting the performance of a low-recoil gun with a front nozzle. First, a two-dimensional two-phase flow model of interior ballistics during the RAVEN firing cycle was established, and numerical simulation results were compared with published data to validate their reliability and accuracy. Next, the effects of the vent opening times and locations were investigated to determine how they influence the performance of the RAVEN with a front nozzle. Based on these results, simple nonlinear fitting formulas were then provided to describe how the muzzle velocity and the recoil force change with the vent opening time and location. Finally, an improved vent opening time corresponding to the vent location was proposed. The findings should make an important contribution to engineering applications of the RAVEN.
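The final fitting step can be sketched with scipy's curve_fit; the functional form and all data below are illustrative assumptions, not the formulas or simulation results reported in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical response surface: muzzle velocity as a function of the vent
# opening time t (ms) and vent location x (fraction of barrel length).
def v_muzzle(X, v0, a, b, c):
    t, x = X
    return v0 - a * np.exp(-b * t) - c * x   # later venting -> higher velocity

rng = np.random.default_rng(7)
t = rng.uniform(2, 12, 80)                   # toy "simulation" design points
x = rng.uniform(0.0, 0.3, 80)
v = v_muzzle((t, x), 950, 120, 0.35, 60) + rng.normal(scale=2, size=80)

popt, _ = curve_fit(v_muzzle, (t, x), v, p0=(900, 100, 0.3, 50))
print(popt.round(2))                         # recovered fit coefficients
```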
Funding: Support from the "973 Project" (Contract No. 2010CB226706).
Abstract: Eight casing failure modes and 32 risk factors in oil and gas wells are identified in this paper. Based on a quantitative analysis of the influence degree and occurrence probability of the risk factors, Borda counts for the failure modes are obtained with the Borda method, and the risk indexes of the failure modes are derived from the Borda matrix. A casing life prediction model is then established based on the support vector machine (SVM), in which the eight risk indexes form the input vector and casing life is the output. The model parameters are determined with a training set from 19 wells with casing failure, and casing life prediction software is developed with the SVM model as the predictor. The residual life of 60 wells with casing failure was predicted with the software and compared with the actual casing life. The comparison shows that the casing life prediction software based on the SVM model has high accuracy.
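A minimal sketch of such a predictor, an SVM regressor on eight risk indexes trained on the 19 failed wells; all values below are toy stand-ins.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(8)
# Toy stand-ins: eight risk indexes per well (from the Borda analysis)
# and observed casing life (years) for the 19 wells with casing failure
X = rng.uniform(0, 1, size=(19, 8))
life = 20 - 12 * X[:, 0] - 5 * X[:, 3] + rng.normal(scale=0.5, size=19)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X, life)
print(model.predict(X[:3]).round(2), life[:3].round(2))
```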
Funding: Supported by the Taishan Scholars (No. ts201712003).
Abstract: Nitrogen dioxide (NO₂) poses a critical risk to environmental quality and public health, so a reliable machine learning (ML) forecasting framework is useful for providing information that supports government decision-making. Based on data from 1609 air quality monitors across China from 2014 to 2020, this study designed an ensemble ML model for time-sensitive prediction over a wide range by integrating multiple types of spatial-temporal variables and three sub-models. The ensemble model adds a residual connection to the gated recurrent unit (GRU) network and combines the strengths of a Transformer, extreme gradient boosting (XGBoost) and the GRU with residual connection, achieving a 4.1% ± 1.0% lower root mean square error than XGBoost on the test set. The ensemble model shows strong prediction performance, with coefficients of determination of 0.91, 0.86 and 0.77 for the 1-hr, 3-hr and 24-hr averages of the test results, respectively. In particular, the model achieves excellent performance with low spatial uncertainty in Central, East and North China, the major site-dense zones. Interpretability analysis based on Shapley values at different temporal resolutions shows that atmospheric chemical processes contribute more to hourly predictions, while the impact of meteorological conditions becomes more prominent at the daily scale. Compared with existing models at various spatiotemporal scales, the present model can be implemented at any air quality monitoring station across China to deliver rapid and dependable NO₂ forecasts, which will help in developing effective control policies.
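The GRU-with-residual-connection sub-model can be sketched in PyTorch as below; the layer sizes, the projection layer used to match dimensions for the skip, and the 24-hr input window are assumptions.

```python
import torch
import torch.nn as nn

class ResidualGRU(nn.Module):
    """GRU block with a residual (skip) connection, one of the three
    sub-models combined in the paper's ensemble (alongside Transformer
    and XGBoost). Sizes here are illustrative."""

    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.proj = nn.Linear(n_features, hidden)   # match dims for the skip
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)            # NO2 concentration

    def forward(self, x):                           # x: (batch, time, features)
        h = self.proj(x)
        out, _ = self.gru(h)
        out = out + h                               # residual connection
        return self.head(out[:, -1, :])             # predict from last step

x = torch.randn(32, 24, 10)                         # 24-hr window, 10 variables
print(ResidualGRU(10)(x).shape)                     # -> torch.Size([32, 1])
```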
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 40575036 and 40325015). Acknowledgement: The authors thank Drs. Zhang Pei-Qun and Bao Ming for their valuable comments on the present paper.
Abstract: In this paper, an analogue correction method of errors (ACE) based on a complicated atmospheric model is further developed and applied to numerical weather prediction (NWP). The analysis shows that the ACE can effectively reduce model errors by combining the statistical analogue method with the dynamical model, so that the information contained in abundant historical data is utilized in the current complicated NWP model. Furthermore, in the ACE, the differences in similarity between the historical analogues and the current initial state are used as weights for estimating model errors. Daily, dekad (10-day) and monthly prediction experiments on a complicated T63 atmospheric model show that the ACE scheme, which corrects model errors using the estimated errors of four historical analogue predictions, performs better both than the scheme that introduces only the error correction of each single analogue prediction and than the raw T63 model.
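A minimal numpy sketch of the analogue-weighted error estimate: the forecast is corrected by the similarity-weighted average of the errors of four historical analogue predictions. The inverse-distance similarity used here is an assumption; the paper's exact similarity measure may differ.

```python
import numpy as np

def ace_correction(forecast, init_state, analog_states, analog_errors):
    """Correct a model forecast with similarity-weighted analogue errors.

    analog_states : initial states of k historical analogues
    analog_errors : the model's known prediction errors for those analogues
    """
    d = np.linalg.norm(analog_states - init_state, axis=1)
    w = 1.0 / (d + 1e-9)              # closer analogue -> larger weight
    w /= w.sum()
    est_error = (w[:, None] * analog_errors).sum(axis=0)
    return forecast - est_error       # subtract the estimated model error

rng = np.random.default_rng(9)
state = rng.normal(size=50)                              # current initial state
analogs = state + rng.normal(scale=0.3, size=(4, 50))    # 4 historical analogues
errors = rng.normal(scale=0.5, size=(4, 20))             # their forecast errors
print(ace_correction(rng.normal(size=20), state, analogs, errors).shape)
```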