Funding: Supported by the National Hi-tech Research and Development Program of China (No. 2006BAK03B02-04) and the New Century Excellent Talent Support Plan of the Ministry of Education of China (No. NCET-06-0477).
Abstract: Based on the principle of Bayesian discriminant analysis, we established a model for predicting coal and gas outbursts. Five major indices that affect outbursts were selected as the discriminating factors of the model: the initial speed of methane diffusion, the coal consistency coefficient, gas pressure, the destruction type of coal, and mining depth. The model divides coal and gas outbursts into four grades, treated as four normal populations. We obtained the corresponding discriminant functions by training on a set of engineering examples used as learning samples, and evaluated the discriminant criteria by the back substitution method (reclassifying the learning samples) to verify the model. Finally, we applied the model to the prediction of coal and gas outbursts in the Yunnan Enhong Mine, where the predictions coincided completely with the actual situation. These results show that the Bayesian discriminant analysis model has excellent recognition performance, high prediction accuracy and a low error rate, and is an effective method for predicting coal and gas outbursts.
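The discriminant rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes independent features within each grade (a diagonal covariance, chosen only to keep the example dependency-free), it uses two of the five indices and two of the four grades, and the sample values are invented for illustration. The names `fit`, `score`, `predict` and `back_substitution_error` are hypothetical.

```python
import math
from collections import defaultdict


def fit(samples, labels):
    """Estimate the prior, per-feature means and variances of each grade."""
    groups = defaultdict(list)
    for x, grade in zip(samples, labels):
        groups[grade].append(x)
    model = {}
    for grade, rows in groups.items():
        n = len(rows)  # each grade needs at least two samples
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Unbiased variance, floored to avoid division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / (n - 1), 1e-9)
            for col, m in zip(columns, means)
        ]
        model[grade] = (n / len(samples), means, variances)
    return model


def score(model, grade, x):
    """Log posterior of `grade` for x, up to a constant shared by all grades."""
    prior, means, variances = model[grade]
    total = math.log(prior)
    for v, m, var in zip(x, means, variances):
        total -= 0.5 * math.log(var) + 0.5 * (v - m) ** 2 / var
    return total


def predict(model, x):
    """Bayes rule: assign x to the grade with the largest discriminant score."""
    return max(model, key=lambda grade: score(model, grade, x))


def back_substitution_error(model, samples, labels):
    """Fraction of learning samples misclassified when fed back to the model."""
    wrong = sum(predict(model, x) != g for x, g in zip(samples, labels))
    return wrong / len(samples)


# Invented learning samples: (gas pressure in MPa, mining depth in m).
samples = [(0.50, 300.0), (0.60, 320.0), (0.55, 310.0),
           (2.50, 600.0), (2.80, 650.0), (2.60, 620.0)]
labels = [1, 1, 1, 4, 4, 4]

model = fit(samples, labels)
error_rate = back_substitution_error(model, samples, labels)
new_grade = predict(model, (0.58, 305.0))
print(error_rate, new_grade)
```

A full treatment would use all five indices, four grades and a full covariance matrix per grade, which requires a linear algebra library.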
Abstract: The capability to accurately predict the mineralogical brittleness index (BI) from basic suites of well logs is desirable, as BI is a useful indicator of the fracability of tight formations. Measuring mineralogical components in rocks is expensive and time consuming. However, the basic well log curves are not well correlated with BI, so correlation-based machine-learning methods cannot derive highly accurate BI predictions from such data. A correlation-free, optimized data-matching algorithm is configured to predict BI on a supervised basis from well log and core data available from two published wells in the Lower Barnett Shale Formation (Texas). This transparent open box (TOB) algorithm matches data records by calculating the sum of squared errors between their variables and selecting as best matches the records with the minimum squared errors. It then applies optimizers to adjust the weights applied to individual variable errors so as to minimize the root mean square error (RMSE) between calculated and predicted BI. The prediction accuracy achieved by TOB using just five well logs (Gr, ρb, Ns, Rs, Dt) depends on the density of data records sampled. At a sampling density of about one sample per 0.5 ft, BI is predicted with RMSE ~ 0.056 and R² ~ 0.790. At a sampling density of about one sample per 0.1 ft, BI is predicted with RMSE ~ 0.008 and R² ~ 0.995. Adding a stratigraphic height index as a sixth input variable improves BI prediction accuracy to RMSE ~ 0.003 and R² ~ 0.999 for the two wells, with only 1 record in 10,000 yielding a BI prediction error greater than ±0.1. The model has the potential to be applied on an unsupervised basis to predict BI from basic well log data in surrounding wells that lack mineralogical measurements but have similar lithofacies and burial histories. The method could also be extended to predict elastic rock properties and seismic attributes from well and seismic data, improving the spatial precision of brittleness index and fracability mapping.
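The two stages named in this abstract, matching records by weighted squared error and then tuning the weights to minimize RMSE, can be sketched in Python as follows. This is an illustration under stated assumptions, not the published TOB algorithm. The abstract does not say how many matches are used, how they are combined, or which optimizers are applied, so the sketch averages the top `k` matches and tunes weights with a small grid search scored by leave-one-out RMSE. All function names and the toy data (two normalized variables, only the first informative) are hypothetical.

```python
import itertools
import math


def weighted_sse(a, b, weights):
    """Weighted sum of squared differences between two records."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def predict_bi(query, records, targets, weights, k=2):
    """Average the targets of the k records closest to `query` (assumed rule)."""
    ranked = sorted(range(len(records)),
                    key=lambda i: weighted_sse(query, records[i], weights))
    best = ranked[:k]
    return sum(targets[i] for i in best) / len(best)


def loo_rmse(records, targets, weights, k=2):
    """Leave-one-out RMSE: predict each record from all the others."""
    squared = 0.0
    for i in range(len(records)):
        rest_r = records[:i] + records[i + 1:]
        rest_t = targets[:i] + targets[i + 1:]
        squared += (predict_bi(records[i], rest_r, rest_t, weights, k)
                    - targets[i]) ** 2
    return math.sqrt(squared / len(records))


def tune_weights(records, targets, grid=(0.0, 0.5, 1.0), k=2):
    """Grid search standing in for the optimizers mentioned in the abstract."""
    n_vars = len(records[0])
    candidates = (w for w in itertools.product(grid, repeat=n_vars) if any(w))
    return min(candidates, key=lambda w: loo_rmse(records, targets, w, k))


# Invented, normalized records: variable 0 tracks the target, variable 1 is noise.
records = [(0.0, 0.90), (0.1, 0.10), (0.2, 0.80),
           (0.8, 0.20), (0.9, 0.85), (1.0, 0.15)]
targets = [0.2, 0.2, 0.2, 0.6, 0.6, 0.6]

best_weights = tune_weights(records, targets)
print(best_weights, loo_rmse(records, targets, best_weights))
```

The grid search is exhaustive and only practical for a handful of variables. It is used here because it is deterministic and easy to follow.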
Abstract: Regression random forest is becoming a widely used machine learning technique for spatial prediction and shows competitive prediction performance in various geoscience fields. Like other popular machine learning methods for spatial prediction, regression random forest does not exactly honor the response variable’s measured values at sampled locations. Competitor methods such as regression-kriging, however, fit the response variable’s observed values at sampled locations perfectly by construction. Exactly matching the measured values at sampled locations is desirable in many geoscience applications. This paper presents a new approach that ensures regression random forest perfectly matches the response variable’s observed values at sampled locations. The main idea is to use principal component analysis to create an orthogonal representation of the ensemble of regression tree predictors produced by the traditional regression random forest. The exact conditioning problem is then reformulated as a Bayes-linear-Gauss problem on the principal component scores. This problem has an analytical solution, which makes it easy to perform Monte Carlo sampling of new principal component scores and then reconstruct regression tree predictors that perfectly match the observed values at sampled locations. The average of the reconstructed regression tree predictors also matches the measured values at sampled locations exactly, by construction. The effectiveness of the proposed method is illustrated using a synthetic dataset, where the ground truth is available everywhere within the study region, and a real dataset comprising geochemical concentration data from southwest England. The method is compared with regression-kriging and the traditional regression random forest. The proposed method perfectly fits the response variable’s measured values at sampled locations while achieving good out-of-sample predictive performance compared with regression-kriging and the traditional regression random forest.
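The analytical solution mentioned in this abstract can be illustrated with the standard Gaussian conditioning formulas. The notation below is assumed for this sketch and may differ from the paper's own formulation: the tree predictors are written as a mean plus principal component loadings times Gaussian scores, and the matrix inverted below is assumed to be nonsingular.

```latex
% Assumed notation: m is the mean of the tree predictors, \Phi the principal
% component loadings, \Lambda the diagonal matrix of component variances,
% y the observed values, and the subscript s restricts rows to sampled locations.
\begin{align}
  f &= m + \Phi z, \qquad z \sim \mathcal{N}(0, \Lambda), \\
  \mu_{z \mid y} &= \Lambda \Phi_s^{\top}
      \left( \Phi_s \Lambda \Phi_s^{\top} \right)^{-1} (y - m_s), \\
  \Sigma_{z \mid y} &= \Lambda - \Lambda \Phi_s^{\top}
      \left( \Phi_s \Lambda \Phi_s^{\top} \right)^{-1} \Phi_s \Lambda .
\end{align}
```

Under these assumptions, $\Phi_s \mu_{z \mid y} = y - m_s$ and $\Phi_s \Sigma_{z \mid y} \Phi_s^{\top} = 0$, so every sampled score vector reconstructs a predictor that equals $y$ at the sampled locations, and the same holds for the average of the reconstructed predictors.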
Funding: The Natural Science Foundation of China (grants 61672128 and 61702076) and the Fundamental Research Funds for the Central Universities (DUT18JC39).
Abstract: With urbanization, urban transportation has become a key factor restricting the development of a city, and in a big city it is important to improve its efficiency. The key to short-term traffic flow prediction is to learn the complex spatial correlation, temporal correlation and randomness of traffic flow. In this paper, a convolutional neural network (CNN) is proposed to handle the spatial correlation among different regions; because the urban area is large, the network is relatively deep. Three gated recurrent units (GRUs) were used to capture recent time dependence, daily period dependence and weekly period dependence. Because data from each historical period influence the forecast period to a different degree, an attention mechanism was added to each of the three GRUs. A two-layer fully connected network was applied to handle the randomness of short-term flow, using additional information such as weather data. The prediction model was established by combining these three modules. Furthermore, to verify the influence of spatial correlation on the prediction model, an urban functional area identification model was introduced to identify different functional regions. Finally, the proposed model was validated on historical New York City taxi order data and weather data collected by web crawlers. The experimental results show that the prediction precision of our model is clearly superior to that of mainstream existing prediction methods.
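The role of the attention mechanism described in this abstract, weighting historical periods by how much they influence the forecast, can be sketched in plain Python. The abstract does not specify the scoring function, so this sketch assumes dot-product scoring against a query vector followed by a softmax. The hidden states, the query and the function names are hypothetical, and no GRU is implemented here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden_states, query):
    """Return attention weights and the weighted sum of the hidden states."""
    # Assumed scoring rule: dot product between each hidden state and the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(len(query))]
    return weights, context


# Invented hidden states for three historical time steps (two dimensions each).
hidden_states = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.7]]
query = [1.0, 1.0]

weights, context = attend(hidden_states, query)
print(weights, context)
```

In a trained model the hidden states would come from the GRU and the query would be learned. Here the weights only show that steps scoring higher against the query contribute more to the context vector.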
Abstract: In recent decades, urban heat island effects have become more pronounced and more widely examined. Despite great technological advances, our societies still experience great spatial disparity in access to urban forest. Urban heat island effects are measurable phenomena experienced by the world’s most urbanized areas, including higher summer temperatures and lower evapotranspiration caused by impervious surfaces replacing vegetation and trees. Tree canopy cover is a natural mitigation tool: it absorbs sunlight for photosynthesis, protects people from incoming radiation, and releases cooling moisture into the air. Unfortunately, urban areas typically have low levels of vegetation. Vulnerable urban communities are lower-income areas of inner cities with less access to heat protection such as air conditioners. This study uses mean evapotranspiration levels to assess the variability of urban heat island effects across the state of Tennessee. Results show that increased developed land surface cover in Tennessee creates measurable changes in atmospheric evapotranspiration: mean evapotranspiration levels in areas with less tree vegetation are significantly lower than in the surrounding forested areas. Central areas of cities in Tennessee had lower mean evapotranspiration recordings than surrounding areas with less development. This work demonstrates the need for increased tree canopy coverage.