In oil and gas exploration, elucidating the complex interdependencies among geological variables is paramount. Our study applies advanced regression analysis methods, aiming not only to predict geophysical logging curve values but also to mitigate the hydrocarbon depletion observed in geochemical logging. Through a rigorous assessment, we explore the efficacy of eight regression models, divided into linear and nonlinear groups, to accommodate the multifaceted nature of geological datasets. Our linear model suite encompasses the Standard Equation, Ridge Regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net, each presenting distinct advantages. The Standard Equation serves as a foundational benchmark, whereas Ridge Regression adds a penalty term to counteract overfitting, making the model more robust in the presence of multicollinearity. LASSO performs variable selection, which streamlines models and enhances their interpretability, while Elastic Net combines the merits of Ridge Regression and LASSO, balancing model complexity against comprehensibility. On the nonlinear front, the study considers Gradient Descent, Kernel Ridge Regression, Support Vector Regression, and Piecewise Function-Fitting. Gradient Descent offers computational efficiency in optimizing solutions, Kernel Ridge Regression uses the kernel trick to capture nonlinear patterns, and Support Vector Regression is proficient in forecasting extreme values, which is pivotal for exploration risk assessment. The Piecewise Function-Fitting approach, tailored for geological data, allows adaptable modeling of variable interrelations and accommodates abrupt shifts in data trends. Our analysis identifies Ridge Regression, particularly when augmented by Piecewise Function-Fitting, as superior in recovering hydrocarbon losses, underscoring its utility in refining resource quantification. Meanwhile, Kernel Ridge Regression emerges as a noteworthy strategy for improving porosity-logging curve prediction for well A, evidencing its suitability for intricate geological structures. This research attests to the advantages and broad relevance of these regression techniques over conventional methods and points to new opportunities for their deployment in the oil and gas sector. The insights gained from these modeling strategies may help improve geological and engineering practices in hydrocarbon prediction, evaluation, and recovery.
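As an illustration of why a ridge penalty helps under multicollinearity, the following minimal sketch fits ordinary least squares and ridge regression to two nearly collinear predictors. It is not the study's implementation: the data, the helper names `solve` and `ridge_fit`, and the penalty value are all assumptions chosen for the example, and no intercept is fitted.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit(X, y, lam):
    """Return w minimizing ||Xw - y||^2 + lam * ||w||^2 (no intercept)."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data: two nearly collinear predictors, response y = x1 + x2.
X = [[1.0, 1.01], [2.0, 1.98], [3.0, 3.02], [4.0, 3.99], [5.0, 5.01]]
y = [x1 + x2 for x1, x2 in X]

w_ols = ridge_fit(X, y, 0.0)    # lam = 0 reduces to ordinary least squares
w_ridge = ridge_fit(X, y, 1.0)  # lam > 0 shrinks the coefficient vector
print("OLS:", w_ols, "ridge:", w_ridge)
```

With `lam = 0` the fit reproduces the generating coefficients; a positive `lam` trades a small bias for a coefficient vector with a smaller norm, which is the stabilizing effect the abstract attributes to Ridge Regression.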
For the composition analysis and identification of ancient glass products, L1 regularization, K-Means cluster analysis, the elbow method, and other techniques were combined to build logistic regression, cluster analysis, and hyper-parameter test models. SPSS, Python, and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
A multivariable regression analysis of the in-situ stress field, which considers the nonlinear deformation behavior of faults in practical projects, is presented based on a newly developed three-dimensional displacement discontinuity method (DDM) program. The Barton-Bandis model and the Kulhawy model are adopted as the normal and tangential deformation models of faults, respectively, with the Mohr-Coulomb failure criterion satisfied. In practical projects, the mechanical parameters of rock and faults are restricted to a bounded range determined by in-situ tests, and the optimal mechanical parameters are obtained from this range by a loop. Compared with the traditional finite element method (FEM), the DDM regression results are more accurate.
Breakwaters have been built for centuries for coastal protection and port development, but their layouts and design criteria have changed over time. The quarter-circle breakwater (QBW) is a new type that combines the advantages of caisson-type and perforated-type breakwaters. The present study examines the effect of the percentage of perforations on the stability of a seaside-perforated QBW using various physical models. The results were analyzed graphically using dimensionless parameters, and it was concluded that the dimensionless stability parameter decreases with increasing wave steepness and with a change in the ratio of water depth to the height of the breakwater structure. Multiple nonlinear regression analysis was performed, and the equation of the best-fit curve with a higher regression coefficient was obtained using the Excel statistical software XLSTAT.
Objective: To solve the problem of parameter estimation in the regression analysis of non-random samples. Methods: Residuals were calculated from the regression function fitted to the original data. The residuals were modified and mean-corrected. The mean-corrected residuals were added to the original response and bootstrapped to obtain 1000 samples. Regression functions were fitted to the 1000 resampled data sets, and the 2.5th and 97.5th percentiles of each coefficient were calculated. Results: The interval estimates derived from the bootstrap method had more statistical significance than those from the usual method. Conclusion: Bootstrapping a regression with residuals is a valid method for estimating parameters in regression analysis.
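The procedure described in the Methods can be sketched as a standard residual bootstrap for a simple linear regression. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the function names are hypothetical, the residual "modification" step mentioned in the abstract is omitted, and the resampled residuals are added to the fitted values (the usual textbook variant).

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def residual_bootstrap_slope_ci(x, y, n_boot=1000, seed=0):
    """Percentile interval (2.5th, 97.5th) for the slope via residual resampling."""
    rng = random.Random(seed)
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    mean_r = sum(resid) / len(resid)
    centred = [r - mean_r for r in resid]  # mean-corrected residuals
    slopes = []
    for _ in range(n_boot):
        # Build a pseudo-response by attaching resampled residuals to the fit.
        y_star = [fi + rng.choice(centred) for fi in fitted]
        slopes.append(fit_line(x, y_star)[1])
    slopes.sort()
    lo = slopes[int(0.025 * n_boot)]
    hi = slopes[int(0.975 * n_boot) - 1]
    return b, (lo, hi)


# Synthetic example data (assumed): y = 2 + 0.5 x + Gaussian noise.
data_rng = random.Random(42)
x = [float(i) for i in range(1, 21)]
y = [2.0 + 0.5 * xi + data_rng.gauss(0.0, 1.0) for xi in x]

slope, (low, high) = residual_bootstrap_slope_ci(x, y)
print(f"slope = {slope:.3f}, 95% percentile interval = ({low:.3f}, {high:.3f})")
```

The same loop extends to the intercept or to several coefficients by collecting each refitted coefficient and taking its own 2.5th and 97.5th percentiles.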
Because lane-level traffic flow information is difficult to obtain at intersections without detectors in most metropolises of the world, cluster analysis and stepwise regression are integrated, based on the relationships between the lanes of signal-controlled intersections, to predict the traffic volume of lanes at non-detector isolated signal-controlled intersections. First, cluster analysis is used to group the lanes of non-detector isolated signal-controlled intersections with the lanes of all signal-controlled intersections equipped with detectors. Then, based on the clustering results, traffic volume samples are selected randomly and stepwise regression is used to predict the traffic volume of lanes at non-detector isolated signal-controlled intersections. The method is tested with lane traffic volume data from the road network of Nanjing city. The method resolves the problem of predicting lane traffic volume at non-detector isolated signal-controlled intersections and can be widely used for urban traffic flow guidance and urban traffic control in cities where too few intersections are equipped with detectors.
[Objective] The research aimed to study the factors that significantly influence population variations of the oriental fruit fly. [Method] Using stepwise regression analysis, the pattern of population variation of the oriental fruit fly in Jianshui County, Yunnan Province, and the meteorological factors driving its occurrence were analyzed, and a regression model was built. Finally, the regression model was tested with data from Jianshui County, Yunnan Province, for 2004-2006. [Result] The main meteorological factors that influenced the occurrence of the oriental fruit fly were relative humidity, the lowest monthly temperature, and rainfall. [Conclusion] This study provides a reference for research on predicting the timing, quantity, and occurrence peak of the oriental fruit fly.
Prediction of blast-induced ground vibration using scaled distance regression analysis has been one of the most popular methods employed by engineers for many decades. It uses the maximum charge per delay and the monitoring distance as the major factors for predicting the peak particle velocity (PPV). It is established that the PPV is caused by the maximum charge per delay and varies with the monitoring distance and site geology. During production blasting, the waves induced by the blasting of different holes interfere destructively with each other, which may result in a higher PPV than the value predicted by scaled distance regression analysis. This interference/superimposition of waves is not considered in scaled distance regression analysis. In this paper, an attempt has been made to compare the predicted values of blast-induced ground vibration from multi-hole trial blasting with those from single-hole blasting in an opencast coal mine under the same geological conditions. Further, the modified prediction equation for the multi-hole trial blasting was obtained using single-hole regression analysis. The error between the predicted and actual values of multi-hole blast-induced ground vibration was found to be reduced by 8.5%.
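For readers unfamiliar with scaled distance regression, the sketch below fits the commonly used square-root scaling law PPV = K * SD^(-b), with SD = D / sqrt(Q), by least squares in log-log space. The abstract does not state the exact equation the authors used, so the form of the law, the function names, and the synthetic data (generated with K = 500 and b = 1.5) are assumptions made for illustration.

```python
import math


def fit_scaled_distance(distances_m, charges_kg, ppv_mm_s):
    """Fit PPV = K * SD**(-b), SD = D / sqrt(Q), by linear regression on logs."""
    xs = [math.log(d / math.sqrt(q)) for d, q in zip(distances_m, charges_kg)]
    ys = [math.log(v) for v in ppv_mm_s]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # (K, b)


def predict_ppv(k, b, distance_m, charge_kg):
    """Predicted peak particle velocity for a given distance and charge per delay."""
    return k * (distance_m / math.sqrt(charge_kg)) ** (-b)


# Synthetic monitoring records (assumed), generated from K = 500, b = 1.5.
distances = [50.0, 100.0, 150.0, 200.0, 300.0]   # monitoring distance D, m
charges = [25.0, 50.0, 50.0, 100.0, 100.0]       # maximum charge per delay Q, kg
ppv = [predict_ppv(500.0, 1.5, d, q) for d, q in zip(distances, charges)]

k_hat, b_hat = fit_scaled_distance(distances, charges, ppv)
print(f"K = {k_hat:.2f}, b = {b_hat:.3f}")
```

Because the law is linear in log(PPV) versus log(SD), the site constants K and b come directly from the intercept and slope of a straight-line fit.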
This paper presents an analysis to forecast the loads of an isolated area where the load history is not available or may not represent the realistic demand for electricity. The analysis uses linear regression and is based on identifying the factors on which electrical load growth depends. To determine these factors, areas are selected whose load growth histories are known and whose load-growth-deciding factors are similar to those of the isolated area. The proposed analysis is applied to an isolated area of Bangladesh called Swandip, where no past history of electrical load demand is available and there is no possibility of connecting the area to the mainland grid system.
Near infrared reflectance spectroscopy (NIRS), a non-destructive measurement technique, was combined with partial least squares regression discriminant analysis (PLS-DA) to discriminate transgenic (TCTP and mi166) from wild-type (Zhonghua 11) rice. Furthermore, rice lines transformed with a protein gene (OsTCTP) and a regulation gene (Osmi166) were also discriminated by the NIRS method. The performance of PLS-DA in the spectral ranges of 4000-8000 cm⁻¹ and 4000-10000 cm⁻¹ was compared to obtain the optimal spectral range. As a result, the transgenic and wild-type rice were distinguished from each other in the range of 4000-10000 cm⁻¹, and the correct classification rate was 100.0% in the validation test. The transgenic rice lines TCTP and mi166 were also distinguished from each other in the range of 4000-10000 cm⁻¹, again with a correct classification rate of 100.0%. In conclusion, NIRS combined with PLS-DA can be used for the discrimination of transgenic rice.
Diagonal connection structures are complex, and the discriminant for the airflow directions of their airways is difficult to derive. To overcome these disadvantages, we applied a multiple regression method to analyze how the variation patterns of mine airflows affect the stability of a mine ventilation system. The amount of air (Qj) is determined for the major airway, and an optimum regression equation was derived for Qi as a function of the independent variable (Ri), i.e., the ventilation resistance between different airways. Corresponding countermeasures are then proposed according to the changes in airflows. The calculated results agree very well with the practical situation, indicating that multiple regression analysis is simple, quick, and practical and is therefore an effective method for analyzing the stability of mine ventilation systems.
This study aims to extend the multivariate adaptive regression splines (MARS)-Monte Carlo simulation (MCS) method to the reliability analysis of slopes in spatially variable soils. This approach is used to explore the influence of the multiscale spatial variability of soil properties on the probability of failure (P_f) of the slopes. In the proposed approach, the relationship between the factor of safety and the spatially variable soil strength parameters is approximated by MARS, with the aid of the Karhunen-Loeve expansion. MCS is subsequently performed on the established MARS model to evaluate P_f. Finally, a nominally homogeneous cohesive-frictional slope and a heterogeneous cohesive slope, both characterized by different spatial variabilities, are used to illustrate the proposed approach. Results showed that the proposed approach can estimate the P_f of slopes in spatially variable soils efficiently and with sufficient accuracy. Moreover, the approach is relatively robust to different statistics of soil properties, making it an effective and practical tool for slope reliability problems that involve time-consuming deterministic stability models and low levels of P_f. Furthermore, disregarding the multiscale spatial variability of soil properties can overestimate or underestimate P_f. Although the difference is generally small, the multiscale spatial variability of soil properties must still be considered in the reliability analysis of heterogeneous slopes, especially where cost-effective and accurate designs are required.
In the spectral analysis of laser-induced breakdown spectroscopy, abundant characteristic spectral lines and severe interference coexist in the original spectral data. Here, a feature selection method called recursive feature elimination based on ridge regression (Ridge-RFE) is proposed for the original spectral data to make full use of the valid information in the spectra. In the Ridge-RFE method, the absolute value of the ridge regression coefficient is used as the criterion for screening spectral features: the feature with the smallest absolute weight in the input feature subset is removed by recursive feature elimination (RFE), and the selected features are used as inputs to the partial least squares regression (PLS) model. The Ridge-RFE-based PLS model was used to measure Fe, Si, Mg, Cu, Zn, and Mn in 51 aluminum alloy samples, and the results showed that the root mean square error of prediction decreased greatly compared with the PLS model that used the full spectrum as input. The overall results demonstrate that the Ridge-RFE method removes redundant features more efficiently, gives the PLS model better quantitative analysis results, and improves model generalization.
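The elimination loop described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the helper names, the penalty `lam`, and the synthetic four-feature data set (in which only features 0 and 2 carry signal) are assumptions, and the columns are taken to be on comparable scales so that coefficient magnitudes are meaningful.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_coefs(X, y, lam):
    """Ridge coefficients from (X^T X + lam I) w = X^T y (no intercept)."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def ridge_rfe(X, y, n_keep, lam=0.1):
    """Recursively drop the feature whose ridge coefficient has the smallest magnitude."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        sub = [[row[j] for j in active] for row in X]
        w = ridge_coefs(sub, y, lam)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]  # refit on the reduced subset in the next pass
    return active


# Synthetic stand-in for spectral data (assumed): only features 0 and 2 matter.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(30)]
y = [3.0 * row[0] - 2.0 * row[2] for row in X]

selected = ridge_rfe(X, y, n_keep=2)
print("selected feature indices:", selected)
```

In the workflow the abstract describes, the retained columns would then be passed to a PLS model; that step is outside this sketch.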
A combined model based on principal component analysis (PCA) and a generalized regression neural network (GRNN) was adopted to forecast the electricity price in the day-ahead electricity market. PCA was applied to extract the main influences on the day-ahead price, avoiding the strong correlation between input factors that might influence the electricity price, such as the load of the forecasting hour, other historical loads and prices, weather, and temperature. GRNN was then employed to forecast the electricity price from the main information extracted by PCA. To demonstrate the efficiency of the combined model, a case from the PJM (Pennsylvania-New Jersey-Maryland) day-ahead electricity market was evaluated. Compared with the back-propagation (BP) neural network and the standard GRNN, the combined method reduces the mean absolute percentage error by about 3%.
Several parameters, including assimilable organic carbon (AOC), chloramine residual, water temperature, and water residence time, were measured in drinking water from distribution systems in a northern city of China. The measurements show that the biological stability of drinking water can be controlled when the chloramine residual is above 0.3 mg/L or the AOC content is below 50 μg/L. Both chloramine residual and AOC correlate with the logarithm of heterotrophic plate counts (HPC), with correlation coefficients of -0.64 and 0.33, respectively. From a regression analysis of the survey data, a statistical equation is presented, and it is concluded that disinfectant residual exerts the strongest influence on bacterial growth and that AOC is a suitable index for assessing the biological stability of drinking water.
With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie complex human diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real-data application to compare the type I error and power of PC-LR, PLS-LR, and LR as applied to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR corrected by the Bonferroni method was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped SNPs and had a small effect size in the simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
The typical model, which involves the measures support, confidence, and interest, is often adopted for mining association rules. In this model, the related parameters are usually chosen by experience; consequently, the number of useful rules is hard to estimate. If the number is too large, the meaningful rules cannot be extracted effectively. This paper analyzes the meanings of the parameters and uses regression to design a variety of equations relating the number of rules to the parameters. Finally, we experimentally obtain a preferable regression equation. The paper uses multiple correlation coefficients to test the goodness of fit of the equations and uses significance tests to verify whether the parameter coefficients differ significantly from zero. The regression equation with the larger multiple correlation coefficient is chosen as the optimally fitted equation. With the selected equation, we can predict the number of rules under given parameters, further optimize the choice of the three parameters, and determine their ranges of values.
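To make the three measures concrete, the sketch below computes them for a single rule over a toy transaction list. The abstract does not define "interest", so it is taken here to be the common ratio confidence / P(consequent), often called lift; that definition, the function name, and the example transactions are assumptions made for illustration.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, interest) for the rule antecedent -> consequent.

    Interest is computed as confidence / P(consequent), an assumed definition.
    The antecedent and the consequent must each occur in at least one transaction.
    """
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # antecedent present
    n_c = sum(1 for t in transactions if c <= t)         # consequent present
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # both present
    support = n_ac / n
    confidence = n_ac / n_a
    interest = confidence / (n_c / n)
    return support, confidence, interest


# Toy market-basket data (assumed).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

support, confidence, interest = rule_measures(transactions, {"bread"}, {"milk"})
print(support, confidence, interest)
```

A rule is typically kept only if all three measures exceed user-chosen thresholds, which is why the number of surviving rules depends on those thresholds in the way the paper models by regression.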
Accurate cost estimation at the early stage of a construction project is a key factor in the project's success. However, it is difficult to estimate construction costs quickly and accurately at the planning stage, when drawings, documentation, and the like are still incomplete. Various techniques have therefore been applied to estimate construction costs accurately at an early stage, when project information is limited. While these techniques have their pros and cons, little effort has been made to determine the best technique in terms of cost-estimating performance. The objective of this research is to compare the accuracy of three estimating techniques, namely regression analysis (RA), neural network (NN), and support vector machine (SVM) techniques, by performing estimations of construction costs. A comparison of these techniques using historical cost data showed that the NN model gave more accurate estimates than the RA and SVM models. Consequently, the NN model is judged most suitable for estimating the cost of school building projects.
The importance of detecting heteroscedasticity in regression analysis is widely recognized because efficient inference for the regression function requires that heteroscedasticity should be taken into account. In this paper, a simple test for heteroscedasticity is proposed in nonparametric regression based on residual analysis. Furthermore, some simulations with a comparison with Dette and Munk's method are conducted to evaluate the performance of the proposed test. The results demonstrate that the method in this paper performs quite satisfactorily and is much more powerful than Dette and Munk's method in some cases.
Estimating the intensity of coal and gas outbursts is important because their intensity and frequency tend to increase in deep mining. A full understanding of the major factors contributing to coal and gas outbursts is significant for evaluating outburst intensity. In this paper, we discuss the correlation between these major factors and outburst intensity using analysis of variance (ANOVA) and contingency table analysis (CTA). Regression analysis is used to evaluate the impact of these major factors on outburst intensity based on physical experiments. Based on the evaluation, two simple models, one multiple linear and one multiple nonlinear regression model, were constructed to predict outburst intensity. The results show that gas pressure and initial moisture in the coal mass could be the most significant factors, whereas porosity is the weakest. The P values from Fisher's exact test in CTA are: moisture (0.019), geostress (0.290), porosity (0.650), and gas pressure (0.031). The P values from ANOVA are: moisture (0.094), geostress (0.077), porosity (0.420), and gas pressure (0.051). Furthermore, the multiple nonlinear regression model (RMSE: 3.870) is more accurate than the linear regression model (RMSE: 4.091).
Funding (in-situ stress field regression study): financially supported by the Western Transport Technical Project of the Ministry of Transport, China (No. 2009318000046).
Funding (quarter-circle breakwater study): The authors are thankful to the Director, NITK Surathkal, and the Head of the Applied Mechanics Department, NITK Surathkal, for all the support and encouragement in the preparation of this paper.
Funding (lane traffic volume prediction study): The National Natural Science Foundation of China (No. 50378016).
Abstract: Because lane-level traffic flow information is difficult to obtain at intersections without detectors in most metropolises of the world, cluster analysis and stepwise regression are integrated, based on the relationships between the lanes of signal-controlled intersections, to predict the traffic volume of lanes at non-detector isolated signal-controlled intersections. First, cluster analysis is used to cluster the lanes of non-detector isolated signal-controlled intersections and the lanes of all signal-controlled intersections with detectors. Then, based on the results of the cluster analysis, traffic volume samples are selected randomly and stepwise regression is used to predict the traffic volume of lanes at non-detector isolated signal-controlled intersections. The method is tested with lane traffic volume data from the road network of Nanjing. The method resolves the problem of predicting the traffic volume of lanes at non-detector isolated signal-controlled intersections and can be widely used in urban traffic flow guidance and urban traffic control in cities without enough intersections equipped with detectors.
Funding: Supported by the National Key Technology R&D Program in the 11th Five-Year Plan of China (2006BAD10A14)
Abstract: [Objective] The research aimed to identify the factors that significantly influence population variations of the oriental fruit fly. [Method] Using stepwise regression analysis, the pattern of population variation of the oriental fruit fly in Jianshui County of Yunnan Province and the meteorological factors driving its occurrence were analyzed, and a regression model was built. Finally, the regression model was tested on data from Jianshui County of Yunnan Province for 2004-2006. [Result] The main meteorological factors that influenced the occurrence of the oriental fruit fly were relative humidity, the lowest monthly temperature and rainfall. [Conclusion] This study provides a reference for research on predicting the timing, quantity and occurrence peak of the oriental fruit fly.
Abstract: Prediction of blast-induced ground vibration using scaled distance regression analysis is one of the most popular methods and has been employed by engineers for many decades. It uses the maximum charge per delay and the distance of monitoring as the major factors for predicting the peak particle velocity (PPV). It is established that the PPV is caused by the maximum charge per delay and varies with the distance of monitoring and the site geology. During production blasting, the waves induced by blasting of different holes interfere with each other, which may result in a higher PPV than the value predicted by scaled distance regression analysis. This interference/superimposition of waves is not considered in scaled distance regression analysis. In this paper, an attempt has been made to compare the predicted values of blast-induced ground vibration from multi-hole trial blasting with those from single-hole blasting in an opencast coal mine under the same geological conditions. Further, the modified prediction equation for the multi-hole trial blasting was obtained using single-hole regression analysis. The error between predicted and actual values of multi-hole blast-induced ground vibration was found to be reduced by 8.5%.
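A scaled distance fit of the kind described above can be sketched as follows. The abstract does not state its predictor equation, so this sketch assumes the widely used square-root form, PPV = K · (D / √Q)^(−β), where D is the monitoring distance and Q the maximum charge per delay, and estimates K and β by least squares in log-log space. The function names and units are illustrative assumptions.

```python
import math


def fit_scaled_distance(distances_m, charges_kg, ppv_mm_s):
    """Fit PPV = K * SD**(-beta) with SD = D / sqrt(Q); returns (K, beta)."""
    # Assumption: square-root scaled distance, not necessarily the paper's form.
    log_sd = [math.log(d / math.sqrt(q)) for d, q in zip(distances_m, charges_kg)]
    log_ppv = [math.log(v) for v in ppv_mm_s]

    n = len(log_sd)
    mx = sum(log_sd) / n
    my = sum(log_ppv) / n
    sxx = sum((a - mx) ** 2 for a in log_sd)
    sxy = sum((a - mx) * (b - my) for a, b in zip(log_sd, log_ppv))

    slope = sxy / sxx                 # slope of log(PPV) against log(SD)
    k = math.exp(my - slope * mx)     # site constant K
    return k, -slope                  # beta is the negated slope


def predict_ppv(k, beta, distance_m, charge_kg):
    """Predicted PPV for one monitoring distance and charge per delay."""
    return k * (distance_m / math.sqrt(charge_kg)) ** (-beta)
```

Fitting separate (K, β) pairs to single-hole and multi-hole records is one way to compare the two blasting modes under this assumed form.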
Abstract: This paper presents an analysis to forecast the loads of an isolated area where the load history is not available or may not represent the realistic demand for electricity. The analysis uses linear regression and is based on identifying the factors on which electrical load growth depends. To determine these factors, areas are selected whose load growth histories are known and whose load growth deciding factors are similar to those of the isolated area. The proposed analysis is applied to an isolated area of Bangladesh called Swandip, where no past history of electrical load demand is available and there is no possibility of connecting the area to the mainland grid system.
Funding: Supported by the projects under the Innovation Team of the Safety Standards and Testing Technology for Agricultural Products of Zhejiang Province, China (Grant No. 2010R50028) and the National Key Technologies R&D Program of China during the 11th Five-Year Plan Period (Grant No. 2006BAK02A18)
Abstract: Near infrared reflectance spectroscopy (NIRS), a non-destructive measurement technique, was combined with partial least squares regression discriminant analysis (PLS-DA) to discriminate between transgenic (TCTP and mi166) and wild type (Zhonghua 11) rice. Furthermore, rice lines transformed with a protein gene (OsTCTP) and a regulation gene (Osmi166) were also discriminated by the NIRS method. The performance of PLS-DA in the spectral ranges of 4 000-8 000 cm⁻¹ and 4 000-10 000 cm⁻¹ was compared to obtain the optimal spectral range. As a result, the transgenic and wild type rice were distinguished from each other in the range of 4 000-10 000 cm⁻¹, and the correct classification rate was 100.0% in the validation test. The transgenic rice lines TCTP and mi166 were also distinguished from each other in the range of 4 000-10 000 cm⁻¹, and the correct classification rate was also 100.0%. In conclusion, NIRS combined with PLS-DA can be used for the discrimination of transgenic rice.
Funding: Project F010206 supported by the National Natural Science Foundation of China
Abstract: Diagonal connection structures are complex, and it is difficult to derive the discriminant for the airflow directions of their airways. To overcome these disadvantages, we have applied a multiple regression method to analyze how changes in the patterns of mine airflows affect the stability of a mine ventilation system. The amount of air (Qj) is determined for the major airway and an optimum regression equation was derived for Qi as a function of the independent variable (Ri), i.e., the ventilation resistance between different airways. Corresponding countermeasures are then proposed according to the changes in airflows. The calculated results agree very well with our practical situation, indicating that multiple regression analysis is simple, quick and practical and is therefore an effective method to analyze the stability of mine ventilation systems.
Funding: Supported by The Hong Kong Polytechnic University through the project RU3Y, the Research Grant Council through the project PolyU 5128/13E, the National Natural Science Foundation of China (Grant No. 51778313), and the Cooperative Innovation Center of Engineering Construction and Safety in Shandong Blue Economic Zone
Abstract: This study aims to extend the multivariate adaptive regression splines (MARS)-Monte Carlo simulation (MCS) method for reliability analysis of slopes in spatially variable soils. This approach is used to explore the influences of the multiscale spatial variability of soil properties on the probability of failure (P_f) of the slopes. In the proposed approach, the relationship between the factor of safety and the soil strength parameters characterized with spatial variability is approximated by MARS, with the aid of the Karhunen-Loeve expansion. MCS is subsequently performed on the established MARS model to evaluate P_f. Finally, a nominally homogeneous cohesive-frictional slope and a heterogeneous cohesive slope, both characterized with different spatial variabilities, are used to illustrate the proposed approach. Results showed that the proposed approach can estimate the P_f of slopes in spatially variable soils efficiently and with sufficient accuracy. Moreover, the approach is relatively robust to the influence of different statistics of soil properties, making it an effective and practical tool for addressing slope reliability problems that involve time-consuming deterministic stability models and low levels of P_f. Furthermore, disregarding the multiscale spatial variability of soil properties can lead to overestimating or underestimating P_f. Although the difference is small in general, the multiscale spatial variability of the soil properties must still be considered in the reliability analysis of heterogeneous slopes, especially where cost-effective and accurate designs are required.
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFF0102502), the Key Research Program of Frontier Sciences, CAS (No. QYZDJ-SSW-JSC037), the Youth Innovation Promotion Association, CAS, and the Liao Ning Revitalization Talents Program (No. XLYC1807110).
Abstract: In the spectral analysis of laser-induced breakdown spectroscopy, abundant characteristic spectral lines and severe interference information exist simultaneously in the original spectral data. Here, a feature selection method called recursive feature elimination based on ridge regression (Ridge-RFE) is recommended for the original spectral data to make full use of the valid information in the spectra. In the Ridge-RFE method, the absolute value of the ridge regression coefficient was used as the criterion for screening spectral features, the feature with the smallest absolute weight in the input feature subset was removed by recursive feature elimination (RFE), and the selected features were used as inputs to the partial least squares regression (PLS) model. The Ridge-RFE based PLS model was used to measure Fe, Si, Mg, Cu, Zn and Mn in 51 aluminum alloy samples, and the results showed that the root mean square error of prediction decreased greatly compared with the PLS model using the full spectrum as input. The overall results demonstrate that the Ridge-RFE method is more effective at eliminating redundant features, gives the PLS model better quantitative analysis results and improves model generalization ability.
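The elimination loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a closed-form ridge fit on centred data, removes one feature per round, and omits the downstream PLS model. The helper names (`solve`, `ridge_coefficients`, `ridge_rfe`) and the default penalty `lam` are hypothetical.

```python
def solve(a, b):
    """Solve a·w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def ridge_coefficients(x, y, lam):
    """Ridge weights from (XᵀX + lam·I) w = Xᵀy on centred data."""
    n, p = len(x), len(x[0])
    means = [sum(row[j] for row in x) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[row[j] - means[j] for j in range(p)] for row in x]
    yc = [v - y_mean for v in y]
    gram = [[sum(r[i] * r[j] for r in xc) + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * v for r, v in zip(xc, yc)) for i in range(p)]
    return solve(gram, rhs)


def ridge_rfe(x, y, n_keep, lam=1.0):
    """Return the original column indices of the n_keep surviving features."""
    active = list(range(len(x[0])))
    while len(active) > n_keep:
        sub = [[row[j] for j in active] for row in x]
        w = ridge_coefficients(sub, y, lam)
        # Drop the feature whose ridge weight has the smallest magnitude.
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active
```

Because raw coefficient magnitudes depend on feature scale, spectra would normally be standardized before ranking; that preprocessing is left out here for brevity.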
Funding: Project (70671039) supported by the National Natural Science Foundation of China
Abstract: A combined model based on principal components analysis (PCA) and a generalized regression neural network (GRNN) was adopted to forecast the electricity price in a day-ahead electricity market. PCA was applied to extract the main influences on the day-ahead price, avoiding the strong correlation between the input factors that might influence the electricity price, such as the load of the forecasting hour, other historical loads and prices, weather and temperature; GRNN was then employed to forecast the electricity price from the main information extracted by PCA. To prove the efficiency of the combined model, a case from the PJM (Pennsylvania-New Jersey-Maryland) day-ahead electricity market was evaluated. Compared with a back-propagation (BP) neural network and a standard GRNN, the combined method reduces the mean absolute percentage error by about 3%.
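The GRNN stage of the combined model above can be illustrated with a short sketch. A GRNN prediction is a Gaussian-kernel weighted average of the training targets, so it fits in one function. The sketch assumes the inputs are already principal component scores (the PCA step is omitted), and the function name `grnn_predict` and the default smoothing width `sigma` are hypothetical choices.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=1.0):
    """GRNN output: kernel-weighted mean of train_y around the query point."""
    weights = []
    for xi in train_x:
        # Squared Euclidean distance between a stored pattern and the query.
        d2 = sum((a - b) ** 2 for a, b in zip(xi, query))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))

    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed (query far from every pattern):
        # fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

The only tunable parameter is `sigma`: small values make the prediction follow the nearest training pattern, while large values pull it toward the overall mean, so it is usually chosen by validation.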
Funding: The National High Tech Research and Development Program (863) of China (No. 2002AA601140) and the National Natural Science Foundation of China (No. 50238020)
Abstract: Several parameters, including assimilable organic carbon (AOC), chloramine residual, water temperature, and water residence time, were measured in drinking water from distribution systems in a northern city of China. The measurement results illustrate that when the chloramine residual is more than 0.3 mg/L or the AOC content is below 50 μg/L, the biological stability of drinking water can be controlled. Both chloramine residual and AOC are related to heterotrophic plate counts (HPC, log value), with correlation coefficients of -0.64 and 0.33, respectively. By regression analysis of the survey data, a statistical equation is presented, and it is concluded that disinfectant residual exerts the strongest influence on bacterial growth and that AOC is a suitable index for assessing the biological stability of drinking water.
Funding: Funded by the National Natural Science Foundation of China (81202283, 81473070, 81373102 and 81202267), the Key Grant of Natural Science Foundation of the Jiangsu Higher Education Institutions of China (10KJA330034 and 11KJA330001), the Research Fund for the Doctoral Program of Higher Education of China (20113234110002), and the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine)
Abstract: With recent advances in biotechnology, the genome-wide association study (GWAS) has been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such a single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR applicable to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR, corrected by the Bonferroni method, was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
Funding: Supported by the National Natural Science Foundation of China (No. J07240003, No. 60773084, No. 60603023) and the National Research Fund for the Doctoral Program of Higher Education of China (No. 20070151009)
Abstract: The typical model, which involves the measures support, confidence, and interest, is often adopted for mining association rules. In this model, the related parameters are usually chosen by experience; consequently, the number of useful rules is hard to estimate. If the number is too large, the meaningful rules cannot be extracted effectively. This paper analyzes the meanings of the parameters and uses regression to design a variety of equations relating the number of rules to the parameters. Finally, we experimentally obtain a preferable regression equation. This paper uses multiple correlation coefficients to test the fitting effects of the equations and uses a significance test to verify whether the coefficients of the parameters differ significantly from zero. The regression equation with the larger multiple correlation coefficient is chosen as the optimally fitted equation. With the selected optimal equation, we can predict the number of rules under the given parameters, further optimize the choice of the three parameters and determine their ranges of values.
Abstract: Accurate cost estimation at the early stage of a construction project is a key factor in a project's success. However, it is difficult to estimate construction costs quickly and accurately at the planning stage, when drawings, documentation and the like are still incomplete. As such, various techniques have been applied to estimate construction costs accurately at an early stage, when project information is limited. While the various techniques have their pros and cons, little effort has been made to determine the best technique in terms of cost estimating performance. The objective of this research is to compare the accuracy of three estimating techniques (regression analysis (RA), neural network (NN), and support vector machine (SVM) techniques) by performing estimations of construction costs. By comparing the accuracy of these techniques using historical cost data, it was found that the NN model gave more accurate estimation results than the RA and SVM models. Consequently, it is determined that the NN model is most suitable for estimating the cost of school building projects.
Funding: The National Natural Science Foundation of China (10531030)
Abstract: The importance of detecting heteroscedasticity in regression analysis is widely recognized because efficient inference for the regression function requires that heteroscedasticity be taken into account. In this paper, a simple test for heteroscedasticity in nonparametric regression is proposed based on residual analysis. Furthermore, simulations comparing the proposed test with Dette and Munk's method are conducted to evaluate its performance. The results demonstrate that the proposed method performs quite satisfactorily and is much more powerful than Dette and Munk's method in some cases.
Funding: Provided by the Natural Science Foundation Project (Key) of Chongqing (No. cstc2013jjB0012), the National Natural Science Foundation of China (No. 51434003), and the National Natural Science Foundation of China (No. 51474040)
Abstract: Estimating the intensity of outbursts of coal and gas is important because the intensity and frequency of such outbursts tend to increase in deep mining. Fully understanding the major factors contributing to coal and gas outbursts is significant in evaluating the intensity of an outburst. In this paper, we discuss the correlation between these major factors and the intensity of the outburst using Analysis of Variance (ANOVA) and Contingency Table Analysis (CTA). Regression analysis is used to evaluate the impact of these major factors on the intensity of outbursts based on physical experiments. Based on the evaluation, two simple models, a multiple linear regression model and a multiple nonlinear regression model, were constructed for predicting the intensity of the outburst. The results show that the gas pressure and the initial moisture in the coal mass could be the most significant factors, while porosity is the weakest factor. The P values from Fisher's exact test in CTA are: moisture (0.019), geostress (0.290), porosity (0.650), and gas pressure (0.031). The P values from ANOVA are: moisture (0.094), geostress (0.077), porosity (0.420), and gas pressure (0.051). Furthermore, the multiple nonlinear regression model (RMSE: 3.870) is more accurate than the linear regression model (RMSE: 4.091).