From the economy to political administration, education to health, and the environment to human rights, many of the problems we face have recently gained global importance. Existing state systems, political parties and nation states cannot solve these problems effectively on their own. Not only governments and local authorities but also voluntary organizations, which rely entirely on voluntary activity, have significant roles in solving them. The effective performance of voluntary organizations depends on a growing volunteer population. Individuals' attitudes towards and perceptions of volunteerism play an important role in their contributions to voluntary organizations. The aim of this study is to determine how individuals perceive the concept of volunteerism and their tendency towards it. Furthermore, differences between men's and women's perceptions of and attitudes towards volunteerism are examined. For this purpose, a survey was conducted among undergraduate university students. Gender differences in tendencies and attitudes towards volunteerism were tested using logistic regression. The results reveal that women take part in voluntary activities more than men, and that women perceive volunteerism as "a political position" while men perceive it as "a learning atmosphere and learning process".
Complex terrain and topography have produced an enormous landslide-dammed area in the northeast of Afghanistan. Debris flows, rock avalanches, and landslides are the primary source of the lakes created within the area. Recently, such occurrences have increased because of high displacement and mass movement caused by glacial and seismic activity. In this study, using GIS and the R statistical software, we performed logistic regression modeling to map and predict the probability of landslide-dammed occurrences. In total, 361 lakes were mapped using Google Earth historical imagery. This total was divided into 253 (70%) lakes for modeling and 108 (30%) lakes for model validation. They were randomly selected by creating a fishnet for the study area using ArcToolbox in GIS. Four independent variables that contribute most to landslide-dammed occurrences, namely slope angle, relief class, distance to major water sources and distance to earthquake epicenters, were extracted from DEM (digital elevation model) data at 85-meter resolution. The result is a grid map that classifies the area into Low (16,834.98 km²), Medium (2,217.302 km²) and High (2,013.55 km²) vulnerability to landslide-dammed occurrences. The model was validated using a ROC (receiver operating characteristic) curve in SPSS software. The validation showed a prediction accuracy of 95.1 percent, which is considered satisfactory.
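The validation workflow described in this abstract (a random 70/30 split followed by a ROC check) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `split_70_30` and `roc_auc` are hypothetical, the study itself used a GIS fishnet and SPSS rather than this code, and the toy labels and scores are made up.

```python
import random

def split_70_30(items, seed=0):
    """Shuffle the items and split them into modeling (70%) and validation (30%) sets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_model = round(0.7 * len(shuffled))
    return shuffled[:n_model], shuffled[n_model:]

def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the share of (positive, negative)
    pairs in which the positive case gets the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Splitting 361 mapped lakes 70/30 gives 253 for modeling and 108 for validation.
train, test = split_70_30(range(361))
print(len(train), len(test))  # 253 108

# Toy predicted probabilities (hypothetical values, not from the study).
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # 0.75
```

The pairwise form of the AUC is quadratic in the sample size, which is acceptable for a validation set of about a hundred lakes; a rank-based formula would be the usual choice for larger data.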
Cyber losses, measured as the number of records breached in cyber incidents, commonly feature a significant proportion of zeros together with distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses with a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that simultaneously models zero, moderate and large losses and accounts for heterogeneous effects in the mixture components. To apply the proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis, and use a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses to capture the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. Combined with our frequency model for data breaches (a generalized linear mixed model), aggregate loss distributions are investigated, and applications to cyber insurance pricing and risk management are discussed.
In oil and gas exploration, elucidating the complex interdependencies among geological variables is paramount. Our study applies advanced regression analysis methods, aiming not just to predict geophysical logging curve values but also to mitigate the hydrocarbon depletion observed in geochemical logging. Through a rigorous assessment, we explore the efficacy of eight regression models, divided into linear and nonlinear groups, to accommodate the multifaceted nature of geological datasets. Our linear model suite encompasses the Standard Equation, Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net, each presenting distinct advantages. The Standard Equation serves as a foundational benchmark, whereas Ridge Regression adds penalty terms to counteract overfitting, thus bolstering model robustness in the presence of multicollinearity. LASSO performs variable selection to streamline models and enhance their interpretability, while Elastic Net combines the merits of Ridge Regression and LASSO, balancing model complexity and comprehensibility. On the nonlinear front, Gradient Descent, Kernel Ridge Regression, Support Vector Regression, and Piecewise Function-Fitting methods are considered. Gradient Descent offers computational efficiency in optimizing solutions, Kernel Ridge Regression leverages the kernel trick to capture nonlinear patterns, and Support Vector Regression is proficient in forecasting extreme values, which is pivotal for exploration risk assessment. The Piecewise Function-Fitting approach, tailored for geological data, allows flexible modeling of variable interrelations and accommodates abrupt shifts in data trends. Our analysis identifies Ridge Regression, particularly when augmented by Piecewise Function-Fitting, as superior in recouping hydrocarbon losses, underscoring its utility in refining resource quantification. Meanwhile, Kernel Ridge Regression emerges as a noteworthy strategy for improving porosity-logging curve prediction for well A, evidencing its suitability for intricate geological structures. This research attests to the scientific merit and broad relevance of these regression techniques over conventional methods and points to new opportunities for their deployment in the oil and gas sector. The insights gained from these advanced modeling strategies are set to transform geological and engineering practices in hydrocarbon prediction, evaluation, and recovery.
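The penalty-term idea behind Ridge Regression mentioned in this abstract can be illustrated with its simplest case. The sketch below is not the authors' implementation: it assumes a single centered predictor, for which the ridge slope has the closed form Sxy / (Sxx + λ), and the function name `ridge_slope` and the toy data are hypothetical.

```python
def ridge_slope(x, y, lam):
    """Closed-form ridge slope for one centered predictor: Sxy / (Sxx + lam).
    With lam = 0 this reduces to the ordinary least-squares slope."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / (sxx + lam)

# Toy data (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols = ridge_slope(x, y, 0.0)      # no penalty: the least-squares slope
shrunk = ridge_slope(x, y, 10.0)  # the penalty pulls the slope towards zero
print(ols, shrunk)
```

Because the penalty only adds to the denominator, a larger λ always shrinks the coefficient. With several correlated predictors the same shrinkage stabilizes the estimates, which is the multicollinearity benefit the abstract refers to.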
For the composition analysis and identification of ancient glass products, methods including L1 regularization, K-Means cluster analysis and the elbow rule were used to build logistic regression, cluster analysis and hyper-parameter test models. SPSS, Python and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
This study aims to analyze and predict the relationship between the average price per box in the cigarette market of City A and government procurement, providing a scientific basis and support for decision-making. By reviewing relevant theories and literature, qualitative prediction methods, regression prediction models, and other related theories were explored. Through the analysis of annual cigarette sales data and government procurement data in City A, a comprehensive understanding of the development of the tobacco industry and the economic trends of tobacco companies in the county was obtained. By predicting and analyzing the average price per box of cigarette sales across different years, corresponding prediction results were derived and compared with actual sales data. The prediction results indicate that the correlation coefficient between the average price per box of cigarette sales and government procurement is 0.982, implying that government procurement accounts for 96.4% of the variation in the average price per box of cigarettes (0.982² ≈ 0.964). These findings offer an in-depth exploration of the relationship between the average price per box of cigarettes in City A and government procurement, providing a scientific foundation for corporate decision-making and market operations.
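The step from a correlation coefficient of 0.982 to 96.4% is the usual squaring of r into a coefficient of determination. The short check below assumes a simple linear regression with a single predictor, where R² equals r²; the abstract does not state the model form, and the function name `explained_share` is hypothetical.

```python
def explained_share(r):
    """Share of variance explained in a one-predictor linear regression: R^2 = r^2."""
    return r ** 2

print(round(explained_share(0.982), 3))  # 0.964
```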
Municipal solid waste (MSW) generation is strongly linked to a rising human population and expanding urban areas, with significant implications for urban metabolism and for the redefinition of space and place values. Effective municipal solid waste management calls for interdisciplinary strategies. Such knowledge and skills are essential for identifying the sources of waste generation as well as the means of waste storage, collection, recycling, transportation, handling/treatment, disposal, and monitoring. This study was conducted in Dar es Salaam city. Motivated by the aim of modeling solid waste minimization performance at source, data were collected through focus group discussions (FGD) with ward-level local government officers and triangulated with literature and documentary review. The main themes of the FGD were situational factors (SFA) and local government by-laws (LGBY). In the FGD sessions, the SFA sub-themes sought to understand how MSW minimization is related to the presence and effect of services such as land use planning, availability of landfills, solid waste transfer stations, material recovery facilities, incinerators, solid waste collection bins, solid waste trucks, the solid waste management budget and solid waste collection agents. Similarly, the FGD on LGBY covered sub-themes such as the contents of the by-law, community awareness of the by-law, and by-law enforcement mechanisms. Data preparation applied an analytical hierarchy process, while data analysis applied an ordinary least squares (OLS) regression model to the sub-criteria that explain SFA and LGBY, and used the OLS standardized residuals as variables in a geographically weighted regression with a resolution of 241 × 241 meters in ArcMap v10.5. Results showed that situational factors and local government by-laws have a strong relationship with the rate of minimizing solid waste dumping in water bodies (local R square = 0.94).
This study used a spatial autoregression (SAR) model and a geographically weighted regression (GWR) model to model the spatial patterns of farmland density and its temporal change in Gucheng County, Hubei Province, China in 1999 and 2009, and discussed the difference between global and local spatial autocorrelations in terms of spatial heterogeneity and non-stationarity. Results showed that strong positive spatial correlations existed in the spatial distributions of farmland density, its temporal change and the driving factors, and that the coefficients of spatial autocorrelation decreased as the spatial lag distance increased. The SAR models revealed the global spatial relations between dependent and independent variables, while the GWR model showed the spatially varying goodness of fit and the local weighting coefficients of the driving factors and farmland indices (i.e., farmland density and temporal change). The GWR model follows a smooth process when constructing the farmland spatial model. Its coefficients show the degree of influence of different driving factors on farmland at different geographical locations. The performance indices showed that the GWR model produced more accurate simulation results than the other models at different times, with a clear improvement in precision. The global and local farmland models used in this study showed different characteristics in the spatial distributions of farmland indices at different scales, which may provide a theoretical basis for protecting farmland from the influence of different driving factors.
The soil water status was investigated under soil surface mulching techniques and two drip line depths from the soil surface (DL). The techniques were black plastic film (BPF), palm tree waste (PTW), and no mulching (NM) as the control treatment. The DL were 15 cm and 25 cm, with surface drip irrigation used as the control. The results indicated that both BPF and PTW mulching enhanced the soil water retention capacity, and that subsurface drip irrigation saved about 6% water compared with NM. Furthermore, the water savings at a DL of 25 cm were lower (15-20 mm) than those at a DL of 15 cm (19-24 mm), whereas surface drip irrigation consumed more water. The distribution of soil water content (θv) for BPF and PTW was more useful than for NM. Hence, mulching the soil with PTW, which has lower costs, and using a DL of 15 cm are recommended. The θv values were derived using multiple linear regression (MLR) and multiple nonlinear regression (MNLR) models. Multiple regression analysis revealed the superiority of the MLR over the MNLR model: in the training and testing processes, respectively, the MLR model had coefficients of correlation of 0.86 and 0.88, root mean square errors of 0.37 and 0.35, and indices of agreement of 0.99 and 0.93. Moreover, DL and spacing from the drip line had a significant effect on the estimation of θv.
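Two of the performance measures quoted in this abstract, the root mean square error and the index of agreement, can be computed as in the sketch below. The abstract does not say which index of agreement was used; Willmott's d is assumed here, and the function names and toy values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (assumed definition):
    d = 1 - sum((O - P)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den

# Toy soil-water-content values (made up for illustration).
obs = [1.0, 2.0, 3.0]
print(rmse(obs, obs), index_of_agreement(obs, obs))  # 0.0 1.0
print(index_of_agreement(obs, [2.0, 2.0, 2.0]))      # 0.0
```

A perfect prediction gives d = 1, while predicting the observed mean everywhere gives d = 0, so the index is read on a 0 to 1 scale alongside the RMSE.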
The global pandemic of coronavirus disease 2019 (COVID-19) has significantly affected tourism, especially in Spain, as it was among the first countries to be affected by the pandemic and is among the world's biggest tourist destinations. Stock market values are responding to the evolution of the pandemic, especially in the case of tourist companies. Therefore, being able to quantify this relationship allows us to predict the effect of the pandemic on shares in the tourism sector, thereby improving the response to the crisis by policymakers and investors. Accordingly, a dynamic regression model was developed to predict the behavior of shares in the Spanish tourism sector according to the evolution of the COVID-19 pandemic in the medium term. It has been confirmed that both the number of deaths and the number of cases are good predictors of abnormal stock prices in the tourism sector.
Remaining useful life (RUL) prediction is one of the most crucial elements in prognostics and health management (PHM). To address imperfect prior information, this paper proposes an RUL prediction method based on a nonlinear random coefficient regression (RCR) model that fuses failure time data. Firstly, some properties of parameter estimation based on the nonlinear RCR model are given. Based on these properties, the failure time data can be reasonably fused as prior information. Specifically, the fixed parameters are calculated from the field degradation data of the evaluated equipment, and the prior information of the random coefficient is estimated by fusing the failure time data of equipment of the same type. Then, the prior information of the random coefficient is updated online under the Bayesian framework, and the probability density function (PDF) of the RUL is derived, taking the limitation of the failure threshold into account. Finally, two case studies are used for experimental verification. Compared with the traditional Bayesian method, the proposed method can effectively reduce the influence of imperfect prior information and improve the accuracy of RUL prediction.
Machine learning (ML) models provide great opportunities to accelerate novel material development, offering a virtual alternative to laborious and resource-intensive empirical methods. In this work, the second of a two-part study, an ML approach is presented that offers accelerated digital design of Mg alloys. Four ML regression algorithms were systematically evaluated to rationalise the complex relationships in Mg-alloy data and to capture the composition-processing-property patterns. Cross-validation and hold-out set validation techniques were utilised for unbiased estimation of model performance. Using atomic and thermodynamic properties of the alloys, feature augmentation was examined to define the most descriptive representation spaces for the alloy data. Additionally, a graphical user interface (GUI) webtool was developed to facilitate the use of the proposed models in predicting the mechanical properties of new Mg alloys. The results demonstrate that the random forest regression model and the neural network are robust models for predicting the ultimate tensile strength and ductility of Mg alloys, with accuracies of ~80% and 70% respectively. The models developed in this work are a step towards high-throughput screening of novel candidates for target mechanical properties and provide ML-guided alloy design.
In the era of big data, traditional regression models cannot deal with uncertain big data efficiently and accurately. To make up for this deficiency, this paper proposes a quantum fuzzy regression model, which uses fuzzy theory to describe the uncertainty in big data sets and uses quantum computing to exponentially improve the efficiency of data set preprocessing and parameter estimation. Data envelopment analysis (DEA) is used to calculate the degree of importance of each data point. Meanwhile, the Harrow, Hassidim and Lloyd (HHL) algorithm and quantum swap circuits are used to improve the efficiency of high-dimensional data matrix calculation. Applying the quantum fuzzy regression model to small-scale financial data shows that its accuracy is greatly improved compared with the quantum regression model. Moreover, owing to the introduction of quantum computing, the speed of processing high-dimensional data matrices improves exponentially compared with the fuzzy regression model. The proposed quantum fuzzy regression model combines the advantages of fuzzy theory and quantum computing: it can efficiently calculate high-dimensional data matrices and complete parameter estimation using quantum computing while retaining the uncertainty in big data. Thus, it is a new model for efficient and accurate big data processing in uncertain environments.
The present paper proposes a new robust estimator for Poisson regression models. We use weighted maximum likelihood estimators, which are regarded as Mallows-type estimators. We perform a Monte Carlo simulation study to assess the performance of the proposed estimator compared with the maximum likelihood estimator and some robust methods. The results show that, in general, all robust methods in this paper perform better than the classical maximum likelihood estimator when the data contain outliers. The proposed estimator showed the best performance among the robust estimators.
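A weighted maximum likelihood fit for Poisson regression can be sketched as follows. This is not the estimator proposed in the paper, whose weight function the abstract does not give: the sketch assumes one covariate with a log link, and the weight rule in `mallows_weights` (downweighting points whose covariate lies more than `c` median absolute deviations from the median) is a hypothetical stand-in for a Mallows-type weight that depends only on the design.

```python
import math
import statistics

def mallows_weights(x, c=2.0):
    """Hypothetical design-based weights: 1 inside c MADs of the median,
    shrinking as c * MAD / |x - median| outside that range."""
    med = statistics.median(x)
    mad = statistics.median(abs(v - med) for v in x)
    if mad == 0:
        return [1.0] * len(x)
    return [1.0 if abs(v - med) <= c * mad else c * mad / abs(v - med) for v in x]

def fit_weighted_poisson(x, y, w, iters=50):
    """Newton iterations for the weighted score equations
    sum_i w_i * (y_i - exp(b0 + b1 * x_i)) * (1, x_i) = 0."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            mu = math.exp(b0 + b1 * xi)
            g0 += wi * (yi - mu)
            g1 += wi * (yi - mu) * xi
            h00 += wi * mu
            h01 += wi * mu * xi
            h11 += wi * mu * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # solve the 2x2 Newton system
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

# Noise-free toy data on the curve exp(0.5 + 0.3 x): the fit recovers (0.5, 0.3).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [math.exp(0.5 + 0.3 * v) for v in x]
b0, b1 = fit_weighted_poisson(x, y, mallows_weights(x))
print(round(b0, 6), round(b1, 6))

# A design point far from the rest receives a weight below 1.
print(mallows_weights([0, 1, 2, 3, 4, 5, 6, 20])[-1])
```

Setting every weight to 1 gives back the classical maximum likelihood estimator, so the weights are the only place where robustness enters in this sketch.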
The aim of this study was to model the undrained shear strength (USS) of soil found in the coastal region of the Niger Delta in Nigeria in terms of selected soil properties. The USS is a key parameter needed for most geotechnical and structural designs. The USS of soft clays can be difficult to determine accurately in the laboratory because of the difficulty of remoulding the clay to its in-situ conditions before testing, and more accurate tests such as the Cone Penetration Test (CPT) can be quite expensive. This study was carried out at the Escravos site, which is located in Delta State, Nigeria. Three boreholes were drilled and soil samples were collected at 0.75 m intervals up to a depth of 45 m. Laboratory tests were used to obtain the moisture content, bulk unit weight, and liquid and plastic limits, while the CPT was used to obtain the undrained shear strength. The soil samples were classified using the Unified Soil Classification System, and various models relating the USS to the soil properties were developed. The results showed that most of the soils at the Escravos site were predominantly inorganic clay of high plasticity, which is problematic because of the swelling and shrinking nature of this type of soil. The model developed showed that the soil properties that gave the best fit with the USS were the moisture content and the effective stress of the soil. The coefficient of determination (R²) and the root mean square error (RMSE) obtained for this model were 0.805 and 6.37 kN/m², respectively.
Social networks are the mainstream medium of current information dissemination, and it is particularly important to accurately predict their propagation patterns. In this paper, we introduce a social network propagation model that integrates multiple linear regression with an infectious disease model. Firstly, we propose features that affect social network communication along three dimensions. Then, we predict node influence via multiple linear regression. Lastly, we use the node influence as the state transition of the infectious disease model to predict the trend of information dissemination in social networks. The experimental results on a real social network dataset show that the predictions of the model are consistent with the actual information dissemination trends.
A variety of test methodologies are commonly used to assess whether a photovoltaic (PV) system can perform in line with expectations generated by a computer simulation. One of the methodologies most commonly used across the PV industry is ASTM E2848. The ASTM E2848-13 (2023) test method provides measurement and analysis procedures for determining the capacity of a specific photovoltaic system built in a particular place and operating under natural sunlight. This test method is mainly used for acceptance testing of newly installed photovoltaic systems, reporting of DC or AC system performance, and monitoring of photovoltaic system performance. The purpose of the PV Capacity Test and the modeled energy test is to verify that the integrated system formed from all components of the PV Project has a production capacity that achieves the Guaranteed Capacity and the Guaranteed modeled AEP under the measured weather conditions that occur when each PV Capacity Test is conducted. In this paper, we discuss the purpose and scope of the ASTM E2848 PV Capacity test plan, its methodology, selection of reporting conditions (RC), data requirements, calculation of results, reporting, challenges, acceptance criteria for pass/fail test results, cure period, and sole remedy for EPC contractors for bifacial irradiance.
Under-fitting problems usually occur in regression models for dam safety monitoring. To overcome the local convergence of the regression, a genetic algorithm (GA) was proposed that uses real parameter coding, a ranking selection operator, an arithmetical crossover operator and a uniform mutation operator, with the least-squares error between the observed and computed values as its fitness function. An elitist strategy was used to improve the speed of convergence. The modified genetic algorithm was then applied to reassess the coefficients of the regression model, and a genetic regression model was set up. As an example, a slotted gravity dam in Northeast China was studied. The computational results show that the genetic regression model can solve the under-fitting problem effectively.
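The operators named in this abstract (real coding, ranking selection, arithmetical crossover, uniform mutation, elitism, least-squares fitness) can be combined as in the sketch below. It is an illustration, not the authors' algorithm: the straight-line model, the population size, the search bounds, the mutation rate and the function names `sse` and `genetic_fit` are all assumptions chosen for a small runnable example.

```python
import random

def sse(coefs, xs, ys):
    """Least-squares fitness: sum of squared errors of the line y = a + b * x."""
    a, b = coefs
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def genetic_fit(xs, ys, pop_size=40, generations=200,
                bounds=(-10.0, 10.0), mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Real parameter coding: each individual is a list of float coefficients.
    pop = [[rng.uniform(lo, hi), rng.uniform(lo, hi)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: sse(c, xs, ys))  # best individual first
        history.append(sse(pop[0], xs, ys))
        # Ranking selection: selection weight depends on rank, not raw fitness.
        weights = [pop_size - i for i in range(pop_size)]
        children = [pop[0][:]]  # elitist strategy: keep the best unchanged
        while len(children) < pop_size:
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            alpha = rng.random()
            # Arithmetical crossover: convex combination of the two parents.
            child = [alpha * u + (1 - alpha) * v for u, v in zip(p1, p2)]
            # Uniform mutation: replace a gene by a uniform draw from the bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    best = min(pop, key=lambda c: sse(c, xs, ys))
    history.append(sse(best, xs, ys))
    return best, history

# Toy monitoring data lying on y = 1 + 2 x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
best, history = genetic_fit(xs, ys)
print(best, history[-1])
```

Because the elite individual is copied into every new generation, the best error recorded in `history` can never increase, which is the convergence benefit of the elitist strategy mentioned in the abstract.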
A fuzzy observations-based radial basis function neural network (FORBFNN) is presented for modeling nonlinear systems in which the observations of the response are imprecise but can be represented as fuzzy membership functions. In the FORBFNN model, the weight coefficients of the nodes in the hidden layer are identified using the fuzzy expectation-maximization (EM) algorithm, whereas the optimal number of these nodes, as well as the centers and widths of the radial basis functions, are determined automatically by a data-driven method. Namely, the method starts with an initial node, and new nodes are then added to the hidden layer according to a set of rules. This procedure continues until the model meets the preset requirements. The method considers both the accuracy and the complexity of the model. Numerical simulation results show that the modeling method is effective and that the established model has high prediction accuracy.
Because of the correlation among the parameters, partial least squares regression (PLSR) was applied to build the model and obtain the regression equation. The improved algorithm greatly simplified the calculation process by reducing the amount of computation. An orthogonal design was adopted in this experiment, so that every sample was strongly representative, which reduced the experimental time while still yielding complete test data. For the weld formation problem in high-current gas metal arc welding, the auxiliary analysis technique of PLSR was discussed, and the regression equation relating the form factors (i.e. surface width, weld penetration and weld reinforcement) to the process parameters (i.e. wire feed rate, wire extension, welding speed, gas flow, welding voltage and welding current) was given. The correlation structure among the variables was analyzed, and a certain correlation was found between the independent variable matrix X and the dependent variable matrix Y. The regression analysis shows that the welding speed is the main influence on weld formation, while variation of the gas flow within a certain range has little influence on it. The fitting plot of regression accuracy is given. The fitting quality of the regression equation is basically satisfactory.
文摘From economy to political administrations, education to health, environment to human rights, many problems we met have gained a global importance in recent days. Existing state systems, political parties and nation states are not adequate for solving these problems in question effectively on their own. Not only governments and local authorities but also voluntary organizations based on completely voluntary activities have significant roles in solving these problems. Effective performance of voluntary organizations depends on increasing volunteer population. Individuals' attitudes or their perception of understanding volunteerism play an important role in their contributions to voluntary organizations. The aim of this study is to determine individuals' ways of perceiving volunteerism concept and their tendency towards it. Furthermore, differences between men and women's perception and attitudes towards volunteerism concept have been examined. For this purpose, a survey has been conducted over university students of bachelor's degree. Tendencies and attitudes towards volunteerism compared to gender differences have been tested via logistic regression method. Research results reveal that women take part in voluntary activities more than men and women perceive volunteerism as "a political position" while men perceive volunteerism as "a learning atmosphere and learning process".
文摘A complex terrain and topography resulted in an enormous landslide-dammed area northeast of Afghanistan. Moreover, debris, rock avalanches, and landslides occurrences are the primary source of lakes created within the area. Recently, instances have increased because of the high displacement and mass movement by glacial and seismic activities. In this study, using GIS and R statistical software, we performed a logistic regression modeling in order to map and predict the probability of landslides-dammed occurrences. Totally, 361 lakes were mapped using Google Earth historical imagery. This total was divided into 253 (70%) lakes for modeling and 801 (30%) lakes for the model validation. They were randomly selected by creating a fishnet for the study area using Arc toolbox in GIS. Four independent variables that are mostly contributed to landslide-dammed occurrences consisting of slope angles, relief classes, distances to major water sources and earthquake epicenters, were extracted from DEM (digital elevation model) data using 85-meter resolution. The result is a grid map that classified the area into Low (16,834.98 km2), Medium (2,217.302 kin:) and High (2,013.55 km2) vulnerability to landslide-dammed occurrences. Overall, the model result has been validated by using a ROC (receiver operator characteristic) curve available in SPSS software. The model validation showed a 95.1 percent prediction accuracy that is considered satisfactory.
Abstract: Cyber losses, measured as the number of records breached in cyber incidents, commonly feature a significant portion of zeros together with distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses using a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that can simultaneously model zero, moderate and large losses and consider heterogeneous effects in the mixture components. To apply the proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis, and use a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses to capture the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. Combining this with our frequency model (a generalized linear mixed model) for data breaches, we investigate aggregate loss distributions and discuss applications to cyber insurance pricing and risk management.
Abstract: In oil and gas exploration, elucidating the complex interdependencies among geological variables is paramount. Our study applies advanced regression analysis methods, aiming not just to predict geophysical logging curve values but also to mitigate the hydrocarbon depletion observed in geochemical logging. Through a rigorous assessment, we explore the efficacy of eight regression models, divided into linear and nonlinear groups, to accommodate the multifaceted nature of geological datasets. Our linear model suite encompasses the Standard Equation, Ridge Regression, Least Absolute Shrinkage and Selection Operator, and Elastic Net, each presenting distinct advantages. The Standard Equation serves as a foundational benchmark, whereas Ridge Regression implements penalty terms to counteract overfitting, thus bolstering model robustness in the presence of multicollinearity. The Least Absolute Shrinkage and Selection Operator performs variable selection to streamline models and enhance their interpretability, while Elastic Net combines the merits of Ridge Regression and the Least Absolute Shrinkage and Selection Operator, offering a balanced solution to model complexity and comprehensibility. On the nonlinear front, Gradient Descent, Kernel Ridge Regression, Support Vector Regression, and Piecewise Function-Fitting methods introduce innovative approaches. Gradient Descent assures computational efficiency in optimizing solutions, Kernel Ridge Regression leverages the kernel trick to navigate nonlinear patterns, and Support Vector Regression is proficient in forecasting extremes, which is pivotal for exploration risk assessment. The Piecewise Function-Fitting approach, tailored for geological data, facilitates adaptable modeling of variable interrelations and accommodates abrupt shifts in data trends. Our analysis identifies Ridge Regression, particularly when augmented by Piecewise Function-Fitting, as superior in recouping hydrocarbon losses, underscoring its utility in refining resource quantification. Meanwhile, Kernel Ridge Regression emerges as a noteworthy strategy for improving porosity-logging curve prediction for well A, evidencing its aptness for intricate geological structures. This research attests to the advantages and broad relevance of these regression techniques over conventional methods while opening new horizons for their deployment in the oil and gas sector. The insights gained from these advanced modeling strategies are set to transform geological and engineering practices in hydrocarbon prediction, evaluation, and recovery.
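The way a ridge penalty counteracts overfitting can be illustrated with a one-predictor fit. This is a minimal sketch on made-up data, not the study's models: the function name `fit_slope`, the sample values, and the penalty of 5.0 are all hypothetical.

```python
def fit_slope(x, y, penalty=0.0):
    """Slope of y on x with an unpenalized intercept; penalty=0 is ordinary least squares."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    # The ridge penalty is added to the denominator, shrinking the slope toward zero.
    return sxy / (sxx + penalty)

# Hypothetical data, roughly y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
ols_slope = fit_slope(x, y)
ridge_slope = fit_slope(x, y, penalty=5.0)
print(ols_slope, ridge_slope)
```

The penalized slope is always smaller in magnitude than the least-squares slope; Elastic Net mixes this kind of shrinkage with the lasso's absolute-value penalty.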
Abstract: For the composition analysis and identification of ancient glass products, methods including L1 regularization, K-Means cluster analysis and the elbow method were used together to build logistic regression, cluster analysis and hyper-parameter test models. SPSS, Python and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
Funding: National Social Science Fund Project "Research on the Operational Risks and Prevention of Government Procurement of Community Services Project System" (Project No. 21CSH018); Research and Application of SDM Cigarette Supply Strategy Based on Consumer Data Analysis (Project No. 2023ASXM07).
Abstract: This study aims to analyze and predict the relationship between the average price per box in the cigarette market of City A and government procurement, providing a scientific basis and support for decision-making. Relevant theories and literature were reviewed, covering qualitative prediction methods, regression prediction models, and related theory. Through the analysis of annual cigarette sales data and government procurement data in City A, a comprehensive understanding of the development of the tobacco industry and the economic trends of tobacco companies in the county was obtained. The average price per box of cigarette sales was predicted for different years, and the predictions were compared with actual sales data. The results indicate that the correlation coefficient between the average price per box of cigarette sales and government procurement is 0.982, implying that government procurement accounts for 96.4% of the changes in the average price per box of cigarettes. These findings offer an in-depth exploration of the relationship between the average price per box of cigarettes in City A and government procurement, providing a scientific foundation for corporate decision-making and market operations.
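The step from the correlation coefficient to the explained share can be checked directly. The sketch below assumes a simple linear regression with a single predictor, in which case the coefficient of determination equals the squared correlation coefficient.

```python
r = 0.982           # correlation coefficient reported in the abstract
r_squared = r ** 2  # share of variance explained, assuming a one-predictor linear fit
print(f"{r_squared:.1%}")
```

Squaring 0.982 gives 0.964324, which rounds to the 96.4% quoted in the abstract.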
Abstract: Municipal solid waste (MSW) generation is strongly linked to a rising human population and expanding urban areas, with significant implications for urban metabolism as well as for the redefinition of space and place values. Effective municipal solid waste management calls for interdisciplinary strategies. Such knowledge and skills are paramount for uncovering the sources of waste generation as well as the means of waste storage, collection, recycling, transportation, handling and treatment, disposal, and monitoring. This study was conducted in the city of Dar es Salaam. To model solid waste minimization performance at source, data were collected through focus group discussions (FGD) with ward-level local government officers and triangulated with a literature and documentary review. The main themes of the FGD were situational factors (SFA) and local government by-laws (LGBY). In the FGD sessions, the SFA sub-themes sought to understand how MSW minimization is related to the presence and effect of services such as land use planning, landfills, solid waste transfer stations, material recovery facilities, incinerators, solid waste collection bins, solid waste trucks, the solid waste management budget and solid waste collection agents. Similarly, the FGD on LGBY covered sub-themes such as the contents of the by-law, community awareness of the by-law, and by-law enforcement mechanisms. Data preparation applied an analytical hierarchy process. Data analysis applied an ordinary least squares (OLS) regression model to the sub-criteria that explain SFA and LGBY, and used the OLS standardized residuals as variables in a geographically weighted regression with a resolution of 241 × 241 meters in ArcMap v10.5. Results showed that situational factors and local government by-laws have a strong relationship with the rate of minimizing solid waste dumping in water bodies (local R square = 0.94).
Funding: Under the auspices of the National Natural Science Foundation of China (No. 40601073, 41101192, 41201571); the Fundamental Research Funds for the Central Universities (No. 2011PY112, 2011QC041, 2011QC091); the Huazhong Agricultural University Scientific & Technological Self-innovation Foundation (No. 2011SC21).
Abstract: This study used a spatial autoregression (SAR) model and a geographically weighted regression (GWR) model to model the spatial patterns of farmland density and its temporal change in Gucheng County, Hubei Province, China in 1999 and 2009, and discussed the difference between global and local spatial autocorrelations in terms of spatial heterogeneity and non-stationarity. Results showed that strong positive spatial correlations existed in the spatial distributions of farmland density, its temporal change and the driving factors, and that the coefficients of spatial autocorrelation decreased as the spatial lag distance increased. The SAR models revealed the global spatial relations between dependent and independent variables, while the GWR model showed the spatially varying goodness of fit and the local weighting coefficients of the driving factors and farmland indices (i.e., farmland density and temporal change). The GWR model has a smooth process when constructing the farmland spatial model. The coefficients of the GWR model can show the degree of influence of different driving factors on farmland at different geographical locations. The performance indices showed that the GWR model produced more accurate simulation results than the other models at both times, with a clear improvement in precision. The global and local farmland models used in this study showed different characteristics in the spatial distributions of farmland indices at different scales, which may provide a theoretical basis for protecting farmland from the influence of different driving factors.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group No. RG-1440-022.
Abstract: The soil water status was investigated under soil surface mulching techniques and two drip line depths from the soil surface (DL). The mulching techniques were black plastic film (BPF), palm tree waste (PTW), and no mulching (NM) as the control treatment. The DL were 15 cm and 25 cm, with surface drip irrigation used as the control. The results indicated that both BPF and PTW mulching enhanced the soil water retention capacity, and there was about 6% water saving in subsurface drip irrigation compared with NM. Furthermore, the water savings at a DL of 25 cm were lower (15-20 mm) than those at a DL of 15 cm (19-24 mm), whereas surface drip irrigation consumed more water. The distribution of soil water content (θv) for BPF and PTW was more useful than for NM. Hence, mulching the soil with PTW at a DL of 15 cm is recommended due to the lower costs. The θv values were derived using multiple linear regression (MLR) and multiple nonlinear regression (MNLR) models. Multiple regression analysis revealed the superiority of the MLR over the MNLR model: in the training and testing processes, respectively, the MLR model had coefficients of correlation of 0.86 and 0.88, root mean square errors of 0.37 and 0.35, and indices of agreement of 0.99 and 0.93. Moreover, DL and spacing from the drip line had a significant effect on the estimation of θv.
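Two of the fit statistics quoted in this abstract can be computed as in the sketch below. It assumes that "index of agreement" refers to Willmott's index d, which the abstract does not confirm; the soil water values and the function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def index_of_agreement(observed, predicted):
    """Willmott's index of agreement d (1 means perfect agreement)."""
    o_mean = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den

# Hypothetical soil water contents (observed vs. model estimates).
obs = [20.1, 22.4, 25.0, 27.3, 30.2]
est = [20.5, 22.0, 25.4, 27.0, 29.8]
print(rmse(obs, est), index_of_agreement(obs, est))
```

Both functions return their ideal values (0 and 1) when the estimates equal the observations, which makes a convenient sanity check.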
Abstract: The global pandemic of coronavirus disease 2019 (COVID-19) has significantly affected tourism, especially in Spain, which was among the first countries to be affected by the pandemic and is among the world's biggest tourist destinations. Stock market values are responding to the evolution of the pandemic, especially in the case of tourist companies. Quantifying this relationship therefore allows us to predict the effect of the pandemic on shares in the tourism sector, thereby improving the response to the crisis by policymakers and investors. Accordingly, a dynamic regression model was developed to predict the medium-term behavior of shares in the Spanish tourism sector according to the evolution of the COVID-19 pandemic. It was confirmed that both the number of deaths and the number of cases are good predictors of abnormal stock prices in the tourism sector.
Funding: Supported by the National Natural Science Foundation of China (61703410, 61873175, 62073336, 61873273, 61773386, 61922089).
Abstract: Remaining useful life (RUL) prediction is one of the most crucial elements in prognostics and health management (PHM). To address imperfect prior information, this paper proposes an RUL prediction method based on a nonlinear random coefficient regression (RCR) model that fuses failure time data. Firstly, some useful properties of parameter estimation based on the nonlinear RCR model are given. Based on these properties, the failure time data can reasonably be fused as prior information. Specifically, the fixed parameters are calculated from the field degradation data of the evaluated equipment, and the prior information of the random coefficient is estimated by fusing the failure time data of congeneric equipment. Then, the prior information of the random coefficient is updated online under the Bayesian framework, and the probability density function (PDF) of the RUL is derived with the limitation of the failure threshold taken into account. Finally, two case studies are used for experimental verification. Compared with the traditional Bayesian method, the proposed method can effectively reduce the influence of imperfect prior information and improve the accuracy of RUL prediction.
Funding: The support of the Monash-IITB Academy Scholarship; the Australian Research Council for funding the present research (DP190103592).
Abstract: Machine learning (ML) models provide great opportunities to accelerate novel material development, offering a virtual alternative to laborious and resource-intensive empirical methods. In this work, the second of a two-part study, an ML approach is presented that offers accelerated digital design of Mg alloys. Four ML regression algorithms were systematically evaluated to rationalise the complex relationships in Mg-alloy data and to capture the composition-processing-property patterns. Cross-validation and hold-out set validation techniques were used for unbiased estimation of model performance. Using atomic and thermodynamic properties of the alloys, feature augmentation was examined to define the most descriptive representation spaces for the alloy data. Additionally, a graphical user interface (GUI) webtool was developed to facilitate the use of the proposed models in predicting the mechanical properties of new Mg alloys. The results demonstrate that the random forest regression model and the neural network are robust models for predicting the ultimate tensile strength and ductility of Mg alloys, with accuracies of ~80% and 70% respectively. The models developed in this work are a step towards high-throughput screening of novel candidates for target mechanical properties and provide ML-guided alloy design.
Funding: This work is supported by the National Natural Science Foundation of China (No. 62076042); the Key Research and Development Project of Sichuan Province (Nos. 2021YFSY0012, 2020YFG0307, 2021YFG0332); the Science and Technology Innovation Project of Sichuan (No. 2020017); the Key Research and Development Project of Chengdu (No. 2019-YF05-02028-GX); the Innovation Team of Quantum Security Communication of Sichuan Province (No. 17TD0009); the Academic and Technical Leaders Training Funding Support Projects of Sichuan Province (No. 2016120080102643).
Abstract: In the era of big data, traditional regression models cannot deal with uncertain big data efficiently and accurately. To make up for this deficiency, this paper proposes a quantum fuzzy regression model, which uses fuzzy theory to describe the uncertainty in big data sets and uses quantum computing to exponentially improve the efficiency of data set preprocessing and parameter estimation. Data envelopment analysis (DEA) is used to calculate the degree of importance of each data point. Meanwhile, the Harrow, Hassidim and Lloyd (HHL) algorithm and quantum swap circuits are used to improve the efficiency of high-dimensional data matrix calculation. Applying the quantum fuzzy regression model to small-scale financial data shows that its accuracy is greatly improved compared with the quantum regression model. Moreover, owing to the introduction of quantum computing, the speed of handling high-dimensional data matrices improves exponentially compared with the fuzzy regression model. The proposed quantum fuzzy regression model combines the advantages of fuzzy theory and quantum computing: it can efficiently calculate high-dimensional data matrices and complete parameter estimation using quantum computing while retaining the uncertainty in big data. Thus, it is a new model for efficient and accurate big data processing in uncertain environments.
Abstract: The present paper proposes a new robust estimator for Poisson regression models. We use weighted maximum likelihood estimators, which are regarded as Mallows-type estimators. We perform a Monte Carlo simulation study to assess the performance of the suggested estimator compared with the maximum likelihood estimator and some robust methods. The results show that, in general, all robust methods considered in this paper perform better than the classical maximum likelihood estimator when the data contain outliers. The proposed estimator showed the best performance among the robust estimators.
Abstract: The aim of this study was to model the undrained shear strength (USS) of soil found in the coastal region of the Niger Delta in Nigeria using some soil properties. The USS is a key parameter needed for most geotechnical and structural designs. Accurate determination of the USS of soft clays can be challenging in the laboratory because of the difficulty of remoulding the clay to its in-situ conditions before testing, and more accurate tests such as the Cone Penetration Test (CPT) can be quite expensive. This study was carried out at the Escravos site, which is located in Delta State, Nigeria. Three boreholes were drilled and soil samples were collected at 0.75 m intervals up to a depth of 45 m. Laboratory tests were used to obtain the moisture content, bulk unit weight, and liquid and plastic limits, while CPT was used to obtain the undrained shear strength. The soil samples were classified using the Unified Soil Classification System, and various models relating the USS to the soil properties were developed. The results showed that most of the soils at the Escravos site were predominantly inorganic clay of high plasticity, which is problematic because of the swelling and shrinking nature of this type of soil. The model developed showed that the soil properties that gave the best fit with the USS were the moisture content and the effective stress of the soil. The coefficient of determination (R²) and the root mean square error (RMSE) obtained for this model were 0.805 and 6.37 kN/m², respectively.
Funding: This work was supported by the 2021 Project of the "14th Five-Year Plan" of Shaanxi Education Science "Research on the Application of Educational Data Mining in Applied Undergraduate Teaching: Taking the Course of 'Computer Application Technology' as an Example" (SGH21Y0403); the Teaching Reform and Research Projects for Practical Teaching in 2022 "Research on Practical Teaching of Applied Undergraduate Projects Based on 'Combination of Courses and Certificates': Taking Computer Application Technology Courses as an Example" (SJJG02012); the 11th batch of Teaching Reform Research Project of Xi'an Jiaotong University City College "Project-Driven Cultivation and Research on Information Literacy of Applied Undergraduate Students in the Information Times: Taking Computer Application Technology Course Teaching as an Example" (111001).
Abstract: Social networks are the mainstream medium of current information dissemination, and it is particularly important to accurately predict their propagation behavior. In this paper, we introduce a social network propagation model that integrates multiple linear regression and an infectious disease model. Firstly, we propose features that affect social network communication along three dimensions. Then, we predict node influence via multiple linear regression. Lastly, we use the node influence as the state transition of the infectious disease model to predict the trend of information dissemination in social networks. Experimental results on a real social network dataset show that the predictions of the model are consistent with the actual information dissemination trends.
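The two-stage idea in this abstract, a regression-based influence score feeding the transition rate of an epidemic model, can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's model: it uses a mean-field SIR model on population fractions, and the three features, their weights, the recovery rate, and both function names are hypothetical.

```python
def node_influence(features, weights, intercept=0.0):
    """Linear-regression-style score, clipped to [0, 1] so it can act as a rate."""
    score = intercept + sum(w * f for w, f in zip(weights, features))
    return min(1.0, max(0.0, score))

def simulate_sir(beta, gamma, s0, i0, r0, steps):
    """Discrete-time SIR on population fractions (susceptible, infected, recovered)."""
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infected = beta * s * i   # users who newly pick up the information
        new_recovered = gamma * i     # users who stop spreading it
        s, i, r = s - new_infected, i + new_infected - new_recovered, r + new_recovered
        history.append((s, i, r))
    return history

# Hypothetical features from three dimensions and hypothetical fitted weights.
beta = node_influence([0.5, 0.2, 0.8], [0.3, 0.2, 0.1])
trend = simulate_sir(beta, gamma=0.1, s0=0.99, i0=0.01, r0=0.0, steps=50)
print(beta, trend[-1])
```

A per-node version would assign each node its own influence score instead of the single shared rate used here.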
Abstract: A variety of test methodologies are commonly used to assess whether a photovoltaic (PV) system can perform in line with expectations generated by a computer simulation. One of the commonly used methodologies across the PV industry is ASTM E2848. The ASTM E2848-13 (2023) test method provides measurement and analysis procedures for determining the capacity of a specific photovoltaic system built in a particular place and in operation under natural sunlight. This test method is mainly used for acceptance testing of newly installed photovoltaic systems, reporting of DC or AC system performance, and monitoring of photovoltaic system performance. The purpose of the PV Capacity Test and the modeled energy test is to verify that the integrated system formed from all components of the PV Project has a production capacity that achieves the Guaranteed Capacity and the Guaranteed modeled AEP under the measured weather conditions that occur when each PV Capacity Test is conducted. In this paper, we discuss the purpose and scope of the ASTM E2848 PV Capacity test plan, its methodology, the selection of reporting conditions (RC), data requirements, calculation of results, reporting, challenges, acceptance criteria for pass/fail test results, the cure period, and the sole remedy for EPC contractors for bifacial irradiance.
Abstract: Under-fitting problems usually occur in regression models for dam safety monitoring. To overcome the local convergence of the regression, a genetic algorithm (GA) is proposed that uses real parameter coding, a ranking selection operator, an arithmetical crossover operator and a uniform mutation operator, and takes the least-squares error between the observed and computed values as its fitness function. An elitist strategy is used to improve the speed of convergence. The modified genetic algorithm is then applied to re-estimate the coefficients of the regression model, yielding a genetic regression model. As an example, a slotted gravity dam in Northeast China is presented. The computational results show that the genetic regression model solves the under-fitting problem well.
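The operators named in this abstract can be combined as in the following sketch. It is an illustration under assumptions rather than the authors' algorithm: a two-coefficient linear model stands in for the dam monitoring model, and the population size, bounds, mutation rate, data, and function names are all hypothetical.

```python
import random

def sse(coeffs, xs, ys):
    """Least-squares error of the linear model y = c0 + c1 * x."""
    return sum((y - (coeffs[0] + coeffs[1] * x)) ** 2 for x, y in zip(xs, ys))

def genetic_regression(xs, ys, pop_size=40, generations=200,
                       bounds=(-10.0, 10.0), mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Real-parameter coding: each individual is a list of two coefficients.
    pop = [[rng.uniform(lo, hi), rng.uniform(lo, hi)] for _ in range(pop_size)]
    best_errors = []
    for _ in range(generations):
        pop.sort(key=lambda c: sse(c, xs, ys))  # lowest error first
        best_errors.append(sse(pop[0], xs, ys))
        # Ranking selection: the weight depends on rank, not on the raw error.
        ranks = [pop_size - k for k in range(pop_size)]
        children = [pop[0][:]]  # elitist strategy: the best individual survives
        while len(children) < pop_size:
            p1, p2 = rng.choices(pop, weights=ranks, k=2)
            a = rng.random()
            # Arithmetical crossover: convex combination of the two parents.
            child = [a * g1 + (1 - a) * g2 for g1, g2 in zip(p1, p2)]
            # Uniform mutation: redraw a gene uniformly within the bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    best = min(pop, key=lambda c: sse(c, xs, ys))
    best_errors.append(sse(best, xs, ys))
    return best, best_errors

# Hypothetical noise-free data generated from y = 2 + 3x.
xs = list(range(10))
ys = [2.0 + 3.0 * x for x in xs]
best, errors = genetic_regression(xs, ys)
print(best, errors[-1])
```

Because the best individual is always copied into the next generation, the best error never increases from one generation to the next.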
Funding: The National Natural Science Foundation of China (No. 51106025, 51106027, 51036002); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130092110061); the Youth Foundation of Nanjing Institute of Technology (No. QKJA201303).
Abstract: A fuzzy observations-based radial basis function neural network (FORBFNN) is presented for modeling nonlinear systems in which the observations of the response are imprecise but can be represented as fuzzy membership functions. In the FORBFNN model, the weight coefficients of the nodes in the hidden layer are identified using the fuzzy expectation-maximization (EM) algorithm, whereas the optimal number of these nodes, as well as the centers and widths of the radial basis functions, are determined automatically using a data-driven method. Namely, the method starts with an initial node, and new nodes are then added to the hidden layer according to a set of rules. This procedure continues until the model meets the preset requirements. The method considers both the accuracy and the complexity of the model. Numerical simulation results show that the modeling method is effective and that the established model has high prediction accuracy.
Abstract: Because of the correlation among the parameters, partial least squares regression (PLSR) was applied to build the model and obtain the regression equation. The improved algorithm greatly simplified the calculation process by reducing the amount of computation. An orthogonal design was adopted in this experiment. Every sample was strongly representative, which reduced the experimental time while still yielding comprehensive test data. For the weld formation problem in high-current gas metal arc welding, the auxiliary analysis technique of PLSR was discussed, and the regression equation relating the form factors (i.e. surface width, weld penetration and weld reinforcement) to the process parameters (i.e. wire feed rate, wire extension, welding speed, gas flow, welding voltage and welding current) was given. The correlation structure among the variables was analyzed, and a certain correlation was found between the independent variable matrix X and the dependent variable matrix Y. The regression analysis shows that the welding speed mainly influences the weld formation, while variation of the gas flow within a certain range has little influence on it. The fitting plot of regression accuracy is given. The fitting quality of the regression equation is basically satisfactory.