The rock matrix bulk modulus, or its inverse, the compressive coefficient, is an important input parameter for fluid substitution with the Biot-Gassmann equation in reservoir prediction. However, it is not easy to estimate the bulk modulus accurately with conventional methods. In this paper, we present a new linear regression equation for calculating this parameter. To obtain the equation, we first derive a simplified Gassmann equation under the reasonable assumption that the compressive coefficient of the saturated pore fluid is much greater than that of the rock matrix; second, we use the Eshelby-Walsh relation to replace the equivalent modulus of the dry rock in the Gassmann equation. Rock physics analysis of rock samples from a carbonate area shows that the rock matrix compressive coefficients calculated from water-saturated and dry rock samples with the linear regression method are very close (they differ by less than 1%). This indicates that the new method is accurate and reliable.
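For orientation, the textbook form of the Gassmann fluid-substitution equation is written out below. This is the standard relation, not the simplified equation or the regression equation derived in the paper, and the symbols are assumptions introduced here: $K_{\mathrm{sat}}$, $K_{\mathrm{dry}}$, $K_m$ and $K_f$ are the bulk moduli of the saturated rock, the dry rock, the rock matrix and the pore fluid, $\phi$ is porosity, and $C = 1/K$ is the corresponding compressive coefficient.

```latex
K_{\mathrm{sat}} = K_{\mathrm{dry}}
  + \frac{\left(1 - \dfrac{K_{\mathrm{dry}}}{K_m}\right)^{2}}
         {\dfrac{\phi}{K_f} + \dfrac{1-\phi}{K_m} - \dfrac{K_{\mathrm{dry}}}{K_m^{2}}},
\qquad C_m = \frac{1}{K_m}, \quad C_f = \frac{1}{K_f}.
```

In this notation the assumption quoted in the abstract reads $C_f \gg C_m$, that is, $K_f \ll K_m$.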
Using stepwise multivariate linear regression (SMLR), the quantitative structure-activity relationships (QSAR) of two isomeric series of taxol and its derivatives were studied. The molar refractivity of the C3′ substituent of the C13 side chain was found to correlate significantly with activity. We deduce that structural changes in the C3′ substituents may be critical to the anticancer function. This finding should be useful for the design and synthesis of taxol-like compounds with improved activity.
In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense that the resulting parameter estimates have good properties. Moreover, we offer a method for calculating the selection statistic, together with an applied example.
Many properties of fruit are influenced by plant nutrition. Fruit firmness is one of the most important fruit characteristics and determines the post-harvest life of the fruit. In recent decades, artificial intelligence systems have been employed to develop predictive models for many agricultural processes. In the present study, the predictive capabilities of multiple linear regression (MLR) and artificial neural networks (ANNs) are evaluated for estimating fruit firmness at six months, using four input datasets: individual nutrient concentrations (nitrogen (N), potassium (K), calcium (Ca) and magnesium (Mg)) alone (P1), combinations of nutrient concentrations (P2), nutrient concentration ratios alone (P3), and combinations of nutrient concentrations and nutrient concentration ratios (P4). The results showed that the MLR model estimated fruit firmness more accurately than the ANN model for three datasets (P1, P2 and P4). However, with P3 (the N/Ca ratio) as the input dataset, the ANN model predicted fruit firmness better than the MLR model. The correlation coefficient and root mean squared error (RMSE) between the measured data and the data estimated by the ANN model were 0.850 and 0.539, respectively. Generally, the ANN model showed greater potential in determining the relationship between 6-month fruit firmness and nutrient concentrations.
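The two evaluation metrics named above, the correlation coefficient and RMSE, can be sketched in a few lines of plain Python. The firmness values below are hypothetical and chosen only for illustration; they are not data from the study, and the function names are assumptions.

```python
import math


def rmse(measured, estimated):
    """Root mean squared error between two equal-length sequences."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measured and model-estimated firmness values (illustration only).
measured = [5.0, 6.0, 7.0, 8.0]
estimated = [5.5, 5.5, 7.5, 7.5]

print("RMSE:", rmse(measured, estimated))
print("r:", pearson_r(measured, estimated))
```

A lower RMSE and a correlation coefficient closer to 1 both indicate closer agreement between measured and estimated values, which is how the MLR and ANN models are compared in the abstract.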
This paper presents an analysis for forecasting the load of an isolated area where the load history is not available or may not represent the realistic demand for electricity. The analysis uses linear regression and is based on identifying the factors on which electrical load growth depends. To determine these factors, areas are selected whose load growth histories are known and whose load-growth-deciding factors are similar to those of the isolated area. The proposed analysis is applied to an isolated area of Bangladesh called Swandip, where no past history of electrical load demand is available and there is no possibility of connecting the area to the mainland grid system.
This article studies estimators of the parametric and nonparametric components in a semiparametric regression model with linear time series errors; their r-th mean consistency and complete consistency are obtained under suitable conditions. Finally, the author shows that the usual weight functions based on nearest neighbor methods satisfy the imposed assumptions.
The method for constructing the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model that incorporates multiple linear regression is established in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in simulation and prediction.
The multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS), based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then, using an oceanic vertical mixing model, the MLD after a certain time was simulated under the same initial conditions but with various sets of boundary conditions (the three factors). Applying the MLR method to the results yielded regression equations that model the relationship between the simulated MLD and the three factors. The equations indicate that when the NHF was negative, it was the primary driver of mixed layer deepening; when the NHF was positive, the wind stress played a more important role than the NHF, while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF and the NFF were about 10, 6 and 2. These conclusions were applied to explain the spatio-temporal distributions of the MLD in the SCS and thus proved to be valid.
A class of estimators of the mean survival time with interval censored data is studied using the unbiased transformation method. The estimators are constructed from the observations so as to ensure unbiasedness, in the sense that the estimators in a certain class have the same expectation as the mean survival time. The estimators have good properties such as strong consistency (with the rate of O(n^-1/1 (log log n)^1/2)) and asymptotic normality. An application to linear regression is considered and simulation results are given.
As the mono-sodium salt of alendronic acid, alendronate sodium undergoes multi-level ionization through the dissociation of its four hydroxyl groups. In this work, the dissociation constants of alendronate sodium were determined by studying the piecewise linear relationship between the volume of titrant and the pH value in an acid-base potentiometric titration. The distribution curves of alendronate sodium were drawn from the determined pKa values. Alendronate sodium has four dissociation constants (pKa1 = 2.43, pKa2 = 7.55, pKa3 = 10.80 and pKa4 = 11.99) and 12 forms, of which 4 can be ignored, that exist in different pH environments.
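To show how distribution curves follow from pKa values, here is a minimal sketch that computes species fractions at a given pH. It assumes a simple stepwise four-step dissociation with five species, which is a simplification of the 12 forms the abstract mentions; the function name is hypothetical and only the four pKa values come from the abstract.

```python
def species_fractions(ph, pkas):
    """Fractions of H_nA, H_(n-1)A, ..., A for a stepwise n-protic acid.

    Species i (i protons removed) is proportional to
    [H+]^(n - i) * Ka_1 * ... * Ka_i, and the terms are normalized to sum to 1.
    """
    h = 10.0 ** (-ph)
    n = len(pkas)
    terms = []
    cumulative_ka = 1.0
    for i in range(n + 1):
        if i > 0:
            cumulative_ka *= 10.0 ** (-pkas[i - 1])
        terms.append(h ** (n - i) * cumulative_ka)
    total = sum(terms)
    return [term / total for term in terms]


# pKa values reported in the abstract.
PKAS = [2.43, 7.55, 10.80, 11.99]

for ph in (2.43, 7.0, 12.0):
    fractions = species_fractions(ph, PKAS)
    print(ph, [round(f, 3) for f in fractions])
```

Evaluating the function over a grid of pH values and plotting each fraction against pH gives distribution curves under this simplified model; at a pH equal to one of the pKa values, the two adjacent species are present in equal amounts.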
Tegillarca granosa (T. granosa) is susceptible to heavy metals, which may pose a threat to consumer health. Thus, healthy and polluted T. granosa should be distinguished quickly. This study aimed to rapidly identify heavy metal pollution by using laser-induced breakdown spectroscopy (LIBS) coupled with linear regression classification (LRC). Five types of T. granosa were studied: Cd-contaminated, Zn-contaminated, Pb-contaminated, mixed-contaminated, and control samples. A threshold method was applied to extract the significant variables from the LIBS spectra. Then, LRC was used to classify the different types of T. granosa. Other classification models and feature selection methods were used for comparison. LRC was the best model, achieving an accuracy of 90.67%. The results indicate that LIBS combined with LRC is effective and feasible for detecting heavy metals in T. granosa.
In this paper, a quantitative structure-activity relationship (QSAR) study was performed to predict the acute toxicity of aromatic amines. A set of 56 compounds was randomly divided into a training set of 46 compounds and a test set of 10 compounds. Electronic and topological descriptors computed with the Scigress package and Dragon software were used as predictor variables. Multiple linear regression (MLR) and support vector machine (SVM) were used to build the linear and nonlinear QSAR models, respectively. The resulting models, with five descriptors, show strong predictive ability. The linear model fits the training set with R2 = 0.71, while the SVM model achieves a higher R2 = 0.77. The validation results obtained on the test set indicate that the SVM model is comparable or superior to the MLR model in terms of both prediction ability and robustness.
Support vector-based learning methods are an important part of computational intelligence techniques. Recent efforts have dealt with the problem of learning from very large datasets. This paper reviews the most commonly used formulations of support vector machines for regression (SVRs), with an emphasis on their usability in large-scale applications. We review the general concept of support vector machines (SVMs), address the state of the art in training methods for SVMs, and explain the fundamental principle of SVRs. The most common learning methods for SVRs are introduced, and linear programming-based SVR formulations are explained with an emphasis on their suitability for large-scale learning. Finally, this paper discusses some open problems and current trends.
Abstract: In oil and gas exploration, elucidating the complex interdependencies among geological variables is paramount. Our study applies advanced regression analysis methods, aiming not only to predict geophysical logging curve values but also to mitigate the hydrocarbon depletion observed in geochemical logging. Through a rigorous assessment, we explore the efficacy of eight regression models, divided into linear and nonlinear groups, to accommodate the multifaceted nature of geological datasets. Our linear model suite comprises the Standard Equation, Ridge Regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net, each with distinct advantages. The Standard Equation serves as a foundational benchmark, whereas Ridge Regression adds penalty terms to counteract overfitting, making the model more robust in the presence of multicollinearity. LASSO performs variable selection to streamline models and enhance their interpretability, while Elastic Net combines the merits of Ridge Regression and LASSO, balancing model complexity and comprehensibility. On the nonlinear side, we consider Gradient Descent, Kernel Ridge Regression, Support Vector Regression, and Piecewise Function-Fitting. Gradient Descent provides computational efficiency in optimizing solutions, Kernel Ridge Regression uses the kernel trick to capture nonlinear patterns, and Support Vector Regression is proficient at forecasting extreme values, which is pivotal for exploration risk assessment. The Piecewise Function-Fitting approach, tailored to geological data, allows adaptable modeling of variable interrelations and accommodates abrupt shifts in data trends. Our analysis identifies Ridge Regression, particularly when augmented by Piecewise Function-Fitting, as superior in recovering hydrocarbon losses, underscoring its utility in refining resource quantification. Meanwhile, Kernel Ridge Regression emerges as a noteworthy strategy for improving porosity-logging curve prediction for well A, evidencing its suitability for intricate geological structures. This research attests to the advantages and broad relevance of these regression techniques over conventional methods and points to new opportunities for their deployment in the oil and gas sector. The insights gained from these modeling strategies are expected to inform geological and engineering practice in hydrocarbon prediction, evaluation, and recovery.
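To make the role of the ridge penalty concrete, the following minimal sketch fits a one-feature ridge regression in closed form. It is an illustration under stated assumptions and not the study's implementation: the function name, the penalty values and the toy data are hypothetical, and the intercept is left unpenalized.

```python
def ridge_fit_1d(x, y, lam):
    """Closed-form ridge fit for one feature with an unpenalized intercept.

    With centered data the penalized slope is Sxy / (Sxx + lam),
    so lam = 0 reproduces ordinary least squares.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Toy data lying exactly on y = 2x (illustration only).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

for lam in (0.0, 5.0):
    print(lam, ridge_fit_1d(x, y, lam))
```

Increasing `lam` shrinks the slope toward zero. In the multi-feature case the same penalty stabilizes coefficient estimates when predictors are strongly correlated.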
Funding: the National Natural Science Foundation of China (No. 41804135); the Key Laboratory of Petroleum Resources Research, Institute of Geology and Geophysics, Chinese Academy of Sciences, Open Project (No. KLOR2018-9); and the Beijing Information Science and Technology University Research Fund Project (No. 2025025).
Abstract: To address the low accuracy, long run time, and lack of quantitative results of existing automatic fault identification techniques, a fault recognition method based on clustering linear regression is proposed. First, the Hough transform is used to detect line segments in the enhanced image obtained by the coherence cube algorithm. Second, the endpoints of the line segments detected by the Hough transform are taken as key points, and an adaptive clustering linear regression algorithm clusters the key points according to the linear relationship between them. Finally, a fault is generated from each category of key points by least squares curve fitting, which realizes fault identification. To verify the feasibility and advancement of the proposed method, it is compared experimentally with the traditional method and the latest method on actual seismic data, and the experimental results confirm its effectiveness.
Abstract: Cloud infrastructural resource optimization is the process of precisely selecting and allocating the correct resources to a workload or application. Efficiency is attained when workload execution, accuracy, and cost are balanced against the best possible framework in real time. In addition, the requirements of every workload or application are specific to it, and these requirements change over time. However, existing methods fail to ensure a high Quality of Service (QoS). To address this issue, a Tricube Weighted Linear Regression-based Inter Quartile (TWLR-IQ) method for cloud infrastructural resource optimization is introduced. The proposed method uses Tricube Weighted Linear Regression to estimate resource utilization (i.e., CPU, RAM, and network bandwidth) from the usage history of each cloud server. The Inter Quartile Range is then applied to predict overloaded hosts efficiently and so ensure smooth migration. Experimental results show that the proposed method outperforms the approach in Cloudsim on various performance metrics. The results show that the proposed method can reduce energy consumption and provide a high level of commitment while keeping the number of Virtual Machine (VM) migrations to a minimum compared with state-of-the-art methods.
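The two steps described above, a tricube-weighted linear fit of the usage history followed by an interquartile-range check, can be sketched as follows. This is a minimal sketch under stated assumptions, not the TWLR-IQ implementation: the function names, the bandwidth choice, the overload rule `1 - safety * IQR` and the `safety` value are all hypothetical.

```python
import statistics


def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, otherwise 0."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def weighted_linear_forecast(history):
    """Fit a tricube-weighted line to (t, history[t]) and extrapolate one step.

    Recent samples get larger weights; distances are scaled by len(history) + 1
    so that even the oldest sample keeps a small positive weight.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples")
    target = n  # index of the next, not yet observed, time step
    weights = [tricube((target - t) / (n + 1)) for t in range(n)]
    weight_sum = sum(weights)
    t_mean = sum(w * t for t, w in enumerate(weights)) / weight_sum
    y_mean = sum(w * y for w, y in zip(weights, history)) / weight_sum
    sxy = sum(w * (t - t_mean) * (y - y_mean)
              for t, (w, y) in enumerate(zip(weights, history)))
    sxx = sum(w * (t - t_mean) ** 2 for t, w in enumerate(weights))
    slope = sxy / sxx
    return y_mean + slope * (target - t_mean)


def is_overloaded(history, predicted, safety=1.5):
    """Assumed rule: overloaded if the forecast exceeds 1 - safety * IQR."""
    q1, _, q3 = statistics.quantiles(history, n=4)
    threshold = 1.0 - safety * (q3 - q1)
    return predicted > threshold


# Hypothetical CPU utilization history of one host, as fractions of capacity.
cpu_history = [0.90, 0.92, 0.94, 0.96, 0.98]
forecast = weighted_linear_forecast(cpu_history)
print(forecast, is_overloaded(cpu_history, forecast))
```

Under this assumed rule, a more variable usage history (larger IQR) lowers the threshold, so hosts with unstable load are flagged earlier.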
Abstract: A Mobile Ad-hoc NETwork (MANET) contains numerous mobile nodes and forms a structure-less network connected by wireless links. Node movement is the key feature of MANETs; hence, rapid node movement leads to link failure. Link failure causes more data packet drops, which can cause long delays. As a result, accurately measuring link failure time is a key factor in a MANET. This paper presents a Fuzzy Linear Regression Method to measure Link Failure (FLRLF) and provide an optimal route in the MANET-Internet of Things (IoT). This work aims to predict link failure and improve routing efficiency in MANETs. The Fuzzy Linear Regression Method (FLRM) identifies links with a long lifespan based on link failure. Mobile node groups are built using the Received Signal Strength (RSS). The Hill Climbing (HC) method selects the Group Leader (GL) based on node mobility, node degree and node energy. Additionally, a Data Gathering node is used to forward information from the GL to the sink node through multiple GLs. The GL is identified by link lifespan and energy using the Particle Swarm Optimization (PSO) algorithm. The simulation results demonstrate that the FLRLF approach increases the GL lifespan and minimizes the link failure time in the MANET.
Abstract: The development of prediction supports is a critical step in information systems engineering in this era defined by the knowledge economy, the hub of which is big data. Currently, the lack of a predictive model, whether qualitative or quantitative, suited to a company's areas of intervention can handicap or weaken its competitive capacities and endanger its survival. For quantitative prediction, a variety of methods and tools are available, depending on the efficacy criteria. Multiple linear regression is one of the methods used for this purpose. A linear regression model is a regression model of an explained variable on one or more explanatory variables in which the function that links the explanatory variables to the explained variable is linear in its parameters. The purpose of this work is to demonstrate how to use multiple linear regression, which is one aspect of decisional mathematics. Applying multiple linear regression to random data, which can be replaced by real data collected by or from organizations, provides decision makers with reliable data knowledge. As a result, machine learning methods can provide decision makers with relevant and trustworthy data. The main goal of this article is therefore to define the objective function and, using the linear regression method, the factors that influence its optimization.
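As a minimal sketch of multiple linear regression, the example below solves the normal equations (X^T X) b = X^T y in plain Python. The function names and the synthetic data are assumptions for illustration; the data are generated from y = 1 + 2*x1 + 3*x2 so that the recovered coefficients can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_mlr(rows, y):
    """Return [intercept, b1, ..., bp] minimizing the sum of squared errors."""
    design = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * target for d, target in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic explanatory variables (x1, x2) and responses from y = 1 + 2*x1 + 3*x2.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
y = [9.0, 8.0, 19.0, 18.0, 26.0]

coefficients = fit_mlr(rows, y)
print(coefficients)
```

With real organizational data in place of the synthetic rows, the fitted coefficients estimate how much each explanatory variable contributes to the explained variable. For larger or ill-conditioned problems, a dedicated linear algebra library would be a more robust choice than hand-written elimination.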
Funding: This work was supported by the 2021 Project of the “14th Five-Year Plan” of Shaanxi Education Science, “Research on the Application of Educational Data Mining in Applied Undergraduate Teaching - Taking the Course of ‘Computer Application Technology’ as an Example” (SGH21Y0403); the Teaching Reform and Research Projects for Practical Teaching in 2022, “Research on Practical Teaching of Applied Undergraduate Projects Based on ‘Combination of Courses and Certificates’ - Taking Computer Application Technology Courses as an Example” (SJJG02012); and the 11th batch of Teaching Reform Research Project of Xi’an Jiaotong University City College, “Project-Driven Cultivation and Research on Information Literacy of Applied Undergraduate Students in the Information Times - Taking Computer Application Technology Course Teaching as an Example” (111001).
Abstract: Social networks are the mainstream medium of current information dissemination, and it is particularly important to accurately predict their propagation laws. In this paper, we introduce a social network propagation model integrating multiple linear regression and an infectious disease model. First, we propose features that affect social network communication from three dimensions. Then, we predict node influence via multiple linear regression. Lastly, we use the node influence as the state transition of the infectious disease model to predict the trend of information dissemination in social networks. Experimental results on a real social network dataset show that the predictions of the model are consistent with the actual information dissemination trends.
Abstract: This paper considers approaches and methods for reducing the influence of multicollinearity. Great attention is paid to the question of using shrinkage estimators for this purpose. Two classes of regression models are investigated, the first of which corresponds to systems with negative feedback, while the second class represents systems without feedback. In the first case the use of shrinkage estimators, especially the Principal Component estimator, is inappropriate, but it is possible in the second case with the right choice of the regularization parameter or of the number of principal components included in the regression model. This fact is substantiated by the study of the distribution of the random variable , where b is the LS estimate and β is the true coefficient, since the form of this distribution is the basic characteristic of the specified classes. For this study, a regression approximation of the distribution of the event based on the Edgeworth series was developed. Also, alternative approaches are examined to resolve the multicollinearity issue, including an application of the known Inequality Constrained Least Squares method and the Dual estimator method proposed by the author. It is shown that with a priori information the Euclidean distance between the estimates and the true coefficients can be significantly reduced.
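To make the idea of a shrinkage estimator concrete, the following minimal sketch compares ordinary least squares with a ridge estimate on two nearly collinear predictors. It uses the textbook ridge formula (X'X + λI)⁻¹X'y with no intercept; the data values and the regularization parameter are invented for illustration, and this is not the Dual estimator or the Inequality Constrained Least Squares method discussed in the paper.

```python
def solve_2x2(a, b):
    """Solve a 2x2 linear system a x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def ridge_two_predictors(x1, x2, y, lam):
    """Ridge coefficients for y ~ b1*x1 + b2*x2 (no intercept).

    lam = 0 gives the ordinary least-squares estimate.
    """
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    xty = [sum(u * t for u, t in zip(x1, y)), sum(v * t for v, t in zip(x2, y))]
    # Adding lam to the diagonal stabilizes the nearly singular X'X.
    return solve_2x2([[s11 + lam, s12], [s12, s22 + lam]], xty)


# Hypothetical data: x2 is almost a copy of x1, so X'X is ill-conditioned.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-1.9, -1.1, 0.1, 0.9, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.8]

b_ols = ridge_two_predictors(x1, x2, y, lam=0.0)
b_ridge = ridge_two_predictors(x1, x2, y, lam=1.0)  # lam chosen arbitrarily

print("OLS:  ", b_ols)
print("ridge:", b_ridge)
```

For any λ > 0 the ridge coefficient vector has a smaller Euclidean norm than the least-squares one, which is the shrinkage effect; whether that shrinkage moves the estimate closer to the true coefficients depends on the class of model, as the abstract argues.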
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 40739907 and 40774064) and the National Science and Technology Major Project (Grant No. 2008ZX05025-003).
Abstract: The rock matrix bulk modulus or its inverse, the compressive coefficient, is an important input parameter for fluid substitution by the Biot-Gassmann equation in reservoir prediction. However, it is not easy to accurately estimate the bulk modulus by using conventional methods. In this paper, we present a new linear regression equation for calculating the parameter. To obtain this equation, we first derive a simplified Gassmann equation by using the reasonable assumption that the compressive coefficient of the saturated pore fluid is much greater than that of the rock matrix, and, second, we use the Eshelby-Walsh relation to replace the equivalent modulus of a dry rock in the Gassmann equation. Results from the rock physics analysis of rock samples from a carbonate area show that the rock matrix compressive coefficients calculated with water-saturated and dry rock samples using the linear regression method are very close (their error is less than 1%). This means the new method is accurate and reliable.
Abstract: Using the method of stepwise multivariate linear regression (SMLR), the quantitative structure-activity relationships (QSAR) of two isomeric series of taxol and its derivatives have been studied. It was found that the molar refractivity of the C3′ substituent of the C13 side chain has a significant correlation with activity. We deduce that structural changes in the C3′ substituents may be critical to the anticancer function. This finding should be useful for the design and synthesis of taxol-like compounds with improved activities.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee
Abstract: In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense that the resulting parameter estimates have good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
Abstract: Many properties of fruit are influenced by plant nutrition. Fruit firmness is one of the most important fruit characteristics and determines the post-harvest life of the fruit. In recent decades, artificial intelligence systems have been employed to develop predictive models for estimating and predicting many agricultural processes. In the present study, the predictive capabilities of multiple linear regression (MLR) and artificial neural networks (ANNs) are evaluated for estimating six-month fruit firmness from four input datasets: each of the nutrient concentrations (nitrogen (N), potassium (K), calcium (Ca) and magnesium (Mg)) alone (P1), a combination of nutrient concentrations (P2), nutrient concentration ratios alone (P3), and a combination of nutrient concentrations and nutrient concentration ratios (P4). The results showed that the MLR model estimated fruit firmness more accurately than the ANN model in three datasets (P1, P2 and P4). However, with P3 (N/Ca ratio) as the input dataset, the ANN model predicted fruit firmness better than the MLR model. The correlation coefficient and root mean squared error (RMSE) between the measured data and the data estimated by the ANN model were 0.850 and 0.539, respectively. Generally, the ANN model showed greater potential in determining the relationship between six-month fruit firmness and nutrient concentrations.
Abstract: This paper presents an analysis to forecast the loads of an isolated area where the load history is not available or may not represent the realistic demand for electricity. The analysis is done through linear regression and is based on identifying the factors on which electrical load growth depends. To determine these factors, areas are selected whose load growth histories are known and whose load growth deciding factors are similar to those of the isolated area. The proposed analysis is applied to an isolated area of Bangladesh called Swandip, where no past history of electrical load demand is available and there is no possibility of connecting the area to the mainland grid system.
Funding: This article was supported by the National Natural Science Foundation of China (10571001) and the Innovation Group Foundation of Anhui University
Abstract: This article studies parametric component and nonparametric component estimators in a semiparametric regression model with linear time series errors; their r-th mean consistency and complete consistency are obtained under suitable conditions. Finally, the author shows that the usual weight functions based on nearest neighbor methods satisfy the imposed assumptions.
Funding: Supported by the National Natural Science Foundation of China (71071077), the Ministry of Education Key Project of National Educational Science Planning (DFA090215), the China Postdoctoral Science Foundation (20100481137), and the Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226)
Abstract: The construction method of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model incorporating multiple linear regression is established in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in simulation and prediction.
Funding: The National Natural Science Foundation of China under contract No. 11174235; the Science and Technology Development Project of Shaanxi Province of China under contract No. 2010KJXX-02; the Program for New Century Excellent Talents in University of China under contract No. NCET-08-0455; the Science and Technology Innovation Foundation of Northwestern Polytechnical University of China; the Doctorate Foundation of Northwestern Polytechnical University of China under contract No. CX201226.
Abstract: The multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS) based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then, using an oceanic vertical mixing model, the MLD after a certain time was simulated under the same initial conditions but various combinations of boundary conditions (the three factors). Applying the MLR method to the results, regression equations that model the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of mixed layer deepening; when the NHF was positive, the wind stress played a more important role than the NHF, while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about 10, 6 and 2. These conclusions were applied to explain the spatio-temporal distributions of the MLD in the SCS and thus proved to be valid.
Funding: Supported by the National Natural Science Foundation of China (70171008)
Abstract: A class of estimators of the mean survival time with interval censored data is studied by the unbiased transformation method. The estimators are constructed based on the observations to ensure unbiasedness in the sense that the estimators in a certain class have the same expectation as the mean survival time. The estimators have good properties such as strong consistency (with the rate of O(n^(-1/2) (log log n)^(1/2))) and asymptotic normality. The application to linear regression is considered and simulation results are reported.
Funding: The support of the Key Laboratory of Chinese Medicine Preparation of Solid Dispersion, Gansu Longshenrongfa Pharmaceutical Industry Co., Ltd., Gansu Province, China
Abstract: As a mono-sodium salt form of alendronic acid, alendronate sodium presents multi-level ionization for the dissociation of its four hydroxyl groups. The dissociation constants of alendronate sodium were determined in this work by studying the piecewise linear relationship between the volume of titrant and the pH value based on acid-base potentiometric titration. The distribution curves of alendronate sodium were drawn according to the determined pKa values. There were 4 dissociation constants of alendronate sodium (pKa1 = 2.43, pKa2 = 7.55, pKa3 = 10.80, pKa4 = 11.99) and 12 existing forms, of which 4 could be ignored, present in different pH environments.
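Distribution curves of this kind are typically computed from the pKa values with the standard textbook expression for a polyprotic acid, in which the fraction of the species that has lost i protons is proportional to [H+]^(n-i) · Ka1 · ... · Kai. The sketch below applies that generic formula to the four reported pKa values; it is an illustration under that assumption and does not attempt to reproduce the paper's own enumeration of 12 forms. The function name is hypothetical.

```python
def species_fractions(ph, pkas):
    """Mole fractions of the n+1 protonation states of an n-protic acid.

    Index i in the result is the species that has lost i protons.
    """
    h = 10.0 ** (-ph)
    kas = [10.0 ** (-pk) for pk in pkas]
    n = len(kas)
    terms = []
    cumulative = 1.0  # running product Ka1 * ... * Kai
    for i in range(n + 1):
        terms.append(h ** (n - i) * cumulative)
        if i < n:
            cumulative *= kas[i]
    total = sum(terms)
    return [t / total for t in terms]


# pKa values as reported in the abstract.
pkas = [2.43, 7.55, 10.80, 11.99]

# Sample the curves at a few pH values (grid chosen arbitrarily).
for ph in (2.0, 5.0, 7.55, 11.0, 13.0):
    fractions = species_fractions(ph, pkas)
    print(ph, [round(f, 3) for f in fractions])
```

Evaluating the function on a fine pH grid and plotting each column against pH gives the distribution curves; adjacent species cross where the pH equals the corresponding pKa.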
Funding: This research was funded by the National Natural Science Foundation of China (Nos. 31571920, 61671378).
Abstract: Tegillarca granosa (T. granosa) is susceptible to heavy metals, which may pose a threat to consumer health. Thus, healthy and polluted T. granosa should be distinguished quickly. This study aimed to rapidly identify heavy metal pollution by using laser-induced breakdown spectroscopy (LIBS) coupled with linear regression classification (LRC). Five types of T. granosa were studied, namely, Cd-, Zn-, Pb-contaminated, mixed contaminated, and control samples. A threshold method was applied to extract the significant variables from the LIBS spectra. Then, LRC was used to classify the different types of T. granosa. Other classification models and feature selection methods were used for comparison. LRC was the best model, achieving an accuracy of 90.67%. The results indicated that LIBS combined with LRC is effective and feasible for T. granosa heavy metal detection.
Funding: Supported by the Ministry of Environmental Protection of China (No. 2011467037)
Abstract: In this paper, a quantitative structure-activity relationship (QSAR) study was performed for the prediction of the acute toxicity of aromatic amines. A set of 56 compounds was randomly divided into a training set of 46 compounds and a test set of 10 compounds. The electronic and topological descriptors computed by the Scigress package and Dragon software were used as predictor variables. Multiple linear regression (MLR) and support vector machine (SVM) were utilized to build the linear and nonlinear QSAR models, respectively. The obtained models with five descriptors show strong predictive ability. The linear model fits the training set with R2 = 0.71, while the SVM model gives a higher value of R2 = 0.77. The validation results obtained from the test set indicate that the SVM model is comparable or superior to the MLR model, both in terms of prediction ability and robustness.
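For readers comparing the two R2 figures, the sketch below shows the usual coefficient of determination, 1 − SS_res/SS_tot. The abstract does not state which definition of R2 was used, so this is an assumption, and the toxicity values are made-up numbers purely for illustration.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed values and two sets of model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 1.9, 3.3, 3.7]
model_b = [1.1, 2.0, 3.1, 3.9]

print(r_squared(observed, model_a))
print(r_squared(observed, model_b))
```

A model that always predicts the mean of the observations scores 0 and a perfect model scores 1, so a value computed on a held-out test set is a more honest measure of predictive ability than one computed on the training set.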
Abstract: Support Vector-based learning methods are an important part of Computational Intelligence techniques. Recent efforts have been dealing with the problem of learning from very large datasets. This paper reviews the most commonly used formulations of support vector machines for regression (SVRs), aiming to emphasize their usability in large-scale applications. We review the general concept of support vector machines (SVMs), address the state of the art in training methods for SVMs, and explain the fundamental principle of SVRs. The most common learning methods for SVRs are introduced, and linear programming-based SVR formulations are explained, emphasizing their suitability for large-scale learning. Finally, this paper also discusses some open problems and current trends.
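One ingredient of the fundamental principle of SVR is the ε-insensitive loss, which ignores residuals smaller than ε and penalizes larger ones linearly. The minimal sketch below implements only that standard loss; the default ε and the sample values are arbitrary assumptions, and the sketch is not a full SVR trainer or any of the formulations reviewed in the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, linear in the excess outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def total_loss(targets, predictions, epsilon=0.1):
    """Sum of epsilon-insensitive losses over a dataset."""
    return sum(
        epsilon_insensitive_loss(t, p, epsilon)
        for t, p in zip(targets, predictions)
    )


# Hypothetical targets and predictions.
targets = [1.0, 2.0, 3.0]
predictions = [1.05, 2.5, 2.0]

# Only the second and third points fall outside the tube of half-width 0.1.
print(total_loss(targets, predictions))
```

Points inside the tube contribute nothing to the objective, which is why only a subset of the training data (the support vectors) determines the fitted function.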