In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with optimized neighbouring points can effectively predict small-time scale traffic measurement data and reproduce the statistical features of real traffic measurements.
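The phase-space reconstruction and neighbour-based prediction described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the function names `delay_embed` and `local_predict` are hypothetical, the local model is a plain average of the neighbours' successors standing in for the local support vector regression, and the neighbour count `k` is passed by hand rather than chosen by the BIC-based selection.

```python
import math


def delay_embed(series, dim, tau=1):
    """Build delay vectors [x[t], x[t-tau], ..., x[t-(dim-1)*tau]].

    Returns a list of (t, vector) pairs so each vector keeps its time index.
    """
    start = (dim - 1) * tau
    return [
        (t, [series[t - j * tau] for j in range(dim)])
        for t in range(start, len(series))
    ]


def local_predict(series, dim, k, tau=1):
    """One-step-ahead forecast from the k nearest neighbours in phase space."""
    points = delay_embed(series, dim, tau)
    _, query = points[-1]
    # Only points with a known successor can be neighbours,
    # so the final delay vector (the query itself) is excluded.
    candidates = points[:-1]
    ranked = sorted(candidates, key=lambda p: math.dist(p[1], query))
    neighbours = ranked[:k]
    # Assumption: a local average replaces the local SVR model here.
    return sum(series[t + 1] for t, _ in neighbours) / len(neighbours)
```

In a fuller version, the averaging step would be replaced by a regression model fitted only on the selected neighbours, and `dim` and `k` would be tuned on held-out data.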
In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, replacing the previous binary (0 or 1) description of interface residues. This allows the support vector machine regression method to be used to fit the core-ratio value and predict protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites; combined with an averaging procedure, the new descriptors are more effective. Our test shows that much better prediction results can be obtained by the support vector regression (SVR) method than by the support vector classification method.
Radiometric normalization, as an essential step for multi-source and multi-temporal data processing, has received critical attention. The Relative Radiometric Normalization (RRN) method has been primarily used for eliminating radiometric inconsistency. The radiometric transforming relation between the subject image and the reference image is an essential aspect of RRN. Aiming at accurate modeling of this relation, a learning-based nonlinear regression method, Support Vector Machine Regression (SVR), is used to fit the complicated radiometric transforming relation for coarse-resolution data-referenced RRN. To evaluate the effectiveness of the proposed method, a series of experiments is performed, including two synthetic data experiments and one real data experiment. The proposed method is compared with other methods that use linear regression, Artificial Neural Network (ANN), or Random Forest (RF) for radiometric transforming relation modeling. The results show that the proposed method performs well in fitting the radiometric transforming relation and could enhance RRN performance.
The principle of the support vector regression machine (SVR) is first analysed. Then a new data-dependent kernel function is constructed from an information geometry perspective. The current waveforms change regularly with the horizontal offset when the rotational frequency of the high-speed rotational arc sensor is in the range from 15 Hz to 30 Hz. The welding current data is pretreated by wavelet filtering, mean filtering, and normalization. The SVR model is constructed by making use of these evolvement laws; the decision function is obtained by training the SVR, and the seam offset can then be identified. The experimental results show that the precision of the offset identification can be greatly improved by modifying the SVR and applying mean filtering from the longitudinal direction.
As the solutions of the least squares support vector regression machine (LS-SVRM) are not sparse, prediction is slow, which limits its applications. The defects of the existing adaptive pruning algorithm for LS-SVRM are that the training speed is slow and the generalization performance is not satisfactory, especially for large-scale problems. Hence an improved algorithm is proposed. In order to accelerate training, the pruned data point and the fast leave-one-out error are employed to validate the temporary model obtained after decremental learning. A novel objective function in the termination condition, which involves the whole set of constraints generated by all training data points, and three pruning strategies are employed to improve the generalization performance. The effectiveness of the proposed algorithm is tested on six benchmark datasets. The sparse LS-SVRM model has a faster training speed and better generalization performance.
The pruning algorithms for the sparse least squares support vector regression machine are common and easily comprehensible methods, but the computational burden in the training phase is heavy because the pruning process requires retraining, which is not favorable for their application. To this end, an improved scheme is proposed to accelerate the sparse least squares support vector regression machine. A major advantage of this new scheme is that it is based on an iterative methodology, which uses the previous training results instead of retraining, and its feasibility is strictly verified theoretically. Finally, experiments on benchmark data sets corroborate a significant saving in training time, with the same number of support vectors and the same predictive accuracy as the original pruning algorithms. This speedup scheme is also extended to the classification problem.
In this work a Support Vector Machine Regression (SVMR) algorithm is used to calculate local magnitude (MI) using only five seconds of signal after the P wave onset at a single three-component seismic station. The algorithm was trained with 863 records of historical earthquakes, where the input regression parameters were an exponential function of the waveform envelope estimated by least squares and the maximum value of the observed waveform for each component at a single station. Ten-fold cross validation was applied for a normalized polynomial kernel, obtaining the mean absolute error for different exponents and complexity parameters. The local magnitude (MI) could be estimated with a mean absolute error of 0.19 units. The proposed algorithm is easy to implement in hardware and may be used directly after the field seismological sensor to generate fast decisions at seismological control centers, increasing the possibility of an effective reaction.
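The evaluation protocol mentioned in this abstract, ten-fold cross validation scored by mean absolute error, can be sketched as below. This is only an illustration under stated assumptions: the helper names (`k_fold_indices`, `cross_validated_mae`, `fit_mean`) are hypothetical, and a trivial mean predictor stands in for the SVMR model with a normalized polynomial kernel, which is not reproduced here.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def cross_validated_mae(xs, ys, fit, k=10, seed=0):
    """Average the held-out MAE over k folds.

    `fit(train_x, train_y)` must return a callable that maps x to a prediction.
    """
    errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in held_out]
        errors.append(mean_absolute_error([ys[i] for i in held_out], preds))
    return sum(errors) / len(errors)


def fit_mean(train_x, train_y):
    """Placeholder model (assumption): always predicts the training mean."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean
```

Repeating `cross_validated_mae` over a grid of kernel exponents and complexity parameters, with a real regressor passed as `fit`, would give the kind of error table the abstract refers to.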
This article aims to assess health habits, safety behaviors, and anxiety factors in the community during the novel coronavirus disease (COVID-19) pandemic in Saudi Arabia, based on primary data collected through a questionnaire with 320 respondents. In other words, this paper aims to provide empirical insights into the correlation and the correspondence between sociodemographic factors (gender, nationality, age, citizenship factors, income, and education) and psycho-behavioral effects on individuals in response to the emergence of this new pandemic. To focus on the interaction between these variables and their effects, we suggest different methods of analysis, comprising regression trees and support vector machine regression (SVMR) algorithms. According to the regression tree results, the age variable plays a predominant role in health habits, safety behaviors, and anxiety. The health habit index, which focuses on the extent of behavioral change toward the commitment to use health and protection methods, is highly affected by gender and age factors. The average monthly income is also a relevant factor but has contrasting effects during the COVID-19 pandemic period. The results of the SVMR model reveal a strong positive effect of income, with R^2 values of 99.59%, 99.93%, and 99.88% corresponding to health habits, safety behaviors, and anxiety, respectively.
The existing optimized performance prediction models for the carbon fiber protofilament process are still unable to meet production needs. A method of performance prediction for carbon fiber protofilament is presented based on support vector regression (SVR) optimized by an algorithm combining simulated annealing and a genetic algorithm (SAGA-SVR). To verify the accuracy of the model, carbon fiber protofilament production test data were analyzed and the results compared with those of a BP neural network (BPNN). The results show that SAGA-SVR can predict the performance parameters of the carbon fiber protofilament accurately.
The object mathematical model of traditional control theory cannot solve the problem of forecasting the chemical components of sintered ore. In order to control the complicated chemical components in the manufacturing process of sintered ore, some key techniques for intelligent forecasting of the chemical components of sintered ore are studied in this paper. A new intelligent forecasting system based on SVM is proposed and realized. The results show that the accuracy of the predicted value of every component is more than 90%. The system has been applied in related companies for more than one year and has shown satisfactory results.
Adopting the regression SVM framework, this paper proposes a linguistically motivated feature engineering strategy to develop an MT evaluation metric with a better correlation with human assessments. In contrast to current practices of "greedy" combination of all available features, six features are suggested according to the human intuition for translation quality. Then the contribution of linguistic features is examined and analyzed via a hill-climbing strategy. Experiments indicate that, compared to either the SVM-ranking model or the previous attempts on exhaustive linguistic features, the regression SVM model with six linguistic information based features generalizes across different datasets better, and augmenting these linguistic features with proper non-linguistic metrics can achieve additional improvements.
Soil bulk density (BD) is an important physical property and an essential factor for weight-to-volume conversion. However, BD is often missing from soil databases because its direct measurement is labor-intensive, time-consuming, and sometimes impractical, particularly on a large scale. Therefore, pedotransfer functions (PTFs) have been developed over several decades to predict BD. Here, six previously revised PTFs (including five basic functions and stepwise multiple linear regression (SMLR)) and two new PTFs, partial least squares regression (PLSR) and support vector machine regression (SVMR), were used to develop BD-predicting PTFs for coastal soils in East China. Predictor variables included soil organic carbon (SOC) and particle size distribution (PSD). To compare the robustness and reliability of the PTFs used, the calibration and prediction processes were performed 1 000 times using the calibration and validation sets divided by a random sampling algorithm. The results showed that SOC was the most important predictor, and the revised PTFs performed reasonably although only SOC was included. The PSD data were useful for a better prediction of BD, and sand and clay fractions were the second and third most important properties for predicting BD. Compared to the other PTFs, the PLSR was shown to be slightly better for the study area (the average adjusted coefficient of determination for prediction was 0.581). These results suggest that PLSR with SOC and PSD data can be used to fill in the missing BD data in coastal soil databases and provide important information to estimate coastal carbon storage, which will further improve our understanding of sea-land interactions under the conditions of ongoing global warming.
In order to deal with the huge computational cost of direct numerical simulation, the traditional response surface method (RSM), a classical regression algorithm, is used to approximate the functional relationship between the state variable and the basic variables in reliability design. The algorithm has successfully treated some problems with implicit performance functions in reliability analysis. However, its theoretical basis of empirical risk minimization narrows its range of applications for...
Soil organic matter (SOM) is important for plant growth and production. Conventional analyses of SOM are expensive and time consuming. Hyperspectral remote sensing is an alternative approach for SOM estimation. In this study, the diffuse reflectance spectra of soil samples from Qixia City, the Shandong Peninsula, China, were measured with an ASD FieldSpec 3 portable object spectrometer (Analytical Spectral Devices Inc., Boulder, USA). Raw spectral reflectance data were transformed using four methods: nine points weighted moving average (NWMA), NWMA with first derivative (NWMA + FD), NWMA with standard normal variate (NWMA + SNV), and NWMA with min-max standardization (NWMA + MS). These data were analyzed and correlated with SOM content. The evaluation model was established using support vector machine regression (SVM) with sensitive wavelengths. The results showed that NWMA + FD was the best of the four pretreatment methods. The sensitive wavelengths based on NWMA + FD were 917, 991, 1 007, 1 996, and 2 267 nm. The SVM model established with the above-mentioned five sensitive wavelengths was significant (R^2 = 0.875, root mean square error (RMSE) = 0.107 g kg^-1 for the calibration set; R^2 = 0.853, RMSE = 0.097 g kg^-1 for the validation set). The results indicate that hyperspectral remote sensing can quickly and accurately predict SOM content in the brown forest soil areas of the Shandong Peninsula. This is a novel approach for rapid monitoring and accurate diagnosis of brown forest soil nutrients.
Hybrid reactive power compensation (HRPC) combines step-controlled shunt reactors and series compensation, and will be employed in ultra-high-voltage (UHV) power systems. The single-phase auto-reclosure characteristics of secondary arcs in systems with HRPC require further investigation. In this paper, both the arc-recalling voltage and subsidiary variations in arc current are investigated with and without HRPC. The frequency components of the secondary arc current and variations in arcing time are analyzed for various influential factors, such as the neutral reactor, arc resistance, fault location, degrees of compensation of HRPC, and the length of the transmission line. The non-dominated sorting genetic algorithm II (NSGA-II) and support vector machine regression are combined to create a multi-variable dependent forecasting algorithm to predict the characteristics of the secondary arc in UHV systems with HRPC. This paper provides a theoretical reference for optimizing the parameters of HRPC, and for developing adaptive auto-reclosure schemes and protection equipment.
In order to establish an adaptive turbo-shaft engine model with high accuracy, a new modeling method based on parameter selection (PS) algorithm and multi-input multi-output recursive reduced least square support vector regression (MRR-LSSVR) machine is proposed. Firstly, the PS algorithm is designed to choose the most reasonable inputs of the adaptive module. During this process, a wrapper criterion based on least square support vector regression (LSSVR) machine is adopted, which can not only reduce computational complexity but also enhance generalization performance. Secondly, with the input variables determined by the PS algorithm, a mapping model of engine parameter estimation is trained off-line using MRR-LSSVR, which has a satisfying accuracy within 5&. Finally, based on a numerical simulation platform of an integrated helicopter/turbo-shaft engine system, an adaptive turbo-shaft engine model is developed and tested in a certain flight envelope. Under the condition of single or multiple engine components being degraded, many simulation experiments are carried out, and the simulation results show the effectiveness and validity of the proposed adaptive modeling method.
Combustion kinetic parameters (i.e., activation energy and frequency factor) of coal have been proven to relate closely to coal properties; however, the quantitative relationship between them still requires further study. This paper adopts a support vector regression machine (SVR) to generate models of the non-linear relationship between combustion kinetic parameters and coal quality. Kinetic analyses on the thermo-gravimetry (TG) data of 80 coal samples were performed to prepare training data and testing data for the SVR. The models developed were used in the estimation of the combustion kinetic parameters of ten testing samples. The predicted results showed that the root mean square errors (RMSEs) were 2.571 for the activation energy and 0.565 for the frequency factor in logarithmic form, respectively. TG curves defined by predicted kinetic parameters were fitted to the experimental data with a high degree of precision.
In this work, two chemometrics methods are applied for the modeling and prediction of electrophoretic mobilities of some organic and inorganic compounds. The successive projection algorithm (SPA), a feature selection strategy, is used as the descriptor selection and model development method. Then, the support vector machine (SVM) and multiple linear regression (MLR) models are utilized to construct the non-linear and linear quantitative structure-property relationship (QSPR) models. A comparison of the results obtained using the SVM model with those obtained using MLR reveals that the SVM model has much better predictive value than the MLR one. The root-mean-square errors for the training set and the test set were 0.1911 and 0.2569, respectively, for the SVM model, and 0.4908 and 0.6494, respectively, for the MLR model. The results show that the SVM model drastically enhances the ability of prediction in QSPR studies and is superior to the MLR model.
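For readers unfamiliar with the metric used to compare the SVM and MLR models in this abstract, a minimal sketch of the root-mean-square error follows. The function name `rmse` is hypothetical and the paper's data are not available here, so any inputs passed to it are illustrative only.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))
```

Computing this separately on the training set and the test set, as the abstract reports, shows both the fit quality and how far the error grows on unseen compounds.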
Credit scoring has become a critical and challenging management science issue as the credit industry has been facing stiffer competition in recent years. Many classification methods have been suggested in the literature to tackle this problem. In this paper, we investigate the performance of various credit scoring models and the corresponding credit risk cost for three real-life credit scoring data sets. Besides the well-known classification algorithms (e.g. linear discriminant analysis, logistic regression, neural networks and k-nearest neighbor), we also investigate the suitability and performance of some recently proposed, advanced data mining techniques such as support vector machines (SVMs), classification and regression tree (CART), and multivariate adaptive regression splines (MARS). The performance is assessed using the classification accuracy and the cost of credit scoring errors. The experimental results show that SVM, MARS, logistic regression and neural networks yield very good performance. However, the explanatory capability of CART and MARS is better than that of the other methods.
Funding (small-time scale traffic prediction study): Project supported by the National Natural Science Foundation of China (Grant No 60573065), the Natural Science Foundation of Shandong Province, China (Grant No Y2007G33), and the Key Subject Research Foundation of Shandong Province, China (Grant No XTD0708).
Funding (core-ratio binding-site study): Project supported by the National Natural Science Foundation of China (Grant Nos. 10674172 and 10874229).
Funding (radiometric normalization study): This research was funded by the National Natural Science Fund of China [grant number 41701415], the Science Fund Project of Wuhan Institute of Technology [grant number K201724], and the Science and Technology Development Funds Project of the Department of Transportation of Hubei Province [grant number 201900001].
Funding (welding seam offset identification study): Supported by the National Natural Science Foundation of China (No. 50705030).
Funding (improved adaptive pruning for LS-SVRM study): Supported by the National Natural Science Foundation of China (61074127).
Funding (accelerated sparse LS-SVR pruning study): Supported by the National Natural Science Foundation of China (50576033).
Funding (carbon fiber protofilament performance prediction study): The Key Project of the National Natural Science Foundation of China (No. 61134009), the Program for Changjiang Scholars and Innovation Research Team in University from the Ministry of Education, China (No. IRT1220), the Specialized Research Fund for Shanghai Leading Talents, Project of the Shanghai Committee of Science and Technology, China (No. 13JC1407500), and the Fundamental Research Funds for the Central Universities, China (No. 2232012A3-04).
Funding: Supported by Key Science and Technology Project of Wuhan (No. 20106062327); Self-determined and Innovative Research Funds of WUT (No. 2010-YB-20)
Abstract: The object mathematical models of traditional control theory cannot solve the forecasting problem for the chemical components of sintered ore. In order to control complicated chemical components in the manufacturing process of sintered ore, some key techniques for intelligent forecasting of the chemical components of sintered ore are studied in this paper. A new intelligent forecasting system based on SVM is proposed and realized. The results show that the prediction accuracy for every component is more than 90%. The system has been applied in related companies for more than one year and has shown satisfactory results.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60773066 and 60736014; the National High Technology Development 863 Program of China under Grant No. 2006AA010108; the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology under Grant No. HIT.NSFIR.20009070
Abstract: Adopting the regression SVM framework, this paper proposes a linguistically motivated feature engineering strategy to develop an MT evaluation metric with a better correlation with human assessments. In contrast to current practices of "greedy" combination of all available features, six features are suggested according to the human intuition for translation quality. Then the contribution of linguistic features is examined and analyzed via a hill-climbing strategy. Experiments indicate that, compared to either the SVM-ranking model or the previous attempts on exhaustive linguistic features, the regression SVM model with six linguistic-information-based features generalizes better across different datasets, and augmenting these linguistic features with proper non-linguistic metrics can achieve additional improvements.
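The hill-climbing examination of feature contributions mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the feature names, the `TOY_GAIN` table, and `toy_score` are hypothetical stand-ins for "correlation with human assessments", and the loop is a plain greedy forward selection that stops when no single added feature improves the score.

```python
from typing import Callable, FrozenSet, List, Tuple


def hill_climb_features(
    candidates: List[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy forward selection: repeatedly add the one feature that
    improves the score the most; stop when nothing improves it."""
    selected: FrozenSet[str] = frozenset()
    best = score(selected)
    while True:
        trials = [(score(selected | {f}), f) for f in candidates if f not in selected]
        if not trials:
            break  # every candidate is already selected
        top_score, top_feature = max(trials)
        if top_score <= best:
            break  # local optimum: no single addition helps
        selected, best = selected | {top_feature}, top_score
    return selected, best


# Hypothetical per-feature gains (assumption, not taken from the paper).
TOY_GAIN = {"fluency": 0.30, "adequacy": 0.25, "length_ratio": 0.05, "noise": -0.10}


def toy_score(features: FrozenSet[str]) -> float:
    """Stand-in for the correlation between metric output and human scores."""
    return sum(TOY_GAIN[f] for f in features)


chosen, value = hill_climb_features(list(TOY_GAIN), toy_score)
print(sorted(chosen), round(value, 2))
```

In a real setting `score` would retrain the regression model on the chosen feature subset and measure correlation on held-out data, which is why it is passed in as a callable.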
Funding: supported by the National Natural Science Foundation of China (Nos. 41877004 and 42130405); the China Scholarship Council (Nos. 201809040007 and 201808320124).
Abstract: Soil bulk density (BD) is an important physical property and an essential factor for weight-to-volume conversion. However, BD is often missing from soil databases because its direct measurement is labor-intensive, time-consuming, and sometimes impractical, particularly on a large scale. Therefore, pedotransfer functions (PTFs) have been developed over several decades to predict BD. Here, six previously revised PTFs (including five basic functions and stepwise multiple linear regression (SMLR)) and two new PTFs, partial least squares regression (PLSR) and support vector machine regression (SVMR), were used to develop BD-predicting PTFs for coastal soils in East China. Predictor variables included soil organic carbon (SOC) and particle size distribution (PSD). To compare the robustness and reliability of the PTFs used, the calibration and prediction processes were performed 1 000 times using the calibration and validation sets divided by a random sampling algorithm. The results showed that SOC was the most important predictor, and the revised PTFs performed reasonably although only SOC was included. The PSD data were useful for a better prediction of BD, and sand and clay fractions were the second and third most important properties for predicting BD. Compared to the other PTFs, the PLSR was shown to be slightly better for the study area (the average adjusted coefficient of determination for prediction was 0.581). These results suggest that PLSR with SOC and PSD data can be used to fill in the missing BD data in coastal soil databases and provide important information to estimate coastal carbon storage, which will further improve our understanding of sea-land interactions under the conditions of ongoing global warming.
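The repeated random calibration/validation procedure described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the 70/30 split fraction is assumed (the abstract does not give one), and a one-predictor least-squares line on SOC stands in for the PTFs actually compared in the study.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted coefficient of determination: 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    n = len(y_true)
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


def repeated_validation(x, y, n_repeats=1000, calib_fraction=0.7, seed=0):
    """Average adjusted R^2 on the validation set over random splits."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    n_cal = int(len(idx) * calib_fraction)  # assumed split fraction
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        cal, val = idx[:n_cal], idx[n_cal:]
        slope, intercept = fit_line([x[i] for i in cal], [y[i] for i in cal])
        pred = [slope * x[i] + intercept for i in val]
        scores.append(adjusted_r2([y[i] for i in val], pred, n_predictors=1))
    return mean(scores)


# Synthetic SOC and BD values (assumption: illustrative only, not study data).
data_rng = random.Random(42)
soc = [data_rng.uniform(2, 30) for _ in range(120)]
bd = [1.6 - 0.015 * s + data_rng.gauss(0, 0.05) for s in soc]

avg = repeated_validation(soc, bd)
print(f"average adjusted R^2 over 1000 splits: {avg:.3f}")
```

Averaging over many splits reduces the dependence of the reported score on any one lucky or unlucky partition, which is the point of repeating the procedure.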
Funding: National High-tech Research and Development Program (2006AA04Z405)
Abstract: To address the huge computational cost of direct numerical simulation, the traditional response surface method (RSM), a classical regression algorithm, is used to approximate the functional relationship between the state variable and basic variables in reliability design. The algorithm has successfully treated some problems of implicit performance functions in reliability analysis. However, its theoretical basis of empirical risk minimization narrows its range of applications for...
Funding: supported by the National Natural Science Foundation of China (Nos. 41671346 and 41301482); the Shandong Province Natural Science Fund of China (No. ZR2012DM007)
Abstract: Soil organic matter (SOM) is important for plant growth and production. Conventional analyses of SOM are expensive and time consuming. Hyperspectral remote sensing is an alternative approach for SOM estimation. In this study, the diffuse reflectance spectra of soil samples from Qixia City, the Shandong Peninsula, China, were measured with an ASD FieldSpec 3 portable object spectrometer (Analytical Spectral Devices Inc., Boulder, USA). Raw spectral reflectance data were transformed using four methods: nine points weighted moving average (NWMA), NWMA with first derivative (NWMA + FD), NWMA with standard normal variate (NWMA + SNV), and NWMA with min-max standardization (NWMA + MS). These data were analyzed and correlated with SOM content. The evaluation model was established using support vector machine regression (SVM) with sensitive wavelengths. The results showed that NWMA + FD was the best of the four pretreatment methods. The sensitive wavelengths based on NWMA + FD were 917, 991, 1 007, 1 996, and 2 267 nm. The SVM model established with the above-mentioned five sensitive wavelengths was significant (R^(2) = 0.875, root mean square error (RMSE) = 0.107 g kg^(-1) for calibration set; R^(2) = 0.853, RMSE = 0.097 g kg^(-1) for validation set). The results indicate that hyperspectral remote sensing can quickly and accurately predict SOM content in the brown forest soil areas of the Shandong Peninsula. This is a novel approach for rapid monitoring and accurate diagnosis of brown forest soil nutrients.
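The four spectral pretreatments named in this abstract can be sketched in a few lines. This is an illustration under assumptions, not the study's code: the abstract does not state the nine weights, so a triangular window is assumed here; the first derivative is taken as a simple forward difference between adjacent bands; and the input spectrum is synthetic.

```python
from statistics import mean, stdev

# Assumed triangular weights for the nine-point window (not given in the abstract).
NINE_POINT_WEIGHTS = [1, 2, 3, 4, 5, 4, 3, 2, 1]


def weighted_moving_average(spectrum, weights):
    """Smooth with a centred weighted window; edge bands without a full window are dropped."""
    half = len(weights) // 2
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, spectrum[i - half:i + half + 1])) / total
        for i in range(half, len(spectrum) - half)
    ]


def first_derivative(spectrum):
    """Forward difference between adjacent bands."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def standard_normal_variate(spectrum):
    """Centre each spectrum on its own mean and scale by its own standard deviation."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def min_max(spectrum):
    """Rescale a spectrum linearly to the range [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    return [(v - lo) / (hi - lo) for v in spectrum]


# Synthetic reflectance values with alternating noise (illustrative only).
raw = [0.30 + 0.01 * i + (0.02 if i % 2 else -0.02) for i in range(20)]

nwma = weighted_moving_average(raw, NINE_POINT_WEIGHTS)
nwma_fd = first_derivative(nwma)
nwma_snv = standard_normal_variate(nwma)
nwma_ms = min_max(nwma)
print(len(raw), len(nwma), len(nwma_fd))
```

Each transformed spectrum would then be correlated with measured SOM content to pick sensitive wavelengths before fitting a regression model.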
Abstract: Hybrid reactive power compensation (HRPC) combines step-controlled shunt reactors and series compensation, and will be employed in ultra-high-voltage (UHV) power systems. The single-phase auto-reclosure characteristics of secondary arcs in systems with HRPC require further investigation. In this paper, both the arc-recalling voltage and subsidiary variations in arc current are investigated with and without HRPC. The frequency components of the secondary arc current and variations in arcing time are analyzed for various influential factors, such as the neutral reactor, arc resistance, fault location, degrees of compensation of HRPC, and the length of the transmission line. The non-dominated sorting genetic algorithm II (NSGA-II) and support vector machine regression are combined to create a multi-variable dependent forecasting algorithm to predict the characteristics of the secondary arc in UHV systems with HRPC. This paper provides a theoretical reference for optimizing the parameters of HRPC, and for developing adaptive auto-reclosure schemes and protection equipment.
Funding: co-supported by Aeronautical Science Foundation of China (No. 2010ZB52011); Funding of Jiangsu Innovation Program for Graduate Education (No. CXLX11_0213)
Abstract: In order to establish an adaptive turbo-shaft engine model with high accuracy, a new modeling method based on a parameter selection (PS) algorithm and a multi-input multi-output recursive reduced least square support vector regression (MRR-LSSVR) machine is proposed. Firstly, the PS algorithm is designed to choose the most reasonable inputs of the adaptive module. During this process, a wrapper criterion based on the least square support vector regression (LSSVR) machine is adopted, which can not only reduce computational complexity but also enhance generalization performance. Secondly, with the input variables determined by the PS algorithm, a mapping model of engine parameter estimation is trained off-line using MRR-LSSVR, which has a satisfying accuracy within 5&. Finally, based on a numerical simulation platform of an integrated helicopter/turbo-shaft engine system, an adaptive turbo-shaft engine model is developed and tested in a certain flight envelope. Under the condition of single or multiple engine components being degraded, many simulation experiments are carried out, and the simulation results show the effectiveness and validity of the proposed adaptive modeling method.
Abstract: Combustion kinetic parameters (i.e., activation energy and frequency factor) of coal have been proven to relate closely to coal properties; however, the quantitative relationship between them still requires further study. This paper adopts a support vector regression machine (SVR) to generate the models of the non-linear relationship between combustion kinetic parameters and coal quality. Kinetic analyses on the thermo-gravimetry (TG) data of 80 coal samples were performed to prepare training data and testing data for the SVR. The models developed were used in the estimation of the combustion kinetic parameters of ten testing samples. The predicted results showed that the root mean square errors (RMSEs) were 2.571 for the activation energy and 0.565 for the frequency factor in logarithmic form, respectively. TG curves defined by predicted kinetic parameters were fitted to the experimental data with a high degree of precision.
Abstract: In this work, two chemometrics methods are applied for the modeling and prediction of electrophoretic mobilities of some organic and inorganic compounds. The successive projection algorithm (SPA), a feature selection strategy, is used as the descriptor selection and model development method. Then, the support vector machine (SVM) and multiple linear regression (MLR) models are utilized to construct the non-linear and linear quantitative structure-property relationship (QSPR) models. A comparison of the results obtained using the SVM model with those obtained using MLR reveals that the SVM model has much better predictive value than the MLR one. The root-mean-square errors for the training set and the test set for the SVM model were 0.1911 and 0.2569, respectively, while for the MLR model they were 0.4908 and 0.6494, respectively. The results show that the SVM model drastically enhances the ability of prediction in QSPR studies and is superior to the MLR model.
Funding: This work was supported in part by National Science Foundation of China under Grant No. 70171015
Abstract: Credit scoring has become a critical and challenging management science issue as the credit industry has been facing stiffer competition in recent years. Many classification methods have been suggested to tackle this problem in the literature. In this paper, we investigate the performance of various credit scoring models and the corresponding credit risk cost for three real-life credit scoring data sets. Besides the well-known classification algorithms (e.g., linear discriminant analysis, logistic regression, neural networks, and k-nearest neighbor), we also investigate the suitability and performance of some recently proposed, advanced data mining techniques such as support vector machines (SVMs), classification and regression tree (CART), and multivariate adaptive regression splines (MARS). The performance is assessed by using the classification accuracy and the cost of credit scoring errors. The experimental results show that SVM, MARS, logistic regression, and neural networks yield very good performance. However, the explanatory capability of CART and MARS outperforms that of the other methods.
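The two assessment criteria named in this abstract, classification accuracy and the cost of credit scoring errors, can be computed as in the sketch below. The labels, the predictions, and the 5:1 cost ratio are assumptions chosen for illustration; the abstract does not give the cost values used in the study.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes with label 1 = bad credit risk and 0 = good credit risk."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Share of applicants classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def average_error_cost(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """Mean misclassification cost per applicant.

    cost_fn: cost of accepting a bad risk (assumed value).
    cost_fp: cost of rejecting a good risk (assumed value).
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (fn * cost_fn + fp * cost_fp) / len(y_true)


# Hypothetical labels and model predictions for ten applicants.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

print(accuracy(actual, predicted))            # 0.8
print(average_error_cost(actual, predicted))  # 0.6
```

Two models with the same accuracy can differ sharply in cost when the two error types are priced differently, which is why the abstract reports both measures.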