Near infrared reflectance spectroscopy (NIRS), a non-destructive measurement technique, was combined with partial least squares regression discriminant analysis (PLS-DA) to discriminate transgenic (TCTP and mi166) rice from wild-type (Zhonghua 11) rice. Furthermore, rice lines transformed with a protein gene (OsTCTP) and a regulation gene (Osmi166) were also discriminated from each other by the NIRS method. The performance of PLS-DA in the spectral ranges 4 000-8 000 cm⁻¹ and 4 000-10 000 cm⁻¹ was compared to identify the optimal spectral range. The transgenic and wild-type rice were distinguished from each other in the range 4 000-10 000 cm⁻¹, with a correct classification rate of 100.0% in the validation test. The transgenic rice lines TCTP and mi166 were also distinguished from each other in the range 4 000-10 000 cm⁻¹, again with a correct classification rate of 100.0%. In conclusion, NIRS combined with PLS-DA can be used to discriminate transgenic rice.
Estimating wheat grain protein content by remote sensing is important for assessing wheat quality at maturity and for setting grain harvest and purchase policies. However, spatial variability in soil condition, temperature, and precipitation affects grain protein content, and these factors usually cannot be monitored accurately by remote sensing data from a single image. In this research, the relationships between wheat protein content at maturity and wheat agronomic parameters at different growth stages were analyzed, and multi-temporal Landsat TM images were used to estimate grain protein content by partial least squares regression. Experimental data were acquired in the suburbs of Beijing during a 2-yr experiment from 2003 to 2004. The determination coefficient, the average deviation of self-modeling, and the deviation of cross-validation were used to assess the estimation accuracy of wheat grain protein content. Their values were 0.88, 1.30%, and 3.81% for 2003 and 0.72, 5.22%, and 12.36% for 2004, respectively. The research laid an agronomic foundation for grain protein content (GPC) estimation by multi-temporal remote sensing. The results showed that it is feasible to estimate the GPC of wheat over large areas from multi-temporal remote sensing data.
The water distribution system of a residential district in Tianjin is taken as an example to analyze changes in water quality. A partial least squares (PLS) regression model, in which turbidity and Fe are regarded as control objectives, is used to establish the statistical model. The experimental results indicate that the PLS regression model predicts water quality well compared with the monitored data. The percentages of predictions with absolute relative error below 15%, 20%, and 30% are 44.4%, 66.7%, and 100% (turbidity) and 33.3%, 44.4%, and 77.8% (Fe) at the 4th sampling point, and 77.8%, 88.9%, and 88.9% (turbidity) and 44.4%, 55.6%, and 66.7% (Fe) at the 5th sampling point.
Computer-aided partial least squares is introduced to simultaneously determine the contents of deoxyschizandrin, schisandrin, and γ-schisandrin in an extract solution of wuweizi. Regression analysis of the experimental results shows that the average recovery of each component lies in the range 98.9% to 110.3%, which means that partial least squares regression spectrophotometry can circumvent the overlapping of the absorption spectra of multiple components, so that satisfactory results can be obtained without any pre-separation step.
Partial least squares (PLS) regression is an important linear regression method that efficiently addresses the multiple correlation problem by combining principal component analysis and multiple regression. In this paper, we present a quantum partial least squares (QPLS) regression algorithm. To reduce the high time complexity of PLS regression, we design a quantum eigenvector search method to speed up the construction of principal components and regression parameters. We also give a density matrix product method that avoids multiple accesses to quantum random access memory (QRAM) while building residual matrices. The time and space complexities of QPLS regression are logarithmic in the independent variable dimension n, the dependent variable dimension w, and the number of variables m. The algorithm achieves exponential speed-ups over PLS regression in n, m, and w. In addition, QPLS regression inspires us to explore more potential quantum machine learning applications in future work.
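The classical PLS regression that the abstract above takes as its starting point can be sketched as follows. This is a minimal, dependency-free illustration of single-response PLS (PLS1) using the standard NIPALS iteration, not the quantum algorithm and not the authors' code; the function names `pls1_fit` and `pls1_predict` and the toy data are assumptions made for illustration.

```python
# Minimal PLS1 (one response variable) via NIPALS. Illustrative sketch only;
# function names and data layout (list of rows) are assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """Fit PLS1 on mean-centered copies of X (list of rows) and y (list)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            break  # no covariance left to explain
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # scores (latent component)
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]  # X loadings
        q = dot(f, t) / tt                                                # y loading
        # Deflate: remove what this component explains from X and y.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict y for one sample by replaying the deflation steps."""
    x_mean, y_mean, comps = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat

# Toy data with perfectly collinear predictors (x2 = 2 * x1), y = 3 * x1 + 1.
model = pls1_fit([[1, 2], [2, 4], [3, 6], [4, 8]], [4, 7, 10, 13], n_components=2)
prediction = pls1_predict(model, [5, 10])
```

On the toy data the two predictors are exactly collinear, which would make ordinary least squares ill-posed; the sketch extracts a single latent component and stops, which is the sense in which PLS sidesteps the multiple correlation problem.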
Pseudomonas spp. and Enterobacteriaceae are the dominant spoilage bacteria in chicken during cold storage (0°C-4°C). In this study, high-resolution spectra in the range 900-1700 nm were acquired and preprocessed using Savitzky-Golay convolution smoothing (SGCS), standard normal variate (SNV), and multiplicative scatter correction (MSC), respectively, and then mined using the partial least squares (PLS) algorithm to relate them to the total counts of Pseudomonas spp. and Enterobacteriaceae (PEC) of fresh chicken breasts, so that PEC could be predicted rapidly. The results showed that, over the full 900-1700 nm wavelength range, the MSC-PLS model built with MSC spectra performed better than the PLS models built with other spectra (RAW-PLS, SGCS-PLS, SNV-PLS), with a correlation coefficient (RP) of 0.954, a root mean square error of prediction (RMSEP) of 0.396 log10 CFU/g, and a residual predictive deviation (RPD) of 3.33 in the prediction set. Based on the 12 optimal wavelengths (902.2 nm, 905.5 nm, 923.6 nm, 938.4 nm, 946.7 nm, 1025.7 nm, 1124.4 nm, 1211.6 nm, 1269.2 nm, 1653.7 nm, 1691.8 nm, and 1693.4 nm) selected from the MSC spectra by the successive projections algorithm (SPA), the SPA-MSC-PLS model had an RP of 0.954, an RMSEP of 0.397 log10 CFU/g, and an RPD of 3.32, similar to the MSC-PLS model. Overall, the study indicated that NIR spectra combined with the PLS algorithm could be used to detect the PEC of chicken flesh rapidly and non-destructively.
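Two of the preprocessing steps named in the abstract above, SNV and MSC, can be sketched in a few lines. This is a minimal illustration of the textbook definitions, not the authors' pipeline; the function names, the use of the sample standard deviation in SNV, and the default choice of the mean spectrum as the MSC reference are assumptions.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and (sample) standard deviation."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]

def msc(spectra, reference=None):
    """Multiplicative scatter correction: fit each spectrum as
    a + b * reference by least squares, then return (spectrum - a) / b.
    The reference defaults to the mean spectrum (an assumption)."""
    n_w = len(spectra[0])
    if reference is None:
        reference = [mean(s[j] for s in spectra) for j in range(n_w)]
    r_mean = mean(reference)
    r_ss = sum((r - r_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        b = sum((r - r_mean) * (v - s_mean) for r, v in zip(reference, s)) / r_ss
        a = s_mean - b * r_mean
        corrected.append([(v - a) / b for v in s])
    return corrected

# A spectrum that is an offset-and-scaled copy of the reference is mapped
# back onto the reference by MSC.
ref = [1.0, 2.0, 3.0, 4.0]
scattered = [0.5 + 2.0 * r for r in ref]
restored = msc([scattered], reference=ref)[0]
```

SNV normalizes each spectrum independently, whereas MSC needs a reference spectrum, so an MSC model must reuse the calibration-set reference when it is applied to new samples.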
This study presented the application of partial least squares regression (PLSR) to estimating daily pan evaporation, exploiting the ability of PLSR to eliminate collinearity among predictor variables. Climate variables and daily pan evaporation data measured at two weather stations near Elephant Butte Reservoir, New Mexico, USA, and at a weather station in Shanshan County, Xinjiang, China, were used in the study. The nonlinear relationship between the climate variables and daily pan evaporation was successfully modeled using the PLSR approach by resolving the collinearity that exists among the climate variables. The modeling results were compared with those of artificial neural network (ANN) models using the same input variables. The results showed that the nonlinear equations developed using PLSR perform similarly to the complex ANN approach at the study sites. The modeling process was straightforward, and the equations were simpler and more explicit than the ANN black-box models.
Because the solutions of the least squares support vector regression machine (LS-SVRM) are not sparse, prediction is slow, which limits its applications. The defects of the existing adaptive pruning algorithm for LS-SVRM are that training is slow and the generalization performance is unsatisfactory, especially for large-scale problems. Hence an improved algorithm is proposed. To accelerate training, the pruned data point and the fast leave-one-out error are employed to validate the temporary model obtained after decremental learning. A novel objective function in the termination condition, which involves all the constraints generated by all training data points, and three pruning strategies are employed to improve the generalization performance. The effectiveness of the proposed algorithm is tested on six benchmark datasets. The sparse LS-SVRM model has a faster training speed and better generalization performance.
To overcome the disadvantage that the standard least squares support vector regression (LS-SVR) algorithm is not directly suitable for multiple-input multiple-output (MIMO) system modelling, an improved LS-SVR algorithm, defined as multi-output least squares support vector regression (MLSSVR), was put forward by adding the samples' absolute errors to the objective function, and was applied to intelligent flatness control. To solve the poor precision of the control scheme based on the effective matrix in flatness control, predictive control was introduced into the control system, and an effective matrix-predictive flatness control method was proposed by combining the merits of the two methods. A simulation experiment was conducted on a 900HC reversible cold rolling mill. The performance of the effective matrix method and of the effective matrix-predictive control method was compared, and the results demonstrate the validity of the effective matrix-predictive control method.
A multiple-output least squares support vector regression (LS-SVR) method was developed and described in detail, with the radial basis function (RBF) as the kernel function. The method was applied to predict the future state of the power-shift steering transmission (PSST). A prediction model of the PSST was obtained with multiple-output LS-SVR. The model performance was greatly influenced by the penalty parameter γ and the kernel parameter σ², which were optimized by cross-validation. The model was trained and used for prediction with spectrometric oil analysis data. The predicted and actual values were compared, and a fault in the second PSST was found. The research showed that this method had good accuracy in PSST fault prediction and that any possible problem in the PSST could be found through comparative analysis.
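The roles of γ and σ² in the abstract above can be made concrete with a minimal single-output LS-SVR sketch. It uses the standard LS-SVM dual formulation, in which training reduces to one linear system, and is not the authors' multiple-output method; the function names, the RBF convention exp(-||x - z||² / (2σ²)) (some texts omit the factor 2), and the toy data are assumptions.

```python
import math

def rbf(x, z, sigma2):
    """RBF kernel; the 2 * sigma2 scaling is one common convention (assumed)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sigma2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def lssvr_fit(X, y, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma2) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvr_predict(X_train, b, alpha, sigma2, x):
    return b + sum(a * rbf(x, xi, sigma2) for a, xi in zip(alpha, X_train))

# Hypothetical toy data, for illustration only.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 1.0, 4.0, 9.0]
b, alpha = lssvr_fit(X_train, y_train, gamma=10.0, sigma2=1.0)
```

In this formulation each training residual equals alpha_i / γ, so a larger γ fits the training data more tightly while σ² sets the kernel width. Every training point generally receives a nonzero alpha_i, which is why LS-SVR solutions are not sparse.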
Pruning algorithms for the sparse least squares support vector regression machine are common and easily comprehensible, but the computational burden in the training phase is heavy because of the retraining performed during pruning, which hinders their application. To this end, an improved scheme is proposed to accelerate the sparse least squares support vector regression machine. A major advantage of the new scheme is that it is based on an iterative methodology, which reuses the previous training results instead of retraining, and its feasibility is rigorously verified theoretically. Finally, experiments on benchmark datasets corroborate a significant saving in training time, with the same number of support vectors and the same predictive accuracy as the original pruning algorithms. The speedup scheme is also extended to classification problems.
When rice evapotranspiration is calculated from weather factors, some of the independent variables are often found to be multicollinear. This can distort traditional multivariate regression models based on the least squares method and make them unstable. In this paper, the model is instead built with partial least squares regression: applying the ideas of principal component analysis and canonical correlation analysis, components are extracted from the original data. A model of rice evapotranspiration is thereby established that resolves the multicollinearity among the independent variables (the weather factors). Finally, the model is analyzed, and satisfactory results are obtained.
Laser-induced breakdown spectroscopy (LIBS) has become a widely used atomic spectroscopic technique for rapid coal analysis. However, the vast amount of spectral information in LIBS contains signal uncertainty, which can affect its quantification performance. In this work, we propose a hybrid variable selection method to improve the performance of LIBS quantification. Important variables are first identified using Pearson's correlation coefficient, mutual information, the least absolute shrinkage and selection operator (LASSO), and random forest, and are then filtered and combined with empirical variables related to fingerprint elements of coal ash content. Subsequently, these variables are fed into a partial least squares regression (PLSR). Additionally, in some models, certain variables unrelated to ash content are removed manually to study the impact of variable deselection on model performance. The proposed hybrid strategy was tested on three LIBS datasets for quantitative analysis of coal ash content and compared with the corresponding data-driven baseline method. It is significantly better than variable selection based on empirical knowledge alone and in most cases outperforms the baseline method. On all three datasets, the hybrid variable selection strategy combining empirical knowledge and data-driven algorithms achieved the lowest root mean square error of prediction (RMSEP) values, 1.605, 3.478, and 1.647, respectively, which were significantly lower than the values of 1.959, 3.718, and 2.181 obtained from multiple linear regression using only 12 empirical variables. The LASSO-PLSR model with empirical support and 20 selected variables performed significantly better after variable deselection, with RMSEP values dropping from 1.635, 3.962, and 1.647 to 1.483, 3.086, and 1.567, respectively. These results demonstrate that using empirical knowledge to support data-driven variable selection can be a viable approach to improving the accuracy and reliability of LIBS quantification.
To handle the huge computational cost of direct numerical simulation, the traditional response surface method (RSM), a classical regression algorithm, is used to approximate the functional relationship between the state variable and the basic variables in reliability design. The algorithm has successfully treated some problems with implicit performance functions in reliability analysis. However, its theoretical basis of empirical risk minimization narrows its range of applications for...
“Breeding by design” for pure lines may be achieved by constructing an additive QTL-allele matrix in a germplasm panel or breeding population, but this option is not available for hybrids, where both additive and dominance QTL-allele matrices must be constructed. In this study, a hybrid-QTL identification approach, designated PLSRGA, is described for additive and dominance QTL-allele detection in a diallel hybrid population (DHP). It uses partial least squares regression (PLSR) for model fitting, integrated with a genetic algorithm (GA) for variable selection, based on a multi-locus, multi-allele model. PLSRGA was shown by simulation experiments to be superior to single-marker analysis and was then used for QTL-allele identification in a soybean DHP yield experiment with eight parents. Twenty-eight main-effect QTL with 138 alleles and nine QTL × environment QTL with 46 alleles were identified, contributing 61.8% and 23.5% of phenotypic variation, respectively. Main-effect additive and dominance QTL-allele matrices were established as a compact form of the DHP genetic structure. The mechanism of heterosis superior-to-parents (or superior-to-parents heterosis, SPH) was explored and might be explained by a complementary locus set composed of OD+ (showing positive over-dominance, most often), PD+ (showing positive partial-to-complete dominance, less often), and HA+ (showing positive homozygous additivity, occasionally) loci, depending on the parental materials. Any locus type, whether OD+, PD+, or HA+, could be the best genotype at a locus. All hybrids showed various numbers of better or best genotypes at many but not necessarily all loci, indicating room for further SPH improvement. Based on the additive/dominance QTL-allele matrices, the best hybrid genotype was predicted, and a hybrid improvement approach is suggested. PLSRGA is powerful for hybrid QTL-allele detection and cross-SPH improvement.
Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) Traditional transfer learning algorithms leverage knowledge from different sources, which poses a significant risk to data privacy; we therefore develop a knowledge transfer mechanism that protects the security of source-domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the views' strengths and weaknesses and to improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms the baselines. The source code will be available at https://github.com/didid5/MVTL-LSR.
Background: Fiber maturity is a key cotton quality property, and its variability in a sample impacts fiber processing and dyeing performance. Currently, maturity is determined using established protocols in laboratories under a controlled environment. There is an increasing need to measure fiber maturity using low-cost (in general less than $20000) and small portable systems. In this study, a laboratory feasibility study was performed to assess the ability of the shortwave infrared hyperspectral imaging (SWIR HSI) technique to determine conditioned fiber maturity; for comparison, an expensive (in general greater than $60000) commercial bench-top near infrared (NIR) instrument was used. Results: Although SWIR HSI and NIR are different measurement technologies, consistent spectral characteristics were observed between the two instruments when they were used to measure the maturity of locule fiber samples in seed cotton and of well-defined fiber samples, respectively. Partial least squares (PLS) models were established using different spectral preprocessing parameters to predict fiber maturity. High prediction precision was indicated by a lower root mean square error of prediction (RMSEP) (<0.046), a higher R_p² (>0.518), and a greater percentage (97.0%) of samples within the 95% agreement range over the entire NIR region (1000-2500 nm) excluding the moisture band at 1940 nm. Conclusion: SWIR HSI has good potential for assessing cotton fiber maturity in a laboratory environment.
Accurate assessment of canopy carotenoid content (CC_(x+c)C) in crops is central to monitoring physiological conditions and vegetation stress in plants, and consequently to supporting agronomic decisions. However, because the absorption peaks of carotenoid (C_(x+c)) and chlorophyll (C_(a+b)) overlap, accurate estimation of carotenoid from reflectance at the wavelengths where carotenoids absorb is challenging. The objective of the present study was to assess CC_(x+c)C in winter wheat (Triticum aestivum L.) with ground- and aircraft-based hyperspectral measurements in the visible and near-infrared spectrum. In-situ hyperspectral reflectance was measured and airborne hyperspectral data were acquired during the major growth stages of winter wheat in five consecutive field experiments. At the canopy level, a remarkable linear relationship (R²=0.95, p<0.001) existed between C_(x+c) and C_(a+b), and the correlation between CC_(x+c)C and wavelengths within the 400 to 1000 nm range indicated that CC_(x+c)C could be estimated using reflectance from the visible to the near-infrared wavebands. Results of C_(x+c) assessment based on chlorophyll and carotenoid indices showed that the red edge chlorophyll index (CI red edge) achieved the highest accuracy (R²=0.77, RMSE=22.27 μg/cm², MAE=4.97 μg/cm²). Applying partial least squares regression (PLSR) to CC_(x+c)C retrieval emphasized the significance of reflectance within the 700 to 750 nm range for CC_(x+c)C assessment. Based on the CI red edge index, airborne hyperspectral imagery achieved satisfactory results in mapping the spatial distribution of CC_(x+c)C. This study demonstrates that it is feasible to assess CC_(x+c)C in winter wheat accurately with the red edge chlorophyll index, provided that C_(x+c) correlates well with C_(a+b) at the canopy scale. It is therefore a promising method for CC_(x+c)C retrieval at the regional scale from aerial hyperspectral imagery.
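The red edge chlorophyll index mentioned in the abstract above is commonly defined as the ratio of near-infrared to red-edge reflectance minus one. The sketch below uses that common definition; the abstract does not state which bands were used, so the reflectance values and the function name `ci_red_edge` are hypothetical.

```python
def ci_red_edge(r_nir, r_red_edge):
    """Red edge chlorophyll index: NIR reflectance divided by red-edge
    reflectance, minus one (common definition; band choice is an assumption)."""
    return r_nir / r_red_edge - 1.0

# Hypothetical canopy reflectances, for illustration only.
index_value = ci_red_edge(r_nir=0.45, r_red_edge=0.15)
```

Because the index responds to chlorophyll, using it for carotenoid retrieval relies on the strong canopy-level correlation between C_(x+c) and C_(a+b) that the abstract reports.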
Soil texture is an indicator of soil physical structure, which underpins many ecological functions of soils such as thermal regime, plant growth, and soil quality. However, traditional methods for measuring soil texture are time-consuming and labor-intensive. This study explores an indirect method for rapidly estimating the texture of three subgroups of purple soils (i.e. calcareous, neutral, and acidic). A total of 190 topsoil (0 - 10 cm) samples were collected from sloping croplands in the Tongnan and Beibei Districts of Chongqing Municipality, China. Vis-NIR spectra were measured and processed, and stepwise multiple linear regression (SMLR), partial least squares regression (PLSR), and back propagation neural network (BPNN) models were constructed to estimate the soil texture. The clay fractions ranged from 4.40% to 27.12% and the sand fractions from 0.34% to 36.57%, so the soil samples encompass three textural classes (i.e. silt, silt loam, and silty clay loam). For the original spectrum, the texture of the calcareous and neutral purple soils was not significantly correlated with spectral reflectance, and the linear models (SMLR and PLSR) exhibited low prediction accuracy. The correlation coefficients and the goodness of fit between soil texture and the transformed spectra of all soil groups were increased by the continuum-removal (CR), first-order differential (R'), and second-order differential (R") transformations. Among them, R" performed best in improving the correlation coefficients and the goodness of fit. For the calcareous purple soil, SMLR outperforms PLSR and BPNN, with higher coefficient of determination (R²) and ratio of performance to inter-quartile distance (RPIQ) values and a lower root mean square error of validation (RMSEV), whereas for the neutral and acidic purple soils, the PLSR model has better prediction accuracy. In summary, the linear methods (SMLR and PLSR) are more reliable for estimating the texture of the three purple soil groups by Vis-NIR spectroscopy inversion.
To meet the new requirements of developing flatness control theory and technology, cubic patterns were introduced alongside the traditional linear, quadratic, and quartic basic flatness patterns. Linear, quadratic, cubic, and quartic Legendre orthogonal polynomials were adopted to express the basic flatness patterns. To overcome the defects of existing recognition methods based on fuzzy, neural network, and support vector regression (SVR) theory, a novel flatness pattern recognition method based on least squares support vector regression (LS-SVR) was proposed. On this basis, to determine the hyper-parameters of LS-SVR effectively and to enhance the recognition accuracy and generalization performance of the model, a particle swarm optimization algorithm with the leave-one-out (LOO) error as the fitness function was adopted. To overcome the high computational complexity of the naive cross-validation algorithm, a novel fast cross-validation algorithm was introduced to calculate the LOO error of LS-SVR. Experiments on theoretically calculated flatness data and on flatness signals measured on a 900HC cold-rolling mill demonstrate that the proposed approach can distinguish the types and determine the magnitudes of flatness defects effectively, with high accuracy, high speed, and strong generalization ability.
Funding (NIRS discrimination of transgenic rice): Supported by the projects under the Innovation Team of the Safety Standards and Testing Technology for Agricultural Products of Zhejiang Province, China (Grant No. 2010R50028) and the National Key Technologies R&D Program of China during the 11th Five-Year Plan Period (Grant No. 2006BAK02A18).
Funding (wheat grain protein content estimation): Supported by the National Natural Science Foundation of China (41171281, 40701120) and the Beijing Nova Program, China (2008B33).
Funding (Tianjin water distribution system study): Supported by the National Natural Science Foundation of China (No. 50478086) and the Tianjin Special Scientific Innovation Foundation (No. 06FZZDSH00900).
Funding: Project supported by the Fundamental Research Funds for the Central Universities, China (Grant No. 2019XD-A02); the National Natural Science Foundation of China (Grant Nos. U1636106, 61671087, 61170272, and 92046001); the Natural Science Foundation of Beijing Municipality, China (Grant No. 4182006); the Technological Special Project of Guizhou Province, China (Grant No. 20183001); and the Foundation of Guizhou Provincial Key Laboratory of Public Big Data (Grant Nos. 2018BDKFJJ016 and 2018BDKFJJ018).
Abstract: Partial least squares (PLS) regression is an important linear regression method that efficiently addresses the multiple correlation problem by combining principal component analysis and multiple regression. In this paper, we present a quantum partial least squares (QPLS) regression algorithm. To address the high time complexity of PLS regression, we design a quantum eigenvector search method to speed up the construction of principal components and regression parameters. We also give a density matrix product method that avoids multiple accesses to quantum random access memory (QRAM) while building the residual matrices. The time and space complexities of QPLS regression are logarithmic in the independent variable dimension n, the dependent variable dimension w, and the number of variables m. The algorithm achieves exponential speed-ups over PLS regression in n, m, and w. In addition, QPLS regression inspires us to explore more potential quantum machine learning applications in future work.
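For orientation, the classical (non-quantum) baseline that this abstract refers to can be sketched as single-response PLS fitted with the NIPALS iteration. This is an illustrative sketch under stated assumptions, not the QPLS algorithm and not the authors' code; the function names `pls1_fit` and `pls1_predict` and the toy data are hypothetical.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1) with NIPALS on plain Python lists."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X residual
    f = [v - y_mean for v in y]                                # centered y residual
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]       # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                         # y loading
        # Deflate both residuals before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict one sample by replaying the score/deflation steps on it."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Toy data (hypothetical): y = 2*x1 + 3*x2 exactly.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 7.0]]
y = [2 * a + 3 * b for a, b in X]
model = pls1_fit(X, y, n_components=2)
print(pls1_predict(model, [6.0, 5.0]))
```

Each component costs a few passes over the n-by-m data matrix, which is the classical cost that a quantum variant would aim to reduce.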
Funding: This work was financially supported by the Major Scientific and Technological Project of Henan Province (Grant No. 161100110600); the Key Scientific and Technological Project of Henan Province (Nos. 212102310491 and 182102310060); the China Postdoctoral Science Foundation (No. 2018M632767); the Henan Postdoctoral Science Foundation (No. 001801021); the Youth Talents Support Project of Henan Province (No. 2018HYTP008); and the Bainong Outstanding Talents Project of Henan Institute of Science and Technology (No. BNYC2018-2-27).
Abstract: Pseudomonas spp. and Enterobacteriaceae are dominant spoilage bacteria in chicken during cold storage (0°C-4°C). In this study, high-resolution spectra in the range of 900-1700 nm were acquired and preprocessed using Savitzky-Golay convolution smoothing (SGCS), standard normal variate (SNV), and multiplicative scatter correction (MSC), respectively, and then mined using the partial least squares (PLS) algorithm to relate them to the total counts of Pseudomonas spp. and Enterobacteriaceae (PEC) of fresh chicken breasts, so that PEC could be predicted rapidly. The results showed that, over the full 900-1700 nm wavelength range, the MSC-PLS model built with MSC spectra performed better than PLS models built with the other spectra (RAW-PLS, SGCS-PLS, SNV-PLS), with a correlation coefficient (RP) of 0.954, a root mean square error of prediction (RMSEP) of 0.396 log10 CFU/g, and a residual predictive deviation (RPD) of 3.33 in the prediction set. Based on the 12 optimal wavelengths (902.2 nm, 905.5 nm, 923.6 nm, 938.4 nm, 946.7 nm, 1025.7 nm, 1124.4 nm, 1211.6 nm, 1269.2 nm, 1653.7 nm, 1691.8 nm, and 1693.4 nm) selected from the MSC spectra by the successive projections algorithm (SPA), the SPA-MSC-PLS model had an RP of 0.954, an RMSEP of 0.397 log10 CFU/g, and an RPD of 3.32, similar to the MSC-PLS model. Overall, the study indicated that NIR spectra combined with the PLS algorithm could be used to detect the PEC of chicken flesh in a rapid and non-destructive way.
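Two of the quantities named in this abstract, SNV preprocessing and the RMSEP/RPD figures of merit, are simple enough to sketch directly. The code below is an illustrative sketch, not the authors' pipeline: the function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) for both SNV and RPD is an assumption, since conventions vary.

```python
import math
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # assumption: sample SD (n - 1 denominator)
    return [(v - mean) / sd for v in spectrum]


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def rpd(reference, predicted):
    """Residual predictive deviation: SD of the reference values divided by RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Hypothetical values, for illustration only (not data from the study).
print(snv([0.10, 0.20, 0.30]))
print(rpd([1.0, 2.0, 3.0], [1.5, 2.5, 3.5]))
```

SNV is applied to each spectrum independently, so it needs no reference spectrum, whereas MSC regresses each spectrum against a reference such as the mean spectrum.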
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 51069017 and 41071026). The authors express their sincere appreciation for the reviewers' valuable suggestions and comments, which improved the quality of this paper.
Abstract: This study presented the application of partial least squares regression (PLSR) to estimating daily pan evaporation, exploiting the ability of PLSR to eliminate collinearity among predictor variables. Climate variables and daily pan evaporation data measured at two weather stations near Elephant Butte Reservoir, New Mexico, USA, and at a weather station in Shanshan County, Xinjiang, China, were used in the study. The nonlinear relationship between the climate variables and daily pan evaporation was successfully modeled with the PLSR approach by resolving the collinearity among the climate variables. The modeling results were compared with those of artificial neural network (ANN) models using the same input variables. The results showed that the nonlinear equations developed using PLSR perform similarly to the more complex ANN approach at the study sites. The modeling process was straightforward, and the equations were simpler and more explicit than the ANN black-box models.
Funding: Supported by the National Natural Science Foundation of China (61074127).
Abstract: Because the solutions of the least squares support vector regression machine (LS-SVRM) are not sparse, prediction is slow, which limits its applications. The defects of the existing adaptive pruning algorithm for LS-SVRM are that the training speed is slow and the generalization performance is not satisfactory, especially for large-scale problems. Hence an improved algorithm is proposed. To accelerate training, the pruned data point and the fast leave-one-out error are employed to validate the temporary model obtained after decremental learning. A novel objective function in the termination condition, which involves all the constraints generated by all training data points, and three pruning strategies are employed to improve the generalization performance. The effectiveness of the proposed algorithm is tested on six benchmark datasets. The sparse LS-SVRM model has a faster training speed and better generalization performance.
Funding: Project (50675186) supported by the National Natural Science Foundation of China.
Abstract: To overcome the disadvantage that the standard least squares support vector regression (LS-SVR) algorithm is not directly suitable for modelling multiple-input multiple-output (MIMO) systems, an improved LS-SVR algorithm, defined as multi-output least squares support vector regression (MLSSVR), was put forward by adding the samples' absolute errors to the objective function, and it was applied to intelligent flatness control. To solve the poor precision of the control scheme based on the effective matrix in flatness control, predictive control was introduced into the control system, and the effective matrix-predictive flatness control method was proposed by combining the merits of the two methods. A simulation experiment was conducted on a 900HC reversible cold rolling mill. The performance of the effective matrix method and that of the effective matrix-predictive control method were compared, and the results demonstrate the validity of the effective matrix-predictive control method.
Funding: Supported by the Ministerial Level Advanced Research Foundation (3031030) and the "111" Project (B08043).
Abstract: A multiple-output least squares support vector regression (LS-SVR) method was developed and described in detail, with the radial basis function (RBF) as the kernel function. The method was applied to predict the future state of the power-shift steering transmission (PSST). A prediction model of the PSST was obtained with multiple-output LS-SVR. The model performance was strongly influenced by the penalty parameter γ and the kernel parameter σ², which were optimized by cross-validation. The model was trained and used for prediction with spectrometric oil analysis data. The predicted and actual values were compared, and a fault in the second PSST was found. The research showed that this method has good accuracy in PSST fault prediction and that any possible problem in the PSST could be found through comparative analysis.
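The roles of the penalty parameter γ and the kernel parameter σ² are easiest to see in the standard single-output LS-SVR, where training reduces to one linear system. The sketch below is illustrative only and is not the multiple-output method of the paper: the function names and data are hypothetical, and the RBF kernel is assumed to take the form exp(-||a - b||² / (2σ²)), which is one of several conventions in use.

```python
import math


def rbf(a, b, sigma2):
    """RBF kernel; the 2*sigma2 scaling is an assumed convention."""
    return math.exp(-sum((u - v) ** 2 for u, v in zip(a, b)) / (2.0 * sigma2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def lssvr_fit(X, y, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma2) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha


def lssvr_predict(X, b, alpha, sigma2, x_new):
    return b + sum(a * rbf(x_new, xi, sigma2) for a, xi in zip(alpha, X))


# Hypothetical one-dimensional data, for illustration only.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 0.8, 0.9, 0.1]
gamma, sigma2 = 100.0, 1.0
b, alpha = lssvr_fit(X, y, gamma, sigma2)
print(lssvr_predict(X, b, alpha, sigma2, [1.5]))
```

A larger γ forces the fit closer to the training targets, while σ² sets the width of each kernel bump, which is why both are usually tuned together by cross-validation. Every training point receives a generally nonzero weight in `alpha`, so the solution is not sparse.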
Funding: Supported by the National Natural Science Foundation of China (50576033).
Abstract: Pruning algorithms for the sparse least squares support vector regression machine are common and easily comprehensible, but the computational burden in the training phase is heavy because of the retraining performed during the pruning process, which hinders their application. To this end, an improved scheme is proposed to accelerate the sparse least squares support vector regression machine. A major advantage of this new scheme is that it is based on an iterative methodology, which reuses the previous training results instead of retraining, and its feasibility is rigorously verified theoretically. Finally, experiments on benchmark data sets corroborate a significant saving in training time, with the same number of support vectors and the same predictive accuracy as the original pruning algorithms. The speedup scheme is also extended to classification problems.
Abstract: When calculating rice evapotranspiration from weather factors, some of the independent variables are often found to be multicollinear. This can distort a traditional multivariate regression model based on the least squares method and make the model unstable. In this paper, the model is built with partial least squares regression: applying the ideas of principal component analysis and canonical correlation analysis, components are extracted from the original data. A model of rice evapotranspiration is thus built that resolves the multicollinearity among the independent variables (the weather factors). Finally, the model is analyzed in several respects, and satisfactory results are obtained.
Funding: Financial support from the National Natural Science Foundation of China (No. 62205172), the Huaneng Group Science and Technology Research Project (No. HNKJ22-H105), the Tsinghua University Initiative Scientific Research Program, and the International Joint Mission on Climate Change and Carbon Neutrality.
Abstract: Laser-induced breakdown spectroscopy (LIBS) has become a widely used atomic spectroscopic technique for rapid coal analysis. However, the vast amount of spectral information in LIBS contains signal uncertainty, which can affect its quantification performance. In this work, we propose a hybrid variable selection method to improve the performance of LIBS quantification. Important variables are first identified using Pearson's correlation coefficient, mutual information, the least absolute shrinkage and selection operator (LASSO), and random forest, and then filtered and combined with empirical variables related to fingerprint elements of coal ash content. Subsequently, these variables are fed into a partial least squares regression (PLSR). Additionally, in some models, certain variables unrelated to ash content are removed manually to study the impact of variable deselection on model performance. The proposed hybrid strategy was tested on three LIBS datasets for quantitative analysis of coal ash content and compared with the corresponding data-driven baseline method. It is significantly better than the variable selection method based only on empirical knowledge and in most cases outperforms the baseline method. The results showed that, on all three datasets, the hybrid strategy for variable selection combining empirical knowledge and data-driven algorithms achieved the lowest root mean square error of prediction (RMSEP) values of 1.605, 3.478, and 1.647, respectively, significantly lower than the values of 1.959, 3.718, and 2.181 obtained from multiple linear regression using only 12 empirical variables. The LASSO-PLSR model with empirical support and 20 selected variables exhibited significantly improved performance after variable deselection, with RMSEP values dropping from 1.635, 3.962, and 1.647 to 1.483, 3.086, and 1.567, respectively. These results demonstrate that using empirical knowledge to support data-driven variable selection can be a viable approach to improving the accuracy and reliability of LIBS quantification.
Funding: National High-tech Research and Development Program (2006AA04Z405).
Abstract: To cope with the huge computational cost of direct numerical simulation, the traditional response surface method (RSM), a classical regression algorithm, is used to approximate the functional relationship between the state variable and the basic variables in reliability design. The algorithm has successfully treated some problems with implicit performance functions in reliability analysis. However, its theoretical basis of empirical risk minimization narrows its range of applications for...
Funding: Supported by the National Key Research and Development Program of China (2021YFF1001204, 2017YFD0101500); the MOE Program of Introducing Talents of Discipline to Universities ("111" Project, B08025); the MOE Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT_17R55); the MARA CARS-04 Program; the Jiangsu Higher Education PAPD Program; the Fundamental Research Funds for the Central Universities (KYZZ201901); and the Jiangsu JCICMCP Program.
Abstract: "Breeding by design" for pure lines may be achieved by constructing an additive QTL-allele matrix in a germplasm panel or breeding population, but this option is not available for hybrids, where both additive and dominance QTL-allele matrices must be constructed. In this study, a hybrid-QTL identification approach, designated PLSRGA, is described for additive and dominance QTL-allele detection in a diallel hybrid population (DHP). It uses partial least squares regression (PLSR) for model fitting, integrated with a genetic algorithm (GA) for variable selection, based on a multi-locus, multi-allele model. PLSRGA was shown by simulation experiments to be superior to single-marker analysis and was then used for QTL-allele identification in a soybean DHP yield experiment with eight parents. Twenty-eight main-effect QTL with 138 alleles and nine QTL × environment QTL with 46 alleles were identified, contributing 61.8% and 23.5% of the phenotypic variation, respectively. Main-effect additive and dominance QTL-allele matrices were established as a compact form of the DHP genetic structure. The mechanism of heterosis superior-to-parents (or superior-to-parents heterosis, SPH) was explored and might be explained by a complementary locus set composed of OD+ (showing positive over-dominance, most often), PD+ (showing positive partial-to-complete dominance, less often), and HA+ (showing positive homozygous additivity, occasionally) loci, depending on the parental materials. Any locus type, whether OD+, PD+, or HA+, could be the best genotype of a locus. All hybrids showed various numbers of better or best genotypes at many but not necessarily all loci, indicating room for further SPH improvement. Based on the additive and dominance QTL-allele matrices, the best hybrid genotype was predicted, and a hybrid improvement approach is suggested. PLSRGA is powerful for hybrid QTL-allele detection and cross-SPH improvement.
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 82072019); the Shenzhen Basic Research Program (JCYJ20210324130209023) of the Shenzhen Science and Technology Innovation Committee; the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019); the Natural Science Foundation of Jiangsu Province (No. BK20201441); the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038 and SBGJ202102056); the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015); the Natural Science Foundation of Henan Province (222300420575); the Henan Province Science and Technology Research (222102310322); and the Jiangsu Students' Innovation and Entrepreneurship Training Program (202110304096Y).
Abstract: Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) Traditional transfer learning algorithms leverage knowledge from different sources, which poses a significant risk to data privacy; we therefore develop a knowledge transfer mechanism that protects the security of source-domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the views' strengths and weaknesses and to improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms the baselines. The source code will be available at https://github.com/didid5/MVTL-LSR.
Funding: Supported partially by USDA-ARS Research Project #6054-44000-080-00D.
Abstract: Background: Fiber maturity is a key cotton quality property, and its variability in a sample affects fiber processing and dyeing performance. Currently, maturity is determined using established protocols in laboratories under a controlled environment. There is an increasing need to measure fiber maturity with low-cost (in general less than $20000), small, portable systems. In this study, a laboratory feasibility study was performed to assess the ability of the shortwave infrared hyperspectral imaging (SWIR HSI) technique to determine conditioned fiber maturity; for comparison, an expensive (in general greater than $60000) commercial bench-top near infrared (NIR) instrument was used. Results: Although SWIR HSI and NIR represent different measurement technologies, consistent spectral characteristics were observed between the two instruments when they were used to measure the maturity of the locule fiber samples in seed cotton and of the well-defined fiber samples, respectively. Partial least squares (PLS) models were established using different spectral preprocessing parameters to predict fiber maturity. High prediction precision was indicated by a lower root mean square error of prediction (RMSEP) (<0.046), a higher R_(p)^(2) (>0.518), and a greater percentage (97.0%) of samples within the 95% agreement range over the entire NIR region (1000-2500 nm) excluding the moisture band at 1940 nm. Conclusion: SWIR HSI has good potential for assessing cotton fiber maturity in a laboratory environment.
Funding: Supported by the Fundamental Research Funds for the Provincial Universities of Zhejiang (Project No. GK229909299001-302); the National Natural Science Foundation of China (Project No. 41901268); the Natural Science Foundation of Zhejiang Province (Project No. LQ19D010009); and the Provincial Education Department General Scientific Research Items (Project No. Y202249845).
Abstract: Accurate assessment of canopy carotenoid content (CC_(x+c)C) in crops is central to monitoring the physiological condition of plants and vegetation stress, and consequently to supporting agronomic decisions. However, because the absorption peaks of carotenoid (C_(x+c)) and chlorophyll (C_(a+b)) overlap, accurately estimating carotenoid from reflectance in the region where carotenoid absorbs is challenging. The objective of the present study was to assess CC_(x+c)C in winter wheat (Triticum aestivum L.) with ground- and aircraft-based hyperspectral measurements in the visible and near-infrared spectrum. In-situ hyperspectral reflectance was measured and airborne hyperspectral data were acquired during the major growth stages of winter wheat in five consecutive field experiments. At the canopy level, a remarkable linear relationship (R^(2)=0.95, p<0.001) existed between C_(x+c) and C_(a+b), and the correlation between CC_(x+c)C and wavelengths within the 400 to 1000 nm range indicated that CC_(x+c)C could be estimated using reflectance from the visible to the near-infrared wavebands. Results of C_(x+c) assessment based on chlorophyll and carotenoid indices showed that the red edge chlorophyll index (CI red edge) achieved the highest accuracy (R^(2)=0.77, RMSE=22.27 μg/cm^(2), MAE=4.97 μg/cm^(2)). Applying partial least squares regression (PLSR) to CC_(x+c)C retrieval emphasized the significance of reflectance within the 700 to 750 nm range in CC_(x+c)C assessment. Based on the CI red edge index, airborne hyperspectral imagery achieved satisfactory results in mapping the spatial distribution of CC_(x+c)C. This study demonstrates that it is feasible to accurately assess CC_(x+c)C in winter wheat with the red edge chlorophyll index, provided that C_(x+c) correlates well with C_(a+b) at the canopy scale. It is therefore a promising method for CC_(x+c)C retrieval at the regional scale from aerial hyperspectral imagery.
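The red edge chlorophyll index is commonly defined as the ratio of near-infrared to red-edge reflectance minus one. The sketch below illustrates that common definition only: the abstract does not state which bands the authors used, so the default band limits (710-730 nm and 780-800 nm), the function names, and the sample spectrum are all assumptions.

```python
def band_mean(spectrum, lo_nm, hi_nm):
    """Mean reflectance of the samples whose wavelength lies in [lo_nm, hi_nm]."""
    values = [r for wl, r in spectrum.items() if lo_nm <= wl <= hi_nm]
    if not values:
        raise ValueError("no samples in the requested band")
    return sum(values) / len(values)


def ci_red_edge(spectrum, red_edge=(710, 730), nir=(780, 800)):
    """CI red edge = R_NIR / R_red_edge - 1; band limits are assumed defaults."""
    r_red_edge = band_mean(spectrum, *red_edge)
    r_nir = band_mean(spectrum, *nir)
    return r_nir / r_red_edge - 1.0


# Hypothetical canopy spectrum: wavelength (nm) -> reflectance.
spectrum = {720: 0.2, 730: 0.3, 780: 0.5, 790: 0.5}
print(ci_red_edge(spectrum))
```

Because the index responds mainly to chlorophyll, it can serve as a proxy for carotenoid only when the two pigments are strongly correlated at the canopy scale, which is the condition the abstract states.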
Abstract: Soil texture is an indicator of soil physical structure, which underpins many ecological functions of soils such as thermal regime, plant growth, and soil quality. However, traditional methods for measuring soil texture are time-consuming and labor-intensive. This study explores an indirect method for rapidly estimating the texture of three subgroups of purple soils (i.e., calcareous, neutral, and acidic). A total of 190 topsoil (0 - 10 cm) samples were collected from sloping croplands in the Tongnan and Beibei Districts of Chongqing Municipality in China. Vis-NIR spectra were measured and processed, and stepwise multiple linear regression (SMLR), partial least squares regression (PLSR), and back propagation neural network (BPNN) models were constructed to estimate soil texture. The clay fractions ranged from 4.40% to 27.12%, while the sand fractions ranged from 0.34% to 36.57%, so the soil samples encompass three textural classes (i.e., silt, silt loam, and silty clay loam). For the original spectrum, the texture of the calcareous and neutral purple soils was not significantly correlated with spectral reflectance, and the linear models (SMLR and PLSR) exhibited low prediction accuracy. The correlation coefficients and the goodness of fit between soil texture and the transformed spectra of all soil groups were increased by continuum-removal (CR), first-order differential (R'), and second-order differential (R") transformations. Among them, R" performed best in improving the correlation coefficients and the goodness of fit. For the calcareous purple soil, SMLR outperforms PLSR and BPNN, with higher coefficient of determination (R^(2)) and ratio of performance to inter-quartile distance (RPIQ) values and a lower root mean square error of validation (RMSEV), but for the neutral and acidic purple soils, the PLSR model has better prediction accuracy. In summary, the linear methods (SMLR and PLSR) are more reliable for estimating the texture of the three purple soil groups by Vis-NIR spectroscopy inversion.
Funding: Sponsored by the National Natural Science Foundation of China (50675186).
Abstract: To meet the new requirements of developing flatness control theory and technology, cubic patterns were introduced in addition to the traditional linear, quadratic, and quartic flatness basic patterns. Linear, quadratic, cubic, and quartic Legendre orthogonal polynomials were adopted to express the flatness basic patterns. To overcome the defects of the existing recognition methods based on fuzzy logic, neural networks, and support vector regression (SVR) theory, a novel flatness pattern recognition method based on least squares support vector regression (LS-SVR) was proposed. On this basis, to determine the hyper-parameters of LS-SVR effectively and to enhance the recognition accuracy and generalization performance of the model, a particle swarm optimization algorithm with the leave-one-out (LOO) error as the fitness function was adopted. To overcome the high computational complexity of the naive cross-validation algorithm, a novel fast cross-validation algorithm was introduced to calculate the LOO error of LS-SVR. Results of experiments on theoretically calculated flatness data and on flatness signals measured in practice on a 900HC cold-rolling mill demonstrate that the proposed approach can distinguish the types and determine the magnitudes of flatness defects effectively, with high accuracy, high speed, and strong generalization ability.
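The four basic patterns named above correspond to the Legendre polynomials of degree one to four, which can be written down directly. The sketch below only illustrates those polynomials: the normalization of the strip width to x in [-1, 1], the function names, and the coefficients are assumptions, and the sign conventions used by the authors for individual patterns are not given in the abstract.

```python
def legendre_patterns(x):
    """Linear, quadratic, cubic and quartic Legendre polynomials at x in [-1, 1]."""
    p1 = x
    p2 = (3 * x**2 - 1) / 2
    p3 = (5 * x**3 - 3 * x) / 2
    p4 = (35 * x**4 - 30 * x**2 + 3) / 8
    return p1, p2, p3, p4


def flatness_profile(x, coeffs):
    """Weighted sum of the four basic patterns; coeffs = (a1, a2, a3, a4)."""
    return sum(a * p for a, p in zip(coeffs, legendre_patterns(x)))


# Hypothetical pattern coefficients, sampled at nine points across the strip width.
profile = [flatness_profile(i / 4 - 1, (0.0, 1.0, 0.0, 0.5)) for i in range(9)]
print(profile)
```

Because the polynomials are mutually orthogonal on [-1, 1], the coefficient of each pattern can be interpreted independently, so a recognition model only has to output the four coefficients to describe both the type and the magnitude of a defect.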