Abstract: The UV absorption spectra of o-naphthol, α-naphthylamine, 2,7-dihydroxynaphthalene, 2,4-dimethoxybenzaldehyde and methyl salicylate overlap severely, so these compounds cannot be determined in mixtures by traditional spectrophotometric methods. In this paper, partial least-squares (PLS) regression is applied to the simultaneous determination of these compounds in mixtures by UV spectrophotometry without any pretreatment of the samples. Ten synthetic mixture samples are analyzed by the proposed method. The mean recoveries are 99.4%, 99.6%, 100.2%, 99.3% and 99.1%, and the relative standard deviations (RSD) are 1.87%, 1.98%, 1.94%, 0.960% and 0.672%, respectively.
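The calibration idea in this abstract can be sketched on synthetic data. The block below is only an illustration under stated assumptions: the Gaussian "pure spectra", band positions, noise level, sample counts for calibration, and the helper name `pls1_coef` are all hypothetical and not taken from the paper. It fits one PLS1 model per analyte with the NIPALS algorithm and reports the mean recovery on ten simulated test mixtures.

```python
import numpy as np

def pls1_coef(X, y, n_comp):
    """NIPALS PLS1. Returns (x_mean, y_mean, b) with y_hat = y_mean + (X - x_mean) @ b."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xk.T @ yk                 # weight: direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, b

# Hypothetical data: five overlapping Gaussian bands stand in for the five analytes.
rng = np.random.default_rng(0)
wavelengths = np.linspace(220.0, 380.0, 81)
centers = np.array([260.0, 280.0, 300.0, 320.0, 340.0])
pure = np.exp(-((wavelengths[None, :] - centers[:, None]) / 15.0) ** 2)

def make_mixtures(n):
    """Simulate n mixtures obeying Beer's law plus small measurement noise."""
    C = rng.uniform(0.2, 1.0, size=(n, 5))
    A = C @ pure + rng.normal(scale=5e-4, size=(n, wavelengths.size))
    return C, A

C_cal, A_cal = make_mixtures(25)      # calibration set (size is an assumption)
C_test, A_test = make_mixtures(10)    # ten synthetic test mixtures

recoveries = []
for j in range(5):
    x_mean, y_mean, b = pls1_coef(A_cal, C_cal[:, j], n_comp=5)
    c_hat = y_mean + (A_test - x_mean) @ b
    recoveries.append(100.0 * np.mean(c_hat / C_test[:, j]))
recoveries = np.array(recoveries)
print(np.round(recoveries, 2))        # mean recovery (%) per analyte
```

In practice the number of latent variables would be chosen by cross-validation rather than fixed at five as it is here.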
Abstract: Breast cancer is one of the malignant tumors with a high incidence in women, and its incidence has increased in all parts of the world since the twentieth century. Because its etiology is not yet completely clear, the detection of breast cells is very important. In this paper, we built a regression model for this purpose and, by training the model, obtained a method for predicting whether breast cells are benign or malignant. Using 10 features of breast cells as predictors, the model reached an accuracy of up to 93.67%. The method is effective for predicting and analysing whether breast cells are cancerous, and it can play an important role in the diagnosis and prevention of breast cancer.
Funding: Supported in part by the National Natural Science Foundation of China (62173346, 61988101, 92267205, 62103360, 62303494).
Abstract: The partial least squares (PLS) model is the most typical data-driven method for quality-related industrial tasks such as soft sensing. However, PLS captures only linear relations between the input and output data. It is difficult to obtain the remaining nonlinear information in the residual subspaces, which may degrade prediction performance in complex industrial processes. To fully utilize the data information in the PLS residual subspaces, a deep residual PLS (DRPLS) framework is proposed for quality prediction in this paper. Inspired by deep learning, DRPLS is designed by stacking a number of PLS models successively, in which the input residuals of the previous PLS are used as the layer connection. To enhance representation, a nonlinear function is applied to the input residuals before they are used for stacking the higher-level PLS. For each PLS, the output part is just the output residual from its previous PLS. Finally, the output prediction is obtained by adding the results of each PLS. The effectiveness of the proposed DRPLS is validated on an industrial hydrocracking process.
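The stacking scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the nonlinear function, so `tanh` is an assumption, as are the layer count, the number of components, the synthetic data, and all function names.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """NIPALS PLS1 that keeps what is needed to compute input residuals later."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("no covariance left between inputs and target")
        w = w / norm
        t = Xk @ w
        tt = t @ t
        p = Xk.T @ t / tt
        c = (yk @ t) / tt
        Xk = Xk - np.outer(t, p)
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    R = W @ np.linalg.inv(P.T @ W)    # scores of centered data: T = Xc @ R
    return {"x_mean": x_mean, "y_mean": y_mean, "R": R, "P": P, "q": q}

def pls1_apply(model, X):
    """Return (prediction, input residual) for one PLS layer."""
    Xc = X - model["x_mean"]
    T = Xc @ model["R"]
    y_hat = model["y_mean"] + T @ model["q"]
    x_res = Xc - T @ model["P"].T
    return y_hat, x_res

def drpls_fit(X, y, n_layers=3, n_comp=2, act=np.tanh):
    """Stack PLS layers: each one models the previous output residual."""
    layers, X_in, y_target = [], X, y
    for _ in range(n_layers):
        model = pls1_fit(X_in, y_target, n_comp)
        y_hat, x_res = pls1_apply(model, X_in)
        layers.append(model)
        y_target = y_target - y_hat   # output residual is the next layer's target
        X_in = act(x_res)             # nonlinear map of the input residual is the next input
    return layers

def drpls_predict(layers, X, act=np.tanh):
    """Final prediction is the sum of the predictions of all layers."""
    total, X_in = np.zeros(X.shape[0]), X
    for model in layers:
        y_hat, x_res = pls1_apply(model, X_in)
        total += y_hat
        X_in = act(x_res)
    return total

# Hypothetical data with a nonlinear term that a single linear PLS cannot capture.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=200)

layers = drpls_fit(X, y)
rmse_single = np.sqrt(np.mean((drpls_predict(layers[:1], X) - y) ** 2))
rmse_deep = np.sqrt(np.mean((drpls_predict(layers, X) - y) ** 2))
print(rmse_single, rmse_deep)
```

On the training data each added layer can only keep or lower the residual error, so any real benefit has to be judged on held-out samples.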
Abstract: Chemical oxygen demand (COD) is an important index of the degree of water pollution. In this paper, near-infrared technology is used to obtain 148 wastewater spectra to predict the COD value in wastewater. First, the partial least squares (PLS) regression model was used as the basic model. Monte Carlo cross-validation (MCCV) was used to identify 25 of the 148 samples that did not conform to conventional statistics. Then, interval partial least squares (iPLS) regression modeling was carried out on the remaining 123 samples, with the spectral bands divided into 40 subintervals. The optimal subintervals are 20 and 26, and the optimal correlation coefficient of the test set (RT) is 0.58. Further, five intervals are considered: 17, 19, 20, 22 and 26. When the number of joint subintervals is three, the optimal RT is 0.71. When the number of joint subintervals is four, the optimal RT is 0.79. Finally, a convolutional neural network (CNN) was used for quantitative prediction, and RT was 0.9. The results show that the CNN can automatically screen the features inside the data and that its quantitative prediction is better than that of iPLS and of the synergy interval partial least squares (SiPLS) model with three or four joint subintervals, indicating that a CNN can be used for quantitative analysis of the degree of water pollution.
Funding: Supported by grants from the National Program on the Development of Basic Research (2011CB100100); the Priority Academic Program Development of Jiangsu Higher Education Institutions; the National Natural Science Foundations (31391632, 31200943, 31171187, and 91535103); the National High-tech R&D Program (863 Program) (2014AA10A601-5); the Natural Science Foundations of Jiangsu Province (BK20150010); the Natural Science Foundation of the Jiangsu Higher Education Institutions (14KJA210005); and the Innovative Research Team of Universities in Jiangsu Province (KYLX_1352).
Abstract: Many complex traits are highly correlated rather than independent. By taking the correlation structure of multiple traits into account, joint association analyses can achieve both higher statistical power and more accurate estimation. To develop a statistical approach to joint association analysis that includes allele detection and genetic effect estimation, we combined multivariate partial least squares regression with variable selection strategies and selected the optimal model using the Bayesian Information Criterion (BIC). We then performed extensive simulations under varying heritabilities and sample sizes to compare the performance achieved using our method with those obtained by single-trait multilocus methods. Joint association analysis has measurable advantages over single-trait methods, as it exhibits superior gene detection power, especially for pleiotropic genes. Sample size, heritability, polymorphic information content (PIC), and magnitude of gene effects influence the statistical power, accuracy and precision of effect estimation by the joint association analysis.
Abstract: The identification of liquor brands is very important for food safety. Most fake liquors are made to have the same flavor and alcohol content as a regular brand, so identifying liquor brands with the same flavor and the same alcohol content is essential. However, it is also difficult because the components of such liquor samples are very similar. Near-infrared (NIR) spectroscopy combined with partial least squares discriminant analysis (PLS-DA) was applied to the identification of liquor brands with the same flavor and alcohol content. A total of 160 samples of Luzhou Laojiao liquor and 200 samples of non-Luzhou Laojiao liquor with the same flavor and alcohol content were used for identification. Samples of each type were randomly divided into modeling and validation sets. The modeling samples were further divided into calibration and prediction sets using the Kennard-Stone algorithm to achieve uniformity and representativeness. In the modeling and validation processes based on the PLS-DA method, the recognition rates reached 99.1% and 98.7%, respectively. The results show high prediction performance for the identification of liquor brands and are clearly better than those obtained with the principal component linear discriminant analysis method. NIR spectroscopy combined with the PLS-DA method provides a quick and effective means of discriminant analysis of liquor brands and is also a promising tool for large-scale inspection of liquor food safety.
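The Kennard-Stone split mentioned in this abstract can be sketched as below. This is the commonly described form of the algorithm (start from the two most distant samples, then repeatedly add the sample farthest from those already chosen); the Euclidean metric, the set sizes, the random data and the function name `kennard_stone` are assumptions for illustration, since the abstract gives no such details.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select rows of X chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Seed with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    # For every sample, distance to its nearest already-selected sample.
    min_dist = np.minimum(dist[i], dist[j])
    min_dist[selected] = -1.0         # never pick a sample twice
    while len(selected) < n_select:
        k = int(np.argmax(min_dist))  # farthest from the current selection
        selected.append(k)
        min_dist = np.minimum(min_dist, dist[k])
        min_dist[selected] = -1.0
    return selected

# Hypothetical "modeling" spectra: 30 samples with 4 variables each.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(30, 4))
cal_idx = kennard_stone(spectra, 20)                     # calibration set
pred_idx = [k for k in range(30) if k not in cal_idx]    # prediction set
print(len(cal_idx), len(pred_idx))
```

Because the selection spreads samples across the measured space, the calibration set tends to cover the extremes, which is the "uniformity and representativeness" the abstract refers to.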
Abstract: Near-infrared (NIR) spectroscopy was applied to the reagent-free quantitative analysis of polysaccharide in oral solution samples of a brand product of proprietary Chinese medicine (PCM). A novel method, called absorbance upper optimization partial least squares (AUO-PLS), was proposed and successfully applied to wavelength selection. Based on varied partitioning of the calibration and prediction sample sets, parameter optimization was performed to achieve stability. With the AUO-PLS method, the selected upper bound of appropriate absorbance was 1.53 and the corresponding waveband combination was 400 - 1880 & 2088 - 2346 nm. Using random validation samples excluded from the modeling process, the root-mean-square error and correlation coefficient of prediction for polysaccharide were 27.09 mg·L⁻¹ and 0.888, respectively. The results indicate that the NIR predicted values are close to the measured values. NIR spectroscopy combined with the AUO-PLS method provides a promising tool for quantifying polysaccharide in PCM oral solution, and the technique is rapid and simple compared with conventional methods.
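The two validation figures quoted in this abstract are conventionally defined as below. The abstract itself does not give the formulas, so these are the standard definitions and are assumed rather than confirmed; here $y_i$ is the measured value, $\hat{y}_i$ the NIR-predicted value, and $n$ the number of validation samples (some authors divide by $n-1$ instead).

```latex
\mathrm{RMSEP} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2},
\qquad
R_P = \frac{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)\left(y_i - \bar{y}\right)}
           {\sqrt{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2 \,\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}}
```

RMSEP carries the units of the analyte (mg·L⁻¹ here), while $R_P$ is dimensionless.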
Funding: Supported by the "863" Program of P. R. China (2002AA2Z4291).
Abstract: Scientific forecasting of mine water yield is of great significance to safe mine production and the comprehensive utilization of water resources. This paper establishes a forecasting model for mine water yield that combines a neural network with the partial least squares method. Processing the independent variables with partial least squares both resolves the correlation among the independent variables and reduces the input dimension of the neural network model; the neural network is then used because it handles nonlinear problems better. An example shows that the model achieves high precision in both forecasting and fitting.
Abstract: Simultaneous determination of heavy metal cations and their accurate quantitative prediction are of great interest in analytical chemistry. This work focuses on a comprehensive comparison of partial least squares (PLS-1) and artificial neural networks (ANN) as two types of chemometric methods. For this purpose, aluminum, iron and copper were studied as three analytes whose UV-Vis absorption spectra overlap strongly. The relevant parameters (ligand concentration, pH, waiting time, the relationship between absorbance and metal ion concentration, and the effect of foreign ions) were examined to establish the optimum conditions. After the optimum conditions had been established for mixtures containing Fe³⁺, Al³⁺ and Cu²⁺, spectrophotometric determinations were carried out and the data were calibrated using partial least squares (PLS-1) regression and artificial neural network (ANN) methods. The chemometric methods are fast and simple to apply, and the results are applicable.
Funding: Project (030501801) supported by the Key Laboratory of the State Bureau of Surveying and Mapping in Geographical Space Information Engineering.
Abstract: Based on survey data of the strata-moving angle, an ordinary least squares regression model is first constructed for the strata-moving parameter β as a function of coal bed obliquity, coal thickness, mining depth, etc. This regression is unsuccessful: none of the fitted parameters is suitable, which is inconsistent with objective reality. This paper therefore presents a novel method, partial least squares regression (PLS regression), to construct the statistical model of the strata-moving parameter β. The experiment shows that the resulting forecasting model is reasonable.
Funding: Project supported by the Fundamental Research Funds for the Central Universities, China (Grant No. 2019XD-A02); the National Natural Science Foundation of China (Grant Nos. U1636106, 61671087, 61170272, and 92046001); the Natural Science Foundation of Beijing Municipality, China (Grant No. 4182006); the Technological Special Project of Guizhou Province, China (Grant No. 20183001); and the Foundation of Guizhou Provincial Key Laboratory of Public Big Data (Grant Nos. 2018BDKFJJ016 and 2018BDKFJJ018).
Abstract: Partial least squares (PLS) regression is an important linear regression method that efficiently addresses the multiple correlation problem by combining principal component analysis and multiple regression. In this paper, we present a quantum partial least squares (QPLS) regression algorithm. To address the high time complexity of PLS regression, we design a quantum eigenvector search method to speed up the construction of principal components and regression parameters. We also give a density matrix product method to avoid multiple accesses to quantum random access memory (QRAM) while building residual matrices. The time and space complexities of QPLS regression are logarithmic in the independent variable dimension n, the dependent variable dimension w, and the number of variables m. This algorithm achieves exponential speed-ups over PLS regression in n, m, and w. In addition, QPLS regression inspires us to explore more potential quantum machine learning applications in future work.
Abstract: Near-infrared spectroscopy coupled with kernel partial least squares-discriminant analysis was used to rapidly screen water containing malathion. In the wavenumber range of 4348 cm⁻¹ to 9091 cm⁻¹, the overall correct classification rate of kernel partial least squares-discriminant analysis was 100% for the training set and 100% for the test set, and the lowest detected concentration of malathion residues in water was 1 μg·mL⁻¹. Kernel partial least squares-discriminant analysis performed well in classifying data from nonlinear systems. It was inferred that near-infrared spectroscopy coupled with kernel partial least squares-discriminant analysis has potential for the rapid screening of other pesticide residues in water.