Funding: Supported by the National Natural Science Foundation of China (No. 21475068).
Abstract: Understanding the thermal stability of the proteins in human serum is essential because human serum is an important source of pharmaceutical proteins. Near-infrared (NIR) spectroscopy was applied to investigate thermal changes in the secondary structure and hydration of human serum proteins. However, because serum is a multicomponent system, the overlap of the broad NIR bands makes structural analysis directly from the spectra of serum samples very difficult. Therefore, continuous wavelet transform (CWT) was used to improve the resolution of the NIR spectra, and the Monte Carlo-uninformative variable elimination (MC-UVE) method was applied to select the variables associated with the proteins for the structural analysis. The variables (5956, 5867, 5815, 5747, 4525, 4401, 4359 and 4328 cm^-1) related to protein secondary structures and those (7074, 6951, 6827 and 6700 cm^-1) connected with water species were selected. The thermal stability was then analyzed through the intensity variations of the selected variables with temperature from 30 ℃ to 80 ℃. The spectral variables related to both α-helix and β-sheet were found to change markedly around 60 ℃, indicating the onset of thermal denaturation and the transition from α-helix to β-sheet. Moreover, an obvious change was found around 60 ℃ in the content of the water species S3, i.e., the water cluster containing three hydrogen bonds. The results demonstrate that MC-UVE can identify the protein-related NIR spectral variables, and that the water species may serve as a marker for investigating structural changes of proteins in biochemical systems.
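The MC-UVE step named in the abstract above can be illustrated with a minimal sketch. It assumes the commonly described form of the method: PLS models are fitted on many random subsets of the calibration samples, and each variable is scored by the mean of its regression coefficient divided by its standard deviation across runs. The function names, the small NIPALS-style PLS1 routine, the parameter defaults and the synthetic data are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Regression coefficients of a PLS1 model (NIPALS-style, mean-centered data)."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))
    P = np.zeros((n_features, n_components))
    q = np.zeros(n_components)
    for k in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight vector
        t = Xk @ w                      # scores
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        q[k] = yk @ t / tt              # y loading
        Xk = Xk - np.outer(t, p)        # deflate X
        yk = yk - q[k] * t              # deflate y
        W[:, k], P[:, k] = w, p
    return W @ np.linalg.solve(P.T @ W, q)

def mc_uve_stability(X, y, n_components=3, n_runs=200, sample_fraction=0.8, seed=0):
    """Stability of each variable: mean / std of its PLS coefficient over random subsets."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    n_sub = int(sample_fraction * n_samples)
    coefs = np.empty((n_runs, X.shape[1]))
    for i in range(n_runs):
        idx = rng.choice(n_samples, size=n_sub, replace=False)
        coefs[i] = pls1_coefficients(X[idx], y[idx], n_components)
    return coefs.mean(axis=0) / coefs.std(axis=0)

# Synthetic (hypothetical) data: only variables 3 and 7 carry information about y.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(150, 12))
y_demo = 2.0 * X_demo[:, 3] - 1.5 * X_demo[:, 7] + 0.1 * rng.normal(size=150)

stability = mc_uve_stability(X_demo, y_demo)
# Keep the two variables with the largest absolute stability.
selected = sorted(np.argsort(np.abs(stability))[-2:].tolist())
```

In practice the cutoff on the absolute stability is a tuning choice (for example a fixed number of variables, or a threshold chosen by cross-validation); the top-two rule here is only for the toy data.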
Funding: Support provided by the National Natural Science Foundation of China (60844007, 61178036, 21265006); the National Science and Technology Support Plan (2008BAD96B04); the Special Science and Technology Support Program for Foreign Science and Technology Cooperation Plan (2009BHB15200); and the Technological Expertise and Academic Leaders Training Plan of Jiangxi Province (2009DD00700).
Abstract: Variable selection is widely applied in visible-near infrared (Vis-NIR) spectroscopy analysis of the internal quality of fruits. Different spectral variable selection methods were compared for online quantitative analysis of soluble solids content (SSC) in navel oranges. Moving window partial least squares (MW-PLS), Monte Carlo uninformative variable elimination (MC-UVE), and wavelet transform (WT) combined with the MC-UVE method were used to select the spectral variables and develop calibration models for online analysis of SSC in navel oranges. The performances of these methods were compared for modeling the Vis-NIR data sets of navel orange samples. The results show that the WT-MC-UVE method gave the better calibration model, with a higher correlation coefficient (r) of 0.89 and a lower root mean square error of prediction (RMSEP) of 0.54 at 5 fruits per second. It is concluded that Vis-NIR spectroscopy coupled with WT-MC-UVE may be a fast and effective tool for online quantitative analysis of SSC in navel oranges.
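The two figures of merit quoted in the abstract above, the correlation coefficient (r) and the root mean square error of prediction (RMSEP), can be computed as in the following sketch. It uses the standard definitions (Pearson correlation between reference and predicted values, and the square root of the mean squared prediction error). The SSC values are hypothetical numbers chosen for illustration, not data from the study.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(ss_t * ss_p)

# Hypothetical reference and predicted SSC values for four fruits.
ssc_reference = [10.0, 12.0, 11.0, 13.0]
ssc_predicted = [10.5, 11.5, 11.5, 12.5]

error = rmsep(ssc_reference, ssc_predicted)    # every error is +/-0.5, so RMSEP is 0.5
r = pearson_r(ssc_reference, ssc_predicted)    # equals 3 / sqrt(10), about 0.949
```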
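Abstract: Variable selection strategies combined with locally linear embedding (LLE) theory were used to optimize quantitative models in near infrared spectroscopy (NIRS). Monte Carlo uninformative variable elimination (MCUVE), the successive projections algorithm (SPA), and a variable selection strategy combining the two were used to eliminate redundant NIRS variables; partial least squares regression (PLSR) and LLE-PLSR were used to build quantitative spectral models for complex samples. The results show that MCUVE both extracts informative variables effectively and improves the prediction accuracy of the model; that LLE-PLSR yields a more accurate quantitative model than PLSR; and that MCUVE combined with LLE-PLSR is an effective method for quantitative spectral analysis.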
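Abstract: Fish freshness (different freezing times and numbers of freeze-thaw cycles) was identified from the spectral information of hyperspectral imaging. First, spectra were extracted from the regions of interest (ROI) of the fish samples. Characteristic wavelengths were then extracted with Monte Carlo uninformative variable elimination (MCVE), the successive projections algorithm (SPA), and the random frog (RF) algorithm, which yielded 90, 31 and 49 characteristic variables, respectively. A least squares support vector machine (LS-SVM) was used as the classification model, with the 90, 31 and 49 characteristic variables as its input variables. The SPA-LS-SVM and MCVE-LS-SVM models both reached a recognition rate of 98% on the prediction set, whereas the RF-LS-SVM model gave poorer predictions, with a recognition rate of only 88% on the prediction set. The results show that SPA-LS-SVM outperforms the other models as a classification model, and that the characteristic wavelengths selected by SPA both simplify the model and improve its prediction accuracy. Hyperspectral imaging can therefore be used to identify fish freshness (different freezing times and numbers of freeze-thaw cycles).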
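The successive projections algorithm (SPA) named in the abstract above can be sketched as follows. The sketch assumes the usual description of the projection phase only: starting from one wavelength, each step projects all columns onto the subspace orthogonal to the last selected column and picks the column with the largest remaining norm, which favors wavelengths with low collinearity. The function name, the fixed starting column and the toy matrix are assumptions for illustration; the validation phase that normally chooses the starting column and the number of variables is omitted.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Projection phase of SPA: pick n_select minimally collinear columns of X."""
    Xp = X - X.mean(axis=0)                 # work on mean-centered columns
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Remove the component along the last selected column from every column.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0              # never pick a column twice
        selected.append(int(np.argmax(norms)))
    return selected

# Toy (hypothetical) spectra: column 1 is an exact multiple of column 0,
# column 2 is independent, so SPA starting at 0 should skip column 1.
rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = rng.normal(size=50)
X_demo = np.column_stack([a, 2.0 * a, b])

chosen = spa_select(X_demo, n_select=2, start=0)
```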
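Abstract: The partial least squares (PLS) regression model for the alcohol content of wine was optimized. Spectral data of wine samples were collected with a near-infrared spectrometer and used to build a quantitative model of alcohol content for rapid online detection. Variables were selected by Monte Carlo uninformative variable elimination (MC-UVE) and a genetic algorithm (GA), and regression models were built on the selected variables using PLS and factor analysis (FA), respectively. The results show that the MC-UVE-GA-FAR model had a prediction-set correlation coefficient (R2) of 0.946 and a root mean square error of prediction (RMSEP) of 0.215, outperforming the MC-UVE-GA-PLS model. Compared with the PLS regression model built on the full spectral range, model performance improved, and only 6 variables were selected, which greatly simplified the model. Combining the MC-UVE and GA algorithms with FA can therefore optimize the model.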
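Abstract: Objective: To select, from near-infrared spectral data of feed, the key wavelength variables that affect the main quality indicators of compound pig feed, in order to build nondestructive, rapid quantitative calibration models for feed quality and thereby improve the efficiency of nondestructive rapid feed quality testing. Methods: Near-infrared spectra of feed samples were collected and reference values for moisture, crude protein, crude fat and crude fiber were obtained. After outliers were removed, the samples were divided into a calibration set and an external validation set by sample set partitioning based on joint X-Y distance (SPXY). Using the calibration set, the Monte Carlo-uninformative variable elimination-successive projections algorithm was applied to select 25, 20, 15, 10 and 5 key variables for each of the 4 quality indicators; calibration models were built for each case and used to predict the external validation set. Results: For feed moisture, crude protein, crude fat and crude fiber, the numbers of key variables selected were 15, 25, 15 and 15; the model dimensions were 9, 11, 10 and 9; the coefficients of determination were 0.8288, 0.8605, 0.9338 and 0.8327; the root mean square errors of calibration were 0.17, 0.81, 0.31 and 0.22; the root mean square errors of cross-validation were 0.19, 0.93, 0.34 and 0.23; and the relative prediction performance values were 2.79, 2.38, 4.01 and 2.89, respectively. Conclusion: Variable selection combined with external validation shows that, with model accuracy maintained, the number of key variables selected is markedly smaller than the number of full-spectrum variables, which may serve as a reference for improving the efficiency of nondestructive, rapid, multi-indicator quantitative testing of feed quality.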
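The SPXY partitioning step mentioned in the abstract above can be sketched as follows. The sketch assumes the usual description of SPXY as a Kennard-Stone style selection on a joint distance, where the pairwise distances in spectral space (X) and in reference-value space (y) are each divided by their maximum and then summed. The function name, the set sizes and the synthetic data are assumptions for illustration and do not reproduce the feed data set.

```python
import numpy as np

def spxy_split(X, y, n_cal):
    """Split sample indices into calibration and validation sets by joint X-Y distance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    # Pairwise Euclidean distances in X space and in y space.
    dx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    dy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=2)
    d = dx / dx.max() + dy / dy.max()       # normalized joint distance
    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    # Repeatedly add the sample whose nearest selected neighbor is farthest away.
    while len(selected) < n_cal:
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining

# Synthetic (hypothetical) data: 30 samples, 5 spectral variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
y_demo = X_demo[:, 0] + 0.1 * rng.normal(size=30)

cal_idx, val_idx = spxy_split(X_demo, y_demo, n_cal=20)
```

The remaining indices form the external validation set, so every sample ends up in exactly one of the two sets.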