Funding: Hi-Tech Research and Development Program (863) of China (No. 2006AA10Z203); National Science and Technology Task Force Project (No. 2006BAD10A01), China
Abstract: Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, hyperspectral leaf reflectance of rice (Oryza sativa L.) was measured over the wavelength range from 350 to 2500 nm on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda de Haan). The percentage of the leaf surface covered by lesions was estimated and defined as the disease severity. Multiple stepwise regression, principal component analysis and partial least-squares regression were used to estimate the disease severity of rice brown spot at the leaf level. Our results revealed that multiple stepwise linear regression could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component could explain approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least-squares regression with seven extracted factors predicted disease severity most effectively among the methods compared, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. Our research demonstrates that it is feasible to estimate the disease severity of rice brown spot using hyperspectral reflectance data at the leaf level.
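The abstract above compares models by their RMSE on training and testing data. As a minimal sketch of that metric only (not the authors' code), the function below computes the RMSE between observed and predicted severities; the function name `rmse` and the sample values are hypothetical and not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared residuals, then the square root.
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical disease severities in percent (illustrative values only).
observed_severity = [10.0, 20.0, 30.0]
predicted_severity = [12.0, 18.0, 33.0]
error = rmse(observed_severity, predicted_severity)
```

Because severity is expressed in percent of leaf area, the RMSE is in the same unit, which is why the abstract reports it as a percentage.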
Funding: Funded by the National Natural Science Foundation of China (81202283, 81473070, 81373102 and 81202267); the Key Grant of the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (10KJA330034 and 11KJA330001); the Research Fund for the Doctoral Program of Higher Education of China (20113234110002); and the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine)
Abstract: With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie complex human diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR as applied to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR corrected by the Bonferroni method was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and that both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with the genotyped SNPs and had a small effect size in the simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
Funding: Funded by the Outstanding Talent of "Qishan Scholar" of Fuzhou University of China (GXRC21049) and the Open Project Program of the Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University (BTBU) (FQS-201802, FQS-202008).
Abstract: This study aimed to investigate microbial succession and metabolic dynamics during the traditional fermentation of Hongqu aged vinegar, and to explore the core functional microbes closely related to the formation of flavor components. Microbiome analysis demonstrated that Lactobacillus, Acetobacter, Bacillus, Enterobacter, Lactococcus, Leuconostoc and Weissella were the predominant bacterial genera, while Aspergillus piperis, Aspergillus oryzae, Monascus purpureus, Candida athensensis, C. xylopsoci, Penicillium ochrosalmoneum and Simplicillium aogashimaense were the predominant fungal species. Correlation analysis revealed that Acetobacter was positively correlated with the production of tetramethylpyrazine, acetoin and acetic acid; Lactococcus was positively correlated with the production of 2-nonanone, 2-heptanone, ethyl caprylate, ethyl caprate, 1-hexanol, 1-octanol and 1-octen-3-ol; and C. xylopsoci and C. rugosa were positively associated with the production of diethyl malonate, 2,3-butanediyl diacetate, acetoin, benzaldehyde and tetramethylpyrazine. Non-volatile metabolites were also detected through ultra-performance liquid chromatography-quadrupole time-of-flight mass spectrometry. A variety of amino acids and functional dipeptides were identified during the traditional brewing of Hongqu aged vinegar. Correlation analysis revealed that Lactobacillus was significantly associated with DL-lactate, indolelactic acid, D-(+)-3-phenyllactic acid, pimelic acid, pregabalin and 3-aminobutanoic acid. This study is useful for understanding the flavor formation mechanism and for developing effective strategies for selecting suitable strains to improve the flavor quality of Hongqu aged vinegar.
Abstract: The identification of liquor brands is very important for food safety. Most fake liquors are made to have the same flavor and alcohol content as the regular brand, so identifying liquor brands with the same flavor and the same alcohol content is essential. However, it is also difficult because the components of such liquor samples are very similar. Near-infrared (NIR) spectroscopy combined with partial least squares discriminant analysis (PLS-DA) was applied to the identification of liquor brands with the same flavor and alcohol content. A total of 160 samples of Luzhou Laojiao liquor and 200 samples of non-Luzhou Laojiao liquor with the same flavor and alcohol content were used for identification. Samples of each type were randomly divided into modeling and validation sets. The modeling samples were further divided into calibration and prediction sets using the Kennard-Stone algorithm to achieve uniformity and representativeness. In the modeling and validation processes based on the PLS-DA method, the recognition rates reached 99.1% and 98.7%, respectively. The results show high prediction performance for the identification of liquor brands and were clearly better than those obtained with the principal component linear discriminant analysis method. NIR spectroscopy combined with the PLS-DA method provides a quick and effective means of discriminant analysis of liquor brands and is also a promising tool for large-scale inspection of liquor food safety.
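The Kennard-Stone split mentioned above can be sketched as follows. This is an illustrative implementation of the commonly described selection rule (start with the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbour is farthest away), assuming Euclidean distance on the raw features; the function name `kennard_stone` and the toy data are hypothetical, and the abstract does not state which distance or preprocessing the authors used.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")
    # Pairwise Euclidean distances (assumption: raw feature space).
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Step 1: start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}
    # Step 2: add the sample whose closest selected sample is farthest away.
    while len(selected) < n_select:
        nxt = max(
            sorted(remaining),
            key=lambda r: min(dist[r][s] for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Toy one-dimensional "spectra" (illustrative values only).
toy_samples = [(0.0,), (1.0,), (2.0,), (10.0,)]
calibration_idx = kennard_stone(toy_samples, 3)
```

The selected indices would form the calibration set and the leftover samples the prediction set, which spreads the calibration samples evenly across the feature space.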
Abstract: Statistical downscaling (SD) analyzes the relationship between a local-scale response and global-scale predictors. An SD model can be used to forecast local-scale rainfall from global-scale precipitation output by a global circulation model (GCM). The objectives of this research were to determine the time lag of the GCM data and to build an SD model using principal component regression (PCR) with time-lagged GCM precipitation data. Rainfall observations in Indramayu from 1979 to 2007 showed patterns similar to the GCM data on the 1st to 64th grids after a time shift (time lag). The time lag was determined using the cross-correlation function. However, the GCM data of the 64 grids showed a multicollinearity problem. This problem was solved by PCR, but the PCR model resulted in heterogeneous errors. The PCR model was therefore modified by adding dummy variables, which were determined based on partial least squares regression (PLSR). The PCR model with dummy variables improved the rainfall prediction. The SD model with lagged GCM predictors was also better than the SD model without lagged GCM predictors.
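The lag determination described above can be illustrated with a small sketch. It is a simplification, not the authors' procedure: instead of the full sample cross-correlation function, it computes the Pearson correlation on the overlapping part of the two series at each non-negative lag and picks the lag with the largest absolute correlation. The function names and the toy series are hypothetical.

```python
import math
from statistics import mean


def lagged_correlation(predictor, response, lag):
    """Pearson correlation between predictor[t] and response[t + lag]."""
    if len(predictor) != len(response):
        raise ValueError("series must have equal length")
    if lag < 0 or len(predictor) - lag < 2:
        raise ValueError("lag must leave at least two overlapping points")
    xs = predictor[: len(predictor) - lag]
    ys = response[lag:]
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    # Note: a constant segment gives zero variance and raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)


def best_lag(predictor, response, max_lag):
    """Lag in 0..max_lag with the largest absolute lagged correlation."""
    return max(
        range(max_lag + 1),
        key=lambda k: abs(lagged_correlation(predictor, response, k)),
    )


# Toy series: the response repeats the predictor two steps later.
gcm_series = [1, 3, 2, 5, 4, 7, 6, 9, 8, 11]
rain_series = [0, 0] + gcm_series[:-2]
lag = best_lag(gcm_series, rain_series, 4)
```

In a setting like the one in the abstract, such a search would be run once per GCM grid, and each grid's series would be shifted by its own lag before the principal components are computed.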
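Abstract: The infrared spectral characteristics of high-temperature combustion gas are an effective means of determining gas composition and concentration. To address the complex infrared radiation characteristics of high-temperature gas and the difficulty of modeling them, a feature extraction algorithm based on interval partial least squares (iPLS) and kernel principal component analysis (KPCA) was studied. First, iPLS is used for pre-screening to determine the characteristic spectral bands with the best predictive ability, avoiding the loss of useful absorption-peak information that occurs when modeling on a single subinterval. Second, KPCA is used to reduce the data dimensionality, retaining the key features with high contribution rates and reducing the complexity of the composition prediction model. Simulation results show that after feature extraction with the iPLS-KPCA method, the complexity of the prediction model drops substantially and its predictive ability improves significantly.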
Funding: Sponsored by the National Natural Science Foundation of China (51064019) and the Natural Science Foundation of Inner Mongolia of China (20010MS0911, NJzy08075)
Abstract: In the blast furnace (BF) iron-making process, the hot metal silicon content is usually used to measure the quality of the hot metal and to reflect the thermal state of the BF. Principal component analysis (PCA) and partial least-squares (PLS) regression methods were used to predict the hot metal silicon content. Under relatively stable BF conditions, PCA and PLS regression models of hot metal silicon content were established using data from Baotou Steel No. 6 BF, achieving accuracies of 88.4% and 89.2%, respectively. The PLS model used fewer variables and less time than the PCA model, and it was simple to calculate. It is shown that the model gives good results and is helpful for practical production.