[Objective] To address early warning of rice pest outbreaks and dynamic monitoring of rice planthoppers in the field, a detection model for rice planthopper populations was established based on PCR with spectrum detection technology. [Method] Canopy reflectance data were collected in a paddy field using a FieldSpec 3 spectrometer, and rice planthopper populations per hundred hills were counted at the same time. The sample size was 71, with 51 samples in the calibration set and 20 samples in the prediction set. The modeling band was 350-1139 nm, and the original spectra were pretreated by first-order differentiation. [Result] The correlation coefficient between measured and predicted values was 0.78, and the RMSEP was 161. [Conclusion] Spectrum detection can be used for the investigation and forecasting of rice planthoppers.
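The workflow in this abstract (first-order differencing of spectra, a 51/20 calibration/prediction split, and evaluation by correlation and RMSEP) can be sketched roughly as follows. This is a minimal illustration that assumes "PCR" means principal component regression. The data are synthetic stand-ins rather than the study's measurements, and the helper names `first_difference`, `fit_pcr`, and `rmsep`, the number of bands, and the number of components are all hypothetical choices.

```python
import numpy as np

def first_difference(spectra):
    """Approximate the first-order derivative along the wavelength axis."""
    return np.diff(spectra, axis=1)

def fit_pcr(X, y, n_components):
    """Fit principal component regression and return a predict function."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    # Ordinary least squares of the centered response on the retained scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma  # coefficients expressed in the original band space

    def predict(X_new):
        return (X_new - x_mean) @ beta + y_mean

    return predict

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in data (assumption): 71 samples, 200 bands, 3 latent factors.
rng = np.random.default_rng(0)
n_samples, n_bands = 71, 200
latent = rng.normal(size=(n_samples, 3))
loadings = rng.normal(size=(3, n_bands))
spectra = latent @ loadings + 0.05 * rng.normal(size=(n_samples, n_bands))
counts = 100 + 40 * latent[:, 0] - 25 * latent[:, 1] + rng.normal(scale=5, size=n_samples)

X = first_difference(spectra)
X_cal, y_cal = X[:51], counts[:51]   # calibration set
X_val, y_val = X[51:], counts[51:]   # prediction set

predict = fit_pcr(X_cal, y_cal, n_components=3)
y_hat = predict(X_val)
r = float(np.corrcoef(y_val, y_hat)[0, 1])
error = rmsep(y_val, y_hat)
print(f"r = {r:.2f}, RMSEP = {error:.1f}")
```

On real spectra the number of retained components would normally be chosen by cross-validation on the calibration set rather than fixed in advance.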
Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, hyperspectral leaf reflectance of rice (Oryza sativa L.) was measured on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda de Haan) over the wavelength range from 350 to 2500 nm. The percentage of leaf surface covered by lesions was estimated and defined as the disease severity. Multiple stepwise regression, principal component analysis, and partial least-squares regression were used to estimate the disease severity of rice brown spot at the leaf level. Our results revealed that multiple stepwise linear regression could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component could explain approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least-squares regression with seven extracted factors predicted disease severity most effectively among the statistical methods compared, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. Our research demonstrates that it is feasible to estimate the disease severity of rice brown spot using hyperspectral reflectance data at the leaf level.
With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie complex human diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR, and LR as applied to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR corrected by the Bonferroni method was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
A combined model based on principal components analysis (PCA) and a generalized regression neural network (GRNN) was adopted to forecast electricity prices in the day-ahead electricity market. PCA was applied to extract the main influences on the day-ahead price, avoiding the strong correlation between input factors that might influence the electricity price, such as the load of the forecasting hour, other historical loads and prices, weather, and temperature; GRNN was then employed to forecast the electricity price from the main information extracted by PCA. To demonstrate the efficiency of the combined model, a case from the PJM (Pennsylvania-New Jersey-Maryland) day-ahead electricity market was evaluated. Compared with a back-propagation (BP) neural network and a standard GRNN, the combined method reduces the mean absolute percentage error by about 3%.
To analyze the factors affecting the leakage rate of a water distribution system, we built a macroscopic "leakage rate–leakage factors" (LRLF) model. In this model, we consider the pipe attributes (quality, diameter, age), maintenance cost, valve replacement cost, and annual average pressure. Based on variable selection and principal component analysis results, we extracted three main principal components: the pipe attribute principal component (PAPC), the operation management principal component, and the water pressure principal component. Of these, we found PAPC to have the most influence. Using principal component regression, we established an LRLF model with no detectable serial correlations. The adjusted R² and RMSE values of the model were 0.717 and 2.067, respectively. This model represents a potentially useful tool for controlling leakage rate from the macroscopic viewpoint.
To investigate the degree of eutrophication of Yuqiao Reservoir, a hybrid method combining principal component regression (PCR) and an artificial neural network (ANN) was adopted to predict the chlorophyll-a concentration of Yuqiao Reservoir's outflow. The data were obtained from two sampling sites: site 1 in the reservoir and site 2 near the dam. Seven water variables, namely chlorophyll-a concentration of site 2 at time t and that of both sites 10 days before t, total phosphorus (TP), total nitrogen (TN),...
A statistical analysis was conducted on the feeding behavior of 106 York breeding pigs. Pearson correlation analysis, principal component correlation analysis, and multiple stepwise regression were applied to establish regression equations relating the pigs' total feed intake per time and average feed intake per time to corrected fat thickness, feed conversion rate, and corrected daily gain. The results showed that: ① there were three peak feed intake periods for the pigs, and the correlation coefficient between feed intake and corrected fat thickness over the 24 h period was sometimes positive and sometimes negative; that is, increasing the number of feeding times and the feed intake was not necessarily conducive to fat thickness accumulation, but the breeding goal for fat thickness could be achieved by controlling the feeding times and feed intake; ② the average feed intake of pigs in the 60-90 kg body weight stage was 30%-50% higher than in the 30-60 kg body weight stage, but the number of feeding times decreased, the peak feeding time was more concentrated, and the feeding duration per time was 3.0 min longer, indicating that feed intake increased significantly as the weight of the pigs increased; and ③ the stepwise regression equations and the principal component equations showed that the feeding behavior of York pigs in the 30-90 kg growth stage was affected not only by the feeding time within 24 h, but also by environmental factors such as temperature and humidity. The feeding behavior of York pigs is a complex process of interaction between environmental factors and animal factors.
Statistical downscaling (SD) analyzes the relationship between a local-scale response and global-scale predictors. An SD model can be used to forecast rainfall (local scale) using global-scale precipitation from global circulation model (GCM) output. The objectives of this research were to determine the time lag of the GCM data and to build an SD model using the PCR method with time-lagged GCM precipitation data. Rainfall observations in Indramayu from 1979 to 2007 showed patterns similar to the GCM data on the 1st to 64th grids after a time shift (time lag). The time lag was determined using the cross-correlation function. However, the GCM data of the 64 grids showed a multicollinearity problem. This problem was solved by principal component regression (PCR), but the PCR model produced heterogeneous errors. The PCR model was modified to overcome these errors by adding dummy variables, which were determined based on partial least squares regression (PLSR). The PCR model with dummy variables improved the rainfall prediction. The SD model with lagged GCM predictors was also better than the SD model without lagged GCM predictors.
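The lag-selection step mentioned in this abstract, choosing a time shift from the cross-correlation function, might look roughly like the sketch below. It is only an illustration on synthetic series. The function names `pearson` and `best_lag`, the search range, and the rule of picking the lag with the largest positive correlation are assumptions, not details taken from the study.

```python
import math
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def best_lag(predictor, response, max_lag):
    """Return (lag, r) maximizing corr(predictor[t - lag], response[t])."""
    best = (0, pearson(predictor, response))
    for lag in range(1, max_lag + 1):
        # Pair each response value with the predictor value `lag` steps earlier.
        r = pearson(predictor[:-lag], response[lag:])
        if r > best[1]:
            best = (lag, r)
    return best

# Synthetic monthly series (assumption): the response follows the
# predictor with a delay of two time steps.
predictor = [math.sin(2 * math.pi * t / 12) + 0.1 * math.cos(t) for t in range(120)]
response = [0.0, 0.0] + [2.0 * p + 1.0 for p in predictor[:-2]]

lag, r = best_lag(predictor, response, max_lag=6)
print(lag, round(r, 3))
```

With many grid cells, the same search would be repeated per grid, and the shifted series would then feed the regression step.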
Financial time series forecasting could be beneficial for individual as well as institutional investors. However, the high noise and complexity of financial data make this job extremely challenging. Over the years, many researchers have used support vector regression (SVR) quite successfully to address this challenge. In this paper, an SVR-based forecasting model is proposed that first uses principal component analysis (PCA) to extract low-dimensional and efficient feature information, and then uses independent component analysis (ICA) to preprocess the extracted features and nullify the influence of noise in them. Experiments were carried out on 16 years of historical data for three prominent stocks from three different sectors listed on the Dhaka Stock Exchange (DSE), Bangladesh. Predictions were made 1 to 4 days in advance, targeting short-term prediction. For comparison, the integration of PCA with SVR (PCA-SVR), ICA with SVR (ICA-SVR), and a single SVR approach were used to evaluate the prediction accuracy of the proposed approach. Experimental results show that the proposed model (PCA-ICA-SVR) outperforms the PCA-SVR, ICA-SVR, and single SVR methods.
For a system of two seemingly unrelated regressions, this paper proposes a new type of estimator called the pre-test principal components estimator (PTPCE) and discusses some of its properties.
Possible changes in the structure and seasonal variability of the subtropical ridge may lead to changes in the rainfall variability modes over the Caribbean region. This creates additional difficulties for water resource planning; obtaining seasonal prediction models that characterize these variations in detail is therefore a concern, especially for island states. This research proposes the construction of statistical-dynamic models based on PCA regression methods. The predictand is the accumulated monthly precipitation, while the six predictors are extracted from the ECMWF-SEAS5 ensemble mean forecasts with a lag of one month with respect to the target month. In constructing the models, two sequential training schemes are evaluated; only the shorter one preserves the seasonal characteristics of the predictand. The evaluation metrics, which combine cell-point and dichotomous methodologies, suggest that the predictors related to sea surface temperatures do not adequately represent the seasonal variability of the predictand, whereas others, such as the temperature at 850 hPa and the outgoing longwave radiation, are represented with a good approximation regardless of the model chosen. In this sense, the models built with the nearest neighbor methodology were the most efficient. Using the individual models with the best results, an ensemble is built that improves the individual skill of the member models by correcting the underestimation of precipitation in the dynamic model during the wet season, although problems of overestimation persist for thresholds lower than 50 mm.
When regression analysis is used to model dam deformation, ill-conditioning of the coefficient matrix, caused mainly by multicollinearity of the variables, often prevents accurate modeling. Independent component regression (ICR) is proposed to model dam deformation and identify the physical origins of the deformation. A simulation experiment shows that ICR can successfully resolve the ill-conditioning problem and produce a reliable deformation model. The method is then applied to model the deformation of the Wuqiangxi Dam in Hunan Province, China. The results show that ICR can not only accurately model the deformation of the dam, but also help identify the physical factors that affect the deformation through the extracted independent components.
Mineral processing plants generally have narrow tolerances for the grades of their input raw materials, so stockpiles are often maintained to reduce material variance and ensure consistency. However, designing stockpiles has often proven difficult when the input material consists of multiple sub-materials whose grades have different levels of variance. In this paper, we address this issue by applying principal component analysis (PCA) to reduce the dimensions of the input data. The study was conducted in three steps. First, we applied PCA to the input data to transform them into a lower-dimensional space while retaining 80% of the original variance. Next, we simulated a stockpile operation with various geometric stockpile configurations using a stockpile simulator in MATLAB. We used the variance reduction ratio as the primary criterion for evaluating the efficiency of the stockpiles. Finally, we used multiple regression to identify the relationships between stockpile efficiency and various design parameters, and analyzed the regression results based on the original input variables and the principal components. The results showed that PCA is indeed useful in solving a stockpile design problem that involves multiple correlated input-material grades.
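The first step described above, reducing the input grades with PCA while retaining 80% of the original variance, can be illustrated with a short sketch. The grade matrix below is synthetic, and the helper name `n_components_for_variance` and the choice of six correlated sub-material grades are assumptions made only for this example.

```python
import numpy as np

def n_components_for_variance(X, target=0.80):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `target`."""
    Xc = X - X.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    s = np.linalg.svd(Xc, compute_uv=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, target) + 1)
    return k, cumulative

# Synthetic stand-in (assumption): 500 observations of 6 correlated
# sub-material grades driven by 2 underlying factors plus noise.
rng = np.random.default_rng(1)
factors = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
grades = factors @ mixing + 0.2 * rng.normal(size=(500, 6))

k, cumulative = n_components_for_variance(grades, target=0.80)
print(k, np.round(cumulative, 3))
```

Whether to standardize the grades before PCA depends on their units. The sketch only centers the data, which is a design choice and not something stated in the abstract.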
Using time series data on the factors influencing rural consumer demand in Hebei Province from 2000 to 2010, this paper applies the principal component analysis method from multivariate econometric statistical analysis, constructs the principal components of consumer demand in Hebei Province, regresses the dependent variable, consumer spending per capita in Hebei Province, on those principal components to obtain a principal component regression, and then conducts quantitative and qualitative analysis of the principal components. The results show that total output value per capita (yuan), employment rate, and income gap are positively correlated with rural residents' consumer demand in Hebei Province; the consumer price index, the upbringing ratio of children, and the one-year interest rate are negatively correlated with rural residents' consumer demand in Hebei Province; the ratio of supporting the elderly and medical care spending per capita are positively correlated with rural residents' consumer demand in Hebei Province. The following countermeasures and suggestions are put forward to promote residents' consumer demand in Hebei Province: develop the county economy in Hebei Province and increase rural residents' consumer demand; use industry to support agriculture and coordinate urban-rural development; and improve the rural medical care and health system and resolve the actual difficulties of the masses.
Spatio-temporal assessment of the above ground biomass (AGB) is a cumbersome task due to the difficulties associated with measuring tree parameters such as girth at breast height and tree height. The present research was conducted on the campus of Birla Institute of Technology, Mesra, Ranchi, India, which is predominantly covered by Sal (Shorea robusta C. F. Gaertn). Two methods of regression analysis were employed to determine the potential of remote sensing parameters for estimating the AGB measured in the field: linear regression analysis between the AGB and, respectively, the individual bands, the principal components (PCs) of the bands, the vegetation indices (VIs), and the PCs of the VIs; and multiple linear regression (MLR) analysis between the AGB and all the variables in each category of data. From the linear regression analysis, it was found that only the NDVI exhibited a regression coefficient value above 0.80, with the remaining parameters showing very low values. On the other hand, the MLR-based analysis revealed significantly improved results, as evidenced by very high correlation coefficient values of greater than 0.90 between the AGB computed from the MLR equations and the field-estimated AGB, thereby confirming its superiority in providing reliable estimates of AGB. The highest correlation coefficient, 0.99, was found with the MLR involving the PCs of the VIs.
Objective: To introduce a method to calculate cardiovascular age, a new, accurate, and much simpler index for assessing cardiovascular autonomic regulatory function, based on statistical analysis of heart rate and blood pressure variability (HRV and BPV) and baroreflex sensitivity (BRS) data. Methods: Firstly, HRV and BPV of 89 healthy aviation personnel were analyzed by conventional autoregressive (AR) spectral analysis, and their spontaneous BRS was obtained by the sequence method. Secondly, principal component analysis was conducted over the original and derived indices of HRV, BPV, and BRS data, and the relevant principal components, PCi orig and PCi deri (i = 1, 2, 3, ...), were obtained. Finally, the equation for calculating cardiovascular age was obtained by multiple regression, with chronological age as the dependent variable and the principal components significantly related to age as the regressors. Results: The first four principal components of the original indices accounted for over 90% of the total variance of the indices, as did the first three principal components of the derived indices. These seven principal components could therefore reflect the information on cardiovascular autonomic regulation embodied in the 17 indices of HRV, BPV, and BRS with a minimal loss of information. Of the seven principal components, PC2 orig, PC4 orig, and PC2 deri were negatively correlated with chronological age (P < 0.05), whereas PC3 orig was positively correlated with chronological age (P < 0.01). The cardiovascular age calculated from the regression equation was significantly correlated with chronological age among the 89 aviation personnel (r = 0.73, P < 0.01).
Conclusion: The cardiovascular age calculated from a multivariate analysis of HRV, BPV, and BRS could be regarded as a comprehensive indicator reflecting the age dependency of autonomic regulation of the cardiovascular system in healthy aviation personnel.
There are a variety of classification techniques, such as neural networks, decision trees, support vector machines, and logistic regression. The problem of dimensionality is pertinent to many learning algorithms: it denotes the drastic rise in computational complexity, so dimensionality reduction methods are needed. These methods include principal component analysis (PCA) and locality preserving projection (LPP). In many real-world classification problems, the local structure is more important than the global structure, yet many dimensionality reduction techniques ignore the local structure and preserve only the global structure. The objectives are to compare PCA and LPP in terms of accuracy, to develop appropriate representations of complex data by reducing the dimensions of the data, and to explain the importance of using LPP with logistic regression. The results of this paper show that the proposed LPP approach provides a better representation and higher accuracy than the PCA approach.
As market competition among steel mills is severe, deoxidization alloying is an important link in the metallurgical process. To solve this problem, principal component regression analysis is adopted to reduce the dimension of the influencing factors, and a reasonable and reliable prediction model of element yield is established. Based on constraint conditions such as the target cost function constraint, the yield constraint, and the non-negativity constraint, linear programming is adopted to design the lowest-cost batching scheme that meets the national standards and production requirements. The research results provide a reliable optimization model for the deoxidization and alloying process of steel mills, which is of positive significance for improving the market competitiveness of steel mills, reducing waste discharge, and protecting the environment.
Funding (rice planthopper detection study): Supported by the Open Fund Project of the Key Laboratory of Modern Agricultural Equipment and Technology, Ministry of Education and Key Laboratory of Jiangsu Province (NZ200803)
Funding (rice brown spot study): the Hi-Tech Research and Development Program (863) of China (No. 2006AA10Z203) and the National Science and Technology Task Force Project (No. 2006BAD10A01), China
Funding (GWAS methods study): Funded by the National Natural Science Foundation of China (81202283, 81473070, 81373102, and 81202267), the Key Grant of the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (10KJA330034 and 11KJA330001), the Research Fund for the Doctoral Program of Higher Education of China (20113234110002), and the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine)
Funding (electricity price forecasting study): Project (70671039) supported by the National Natural Science Foundation of China
Funding (water distribution leakage study): Supported by the Ministry of Science and Technology of China (No. 2014ZX07203-009), the Fundamental Research Funds for the Central Universities, and the Program for New Century Excellent Talents at the University of China
文摘To analyze the factors affecting the leakage rate of water distribution system, we built a macroscopic "leakage rate–leakage factors"(LRLF) model. In this model, we consider the pipe attributes(quality, diameter,age), maintenance cost, valve replacement cost, and annual average pressure. Based on variable selection and principal component analysis results, we extracted three main principle components—the pipe attribute principal component(PAPC), operation management principal component, and water pressure principal component. Of these, we found PAPC to have the most influence. Using principal component regression, we established an LRLF model with no detectable serial correlations. The adjusted R2 and RMSE values of the model were 0.717 and 2.067, respectively.This model represents a potentially useful tool for controlling leakage rate from the macroscopic viewpoint.
文摘In order to investigate the eutrophication degree of Yuqiao Reservoir, a hybrid method, combining principal component regression (PCR) and artificial neural network (ANN), was adopted to predict chlorophyll-a concentration of Yuqiao Reservoir’s outflow. The data were obtained from two sampling sites, site 1 in the reservoir, and site 2 near the dam. Seven water variables, namely chlorophyll-a concentration of site 2 at time t and that of both sites 10 days before t, total phosphorus(TP), total nitrogen(TN),...
文摘A statistical analysis was conducted on the feeding behavior of 106 York breeding pigs.Pearson correlation analysis,principal component correlation analysis and multiple stepwise regression equation methods were applied to establish regression equations of the York breeding pigs total feed intake per time and average feed intake per time with corrected fat thickness,feed conversion rate,and corrected daily gain.The results showed that:①there were three peak feed intake periods for the pigs,and the correlation coefficient between the feed intake and the corrected fat thickness of the pigs in the 24 h period was positive or negative,that is,increasing the number of feeding times and the feed intake was not necessarily conducive to the fat thickness accumulation,but the breeding goal of fat thickness could be achieved by controlling the feeding times and feed intake;②the average feed intake of pigs in the 60-90 kg body weight stage was 30%-50%higher than that of the 30-60 kg body weight stage,but the number of feeding times decreased,the peak feeding time was more concentrated,and the feeding duration per time was 3.0 min longer,indicating that as the weight of pigs increased,the feed intake increased significantly;and③the stepwise regression equations and the principal component equations showed that the feeding behavior of York pigs in the 30-90 kg growth stage was not only affected by the feeding time within 24 h,but also by environmental factors such as temperature and humidity.The feeding behavior of York pigs is a complex process of interaction between environmental factors and animal factors.
Abstract: Statistical downscaling (SD) analyzes the relationship between a local-scale response and global-scale predictors. An SD model can be used to forecast local-scale rainfall from the global-scale precipitation output of a global circulation model (GCM). The objectives of this research were to determine the time lag of the GCM data and to build an SD model using the PCR method with time-lagged GCM precipitation data. Rainfall observations in Indramayu from 1979 to 2007 showed patterns similar to the GCM data on the 1st to 64th grids after a time shift (time lag). The time lag was determined using the cross-correlation function. However, the GCM data of the 64 grids exhibited multicollinearity. This problem was solved by principal component regression (PCR), but the PCR model produced heterogeneous errors. The PCR model was therefore modified by adding dummy variables, which were determined based on partial least squares regression (PLSR). The PCR model with dummy variables improved the rainfall prediction. The SD model with lagged GCM predictors also performed better than the SD model without them.
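The time-lag step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `best_lag`, the synthetic series, and the choice of maximizing the absolute Pearson correlation over candidate lags are all assumptions made for the example.

```python
import numpy as np


def best_lag(predictor, response, max_lag):
    """Pick the lag k (0..max_lag) maximizing |corr(predictor[t], response[t + k])|."""
    x = np.asarray(predictor, dtype=float)
    y = np.asarray(response, dtype=float)
    correlations = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        # Pair predictor[t] with the response observed k steps later.
        correlations[k] = np.corrcoef(x[: len(x) - k], y[k:])[0, 1]
    return int(np.argmax(np.abs(correlations))), correlations


# Synthetic stand-ins (hypothetical data): "rain" follows the "gcm" series 3 steps later.
rng = np.random.default_rng(0)
gcm = rng.normal(size=300)
rain = np.zeros(300)
rain[3:] = 2.0 * gcm[:-3]
rain += rng.normal(scale=0.1, size=300)

lag, correlations = best_lag(gcm, rain, max_lag=6)
print(lag)
```

In a real downscaling setting the same search would be repeated per grid cell, and the lagged predictors would then be passed on to the regression step.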
Abstract: Financial time series forecasting can benefit individual as well as institutional investors. However, the high noise and complexity of financial data make this task extremely challenging. Over the years, many researchers have used support vector regression (SVR) quite successfully to meet this challenge. In this paper, an SVR-based forecasting model is proposed which first uses principal component analysis (PCA) to extract low-dimensional and efficient feature information, and then uses independent component analysis (ICA) to preprocess the extracted features to nullify the influence of noise in the features. Experiments were carried out on 16 years of historical data for three prominent stocks from three different sectors listed on the Dhaka Stock Exchange (DSE), Bangladesh. Predictions were made 1 to 4 days in advance, targeting short-term prediction. For comparison, the integration of PCA with SVR (PCA-SVR), ICA with SVR (ICA-SVR) and the single SVR approach were applied to evaluate the prediction accuracy of the proposed approach. Experimental results show that the proposed model (PCA-ICA-SVR) outperforms the PCA-SVR, ICA-SVR and single SVR methods.
Abstract: For a system of two seemingly unrelated regressions, this paper proposes a new type of estimator called the pre-test principal components estimator (PTPCE) and discusses some of its properties.
Abstract: Possible changes in the structure and seasonal variability of the subtropical ridge may lead to changes in the variability modes of rainfall over the Caribbean region. This creates additional difficulties for water resource planning; obtaining seasonal prediction models that allow these variations to be characterized in detail is therefore a concern, especially for island states. This research proposes the construction of statistical-dynamic models based on PCA regression methods. The predictand is the accumulated monthly precipitation, while the predictors (six in total) are extracted from the ECMWF-SEAS5 ensemble mean forecasts with a lag of one month with respect to the target month. In the construction of the models, two sequential training schemes are evaluated, and only the shorter one is found to preserve the seasonal characteristics of the predictand. The evaluation metrics used, which combine cell-point and dichotomous methodologies, suggest that the predictors related to sea surface temperatures do not adequately represent the seasonal variability of the predictand, whereas others, such as the temperature at 850 hPa and the Outgoing Longwave Radiation, are represented with a good approximation regardless of the model chosen. In this sense, the models built with the nearest neighbor methodology were the most efficient. Using the individual models with the best results, an ensemble is built that improves on the individual skill of its member models by correcting the underestimation of precipitation in the dynamic model during the wet season, although problems of overestimation persist for thresholds lower than 50 mm.
Funding: Project (41074004) supported by the National Natural Science Foundation of China; Project (2013CB733303) supported by the National Basic Research Program of China
Abstract: When regression analysis is applied to model dam deformation, ill-conditioning of the coefficient matrix, caused mainly by multicollinearity among the variables, often prevents accurate modeling. Independent component regression (ICR) is proposed to model dam deformation and identify the physical origins of the deformation. A simulation experiment shows that ICR can successfully resolve the ill-conditioning problem and produce a reliable deformation model. The method is then applied to model the deformation of the Wuqiangxi Dam in Hunan Province, China. The results show that ICR can not only model the deformation of the dam accurately, but also help to identify the physical factors that affect the deformation through the extracted independent components.
Abstract: Mineral processing plants generally have narrow tolerances for the grades of their input raw materials, so stockpiles are often maintained to reduce material variance and ensure consistency. However, designing stockpiles has often proven difficult when the input material consists of multiple sub-materials that have different levels of variance in their grades. In this paper, we address this issue by applying principal component analysis (PCA) to reduce the dimensions of the input data. The study was conducted in three steps. First, we applied PCA to the input data to transform them into a lower-dimensional space while retaining 80% of the original variance. Next, we simulated a stockpile operation with various geometric stockpile configurations using a stockpile simulator in MATLAB. We used the variance reduction ratio as the primary criterion for evaluating the efficiency of the stockpiles. Finally, we used multiple regression to identify the relationships between stockpile efficiency and various design parameters and analyzed the regression results based on the original input variables and principal components. The results showed that PCA is indeed useful in solving a stockpile design problem that involves multiple correlated input-material grades.
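The first step of this abstract, reducing the input data with PCA while retaining 80% of the variance, might look like the sketch below. The function name `pca_reduce`, the synthetic grade data, and the rule "keep the fewest components whose cumulative explained variance reaches the target" are assumptions for illustration; the paper's own procedure is not given in the abstract.

```python
import numpy as np


def pca_reduce(X, variance_target=0.80):
    """Project X onto the fewest principal components reaching the variance target."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    _, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(explained)
    # First index whose cumulative share reaches the target, converted to a count.
    k = min(int(np.searchsorted(cumulative, variance_target)) + 1, len(explained))
    return centered @ Vt[:k].T, explained, k


# Hypothetical grades of four correlated sub-materials driven by two latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.8, 0.6, 0.1], [0.0, 0.3, 0.5, 1.0]])
grades = latent @ mixing + 0.1 * rng.normal(size=(200, 4))

scores, explained, k = pca_reduce(grades)
print(k, explained[:k].sum())
```

The returned `scores` would then stand in for the original grades in any downstream simulation or regression.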
Funding: Supported by the Hebei Province Regional Economic Development Countermeasures Research Program (Fs201010)
Abstract: Using time series data on the factors influencing rural consumer demand in Hebei Province from 2000 to 2010, this paper applies principal component analysis from multivariate econometric statistical analysis to construct the principal components of consumer demand in Hebei Province, regresses the dependent variable, consumer spending per capita in Hebei Province, on these principal components to obtain a principal component regression, and then conducts quantitative and qualitative analysis of the principal components. The results show that total output value per capita (yuan), employment rate, and income gap are positively correlated with rural residents' consumer demand in Hebei Province; consumer price index, upbringing ratio of children, and one-year interest rate are negatively correlated with rural residents' consumer demand in Hebei Province; and the ratio of supporting the elderly and medical care spending per capita are positively correlated with rural residents' consumer demand in Hebei Province. The following countermeasures and suggestions are put forward to promote residents' consumer demand in Hebei Province: develop the county economy in Hebei Province and increase rural residents' consumer demand; use industry to support agriculture and coordinate urban-rural development; and improve the rural medical care and health system and resolve the actual difficulties of the masses.
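Principal component regression as described in this abstract can be sketched in a few lines. The helper names `pcr_fit` and `pcr_predict`, the standardization of predictors, and the synthetic collinear data are assumptions for illustration only, not the paper's data or implementation.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Principal component regression: regress y on the leading PC scores of standardized X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, x_std, y_mean = X.mean(axis=0), X.std(axis=0, ddof=1), y.mean()
    Z = (X - x_mean) / x_std
    # Rows of Vt are the principal directions of the standardized predictors.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    gamma, *_ = np.linalg.lstsq(Z @ V, y - y_mean, rcond=None)
    # Map the component coefficients back to the standardized predictors.
    return {"x_mean": x_mean, "x_std": x_std, "y_mean": y_mean, "beta": V @ gamma}


def pcr_predict(model, X):
    """Predict with a fitted PCR model using the stored centering and scaling."""
    Z = (np.asarray(X, dtype=float) - model["x_mean"]) / model["x_std"]
    return model["y_mean"] + Z @ model["beta"]


# Hypothetical data: five nearly collinear indicators sharing one underlying trend.
rng = np.random.default_rng(2)
trend = rng.normal(size=200)
X = trend[:, None] + 0.05 * rng.normal(size=(200, 5))
y = 3.0 * trend + 0.1 * rng.normal(size=200)

model = pcr_fit(X, y, n_components=1)
residual = y - pcr_predict(model, X)
r_squared = 1.0 - residual.var() / y.var()
print(round(r_squared, 3))
```

Because the coefficients are mapped back to the standardized predictors, their signs can be read as positive or negative associations with the response, which is how a principal component regression supports statements about individual indicators.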
Abstract: Spatio-temporal assessment of the above ground biomass (AGB) is a cumbersome task due to the difficulties associated with the measurement of different tree parameters such as girth at breast height and height of trees. The present research was conducted on the campus of Birla Institute of Technology, Mesra, Ranchi, India, which is predominantly covered by Sal (Shorea robusta C. F. Gaertn). Two methods of regression analysis were employed to assess the potential of remote sensing parameters for estimating the AGB measured in the field: linear regression analysis between the AGB and, respectively, the individual bands, the principal components (PCs) of the bands, the vegetation indices (VIs), and the PCs of the VIs; and multiple linear regression (MLR) analysis between the AGB and all the variables in each category of data. From the linear regression analysis, it was found that only the NDVI exhibited a regression coefficient value above 0.80, with the remaining parameters showing very low values. On the other hand, the MLR-based analysis revealed significantly improved results, as evidenced by very high correlation coefficient values of greater than 0.90 between the AGB computed from the MLR equations and the field-estimated AGB, thereby confirming the superiority of MLR in providing reliable estimates of AGB. The highest correlation coefficient, 0.99, was found with the MLR involving the PCs of the VIs.
Abstract: Objective: To introduce a method for calculating cardiovascular age, a new, accurate and much simpler index for assessing cardiovascular autonomic regulatory function, based on statistical analysis of heart rate and blood pressure variability (HRV and BPV) and baroreflex sensitivity (BRS) data. Methods: Firstly, the HRV and BPV of 89 healthy aviation personnel were analyzed by conventional autoregressive (AR) spectral analysis, and their spontaneous BRS was obtained by the sequence method. Secondly, principal component analysis was conducted on the original and derived indices of the HRV, BPV and BRS data, and the relevant principal components, PCi orig and PCi deri (i = 1, 2, 3, ...), were obtained. Finally, the equation for calculating cardiovascular age was obtained by multiple regression, with chronological age as the dependent variable and the principal components significantly related to age as the regressors. Results: The first four principal components of the original indices accounted for over 90% of the total variance of the indices, as did the first three principal components of the derived indices. These seven principal components could therefore reflect the information on cardiovascular autonomic regulation embodied in the 17 indices of HRV, BPV and BRS with minimal loss of information. Of the seven principal components, PC2 orig, PC4 orig and PC2 deri were negatively correlated with chronological age (P < 0.05), whereas PC3 orig was positively correlated with chronological age (P < 0.01). The cardiovascular age calculated from the regression equation was significantly correlated with chronological age among the 89 aviation personnel (r = 0.73, P < 0.01). Conclusion: The cardiovascular age calculated from a multivariate analysis of HRV, BPV and BRS can be regarded as a comprehensive indicator reflecting the age dependency of autonomic regulation of the cardiovascular system in healthy aviation personnel.
Abstract: There are a variety of classification techniques, such as neural networks, decision trees, support vector machines and logistic regression. The problem of dimensionality is pertinent to many learning algorithms and denotes a drastic rise in computational complexity; dimensionality reduction methods are therefore needed. These methods include principal component analysis (PCA) and locality preserving projection (LPP). In many real-world classification problems, the local structure is more important than the global structure, yet many dimensionality reduction techniques ignore the local structure and preserve the global structure. The objectives are to compare PCA and LPP in terms of accuracy, to develop appropriate representations of complex data by reducing the dimensions of the data, and to explain the importance of using LPP with logistic regression. The results show that the proposed LPP approach provides a better representation and higher accuracy than the PCA approach.
Abstract: As market competition among steel mills is severe, deoxidization alloying is an important link in the metallurgical process. To solve this problem, principal component regression analysis is adopted to reduce the dimension of the influencing factors, and a reasonable and reliable prediction model of element yield is established. Based on constraints such as the target cost function constraint, the yield constraint and the non-negativity constraint, linear programming is adopted to design the lowest-cost batching scheme that meets national standards and production requirements. The research results provide a reliable optimization model for the deoxidization and alloying process of steel mills, which is of positive significance for improving the market competitiveness of steel mills, reducing waste discharge and protecting the environment.