With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie complex human diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR as applied to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR corrected by the Bonferroni method was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in the simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
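The conservativeness of Bonferroni-corrected single-locus testing can be illustrated with a minimal sketch. The function names and the p-values below are hypothetical and are not taken from the study; the sketch only shows how the per-test threshold shrinks with the number of SNPs in a set.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def snp_set_significant(p_values, alpha=0.05):
    """Call a SNP set significant if any single-locus p-value
    passes the Bonferroni-adjusted threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return min(p_values) <= threshold


# Hypothetical single-locus p-values for a 10-SNP set (illustrative only).
example_p_values = [0.20, 0.03, 0.008, 0.51, 0.12, 0.74, 0.09, 0.33, 0.046, 0.61]
set_is_significant = snp_set_significant(example_p_values)
```

With ten SNPs the per-test threshold is 0.005, so the smallest p-value here (0.008) does not pass even though it would at the unadjusted 0.05 level. Set-based methods such as PC-LR test a few combined components instead of every SNP separately, which is one plausible reason for the power difference the abstract reports.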
Spatio-temporal assessment of above ground biomass (AGB) is a cumbersome task because of the difficulties associated with measuring tree parameters such as girth at breast height and tree height. The present research was conducted on the campus of Birla Institute of Technology, Mesra, Ranchi, India, which is predominantly covered by Sal (Shorea robusta C. F. Gaertn). Two methods of regression analysis were employed to relate remote sensing parameters to the AGB measured in the field: linear regression analysis between the AGB and, respectively, the individual bands, the principal components (PCs) of the bands, the vegetation indices (VIs), and the PCs of the VIs; and multiple linear regression (MLR) analysis between the AGB and all the variables in each category of data. From the linear regression analysis, it was found that only the NDVI exhibited a regression coefficient value above 0.80, with the remaining parameters showing very low values. On the other hand, the MLR-based analysis revealed significantly improved results, as evidenced by very high correlation coefficient values of greater than 0.90 between the AGB computed from the MLR equations and the field-estimated AGB, thereby ascertaining the superiority of the MLR equations in providing reliable estimates of AGB. The highest correlation coefficient of 0.99 was found with the MLR involving the PCs of the VIs.
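As a minimal sketch of the single-predictor case, the snippet below computes NDVI from red and near-infrared reflectance using the standard definition (NIR - Red) / (NIR + Red) and fits an ordinary least squares line of AGB on one predictor. The function names are hypothetical, and the abstract does not state how the authors implemented either step.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

In use, one would compute `ndvi` for each field plot and pass the plot-level NDVI values and field-measured AGB to `fit_line`. The multiple-regression variant described in the abstract would replace the single predictor with several bands, indices or principal components.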
The key to studying urban sustainable development lies in quantifying the stores and efficiencies of urban metabolism and capturing its mechanisms. This paper builds a metabolic emergy account and quantifies some important concepts of emergy stores. Emphasis is placed on an urban metabolic model based on the slack-based model (SBM) method to measure urban metabolic efficiencies. Urban metabolic mechanisms are examined using a regression method. By integrating these models, this paper analyzes urban metabolic development in Beijing from 2001 to 2010. We conclude that the metabolic emergy stores of Beijing increased significantly from 2001 to 2010, with imported emergy accounting for most of the increase. The metabolic efficiencies in Beijing have improved since the 2008 Olympic Games. Population, economic growth, industrial structure, and environmental governance positively affect the overall urban metabolism, while land expansion, urbanization and the level of environmental technology hinder the improvement of urban metabolic efficiencies. The SBM metabolic method and the regression model based on emergy analysis provide insights into urban metabolic efficiencies and mechanisms. They can help integrate such concepts into sustainability analyses and policy decisions.
Glass is precious material evidence of trade along the early Silk Road. Ancient glass was easily affected by the environment and weathering, and the resulting change in composition ratios hampers correct judgment of its category. In this paper, mathematical models and methods such as the Chi-square test, weighted average method, principal component analysis, cluster analysis, a binary classification model and grey correlation analysis were used together to analyze the data of sample glass products in combination with their categories. The results showed that the weathered high-potassium glass could be divided into 12, 9, 10 and 27, 7, 22 and so on.
In this paper we aim to analyse the temporal variation of CD4 cell counts for HIV-infected individuals under antiretroviral therapy using statistical methods. This is achieved by means of a recursive binary regression tree approach [1]-[2]. This approach has made it possible to highlight the existence of several segments of the population of interest, described by the interactions between the predictive covariates of the response to the treatment regimen.
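The core step of a recursive binary regression tree is choosing, for one covariate, the split point that minimizes the within-node sum of squared errors. The following is a minimal sketch of that single step only, assuming a numeric covariate and a numeric response. The function names are hypothetical, and the cited approach [1]-[2] may use a different splitting criterion.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Find the threshold on x that minimizes SSE(left) + SSE(right).

    Returns (threshold, total_sse), or None if x has no two distinct values.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical covariate values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yv for _, yv in pairs[:i]]
        right = [yv for _, yv in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best
```

A full tree would apply `best_split` to every covariate, keep the best one, and recurse on the two resulting subsets until a stopping rule is met. The terminal nodes then correspond to population segments of the kind the abstract describes.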
The objective of this paper is to present a review of different calibration and classification methods for functional data in the context of chemometric applications. In chemometrics, it is usual to measure certain parameters in terms of a set of spectrometric curves that are observed at a finite set of points (functional data). Although the predictor variable is clearly functional, this problem is usually solved using multivariate calibration techniques that treat it as a finite set of variables associated with the observed points (wavelengths or times). However, these explanatory variables are highly correlated, and it is therefore more informative to first reconstruct the true functional form of the predictor curves. Although several articles have been published on the implementation of functional data analysis techniques in chemometrics, their power to solve real problems is not yet well known. For this reason, this paper reviews the extension of multivariate calibration techniques (linear regression, principal component regression and partial least squares) and classification methods (linear discriminant analysis and logistic regression) to the functional domain, along with some relevant chemometric applications.
Traumatic brain injury (TBI) at a young age can lead to the development of long-term functional impairments. Severity of injury is well demonstrated to have a strong influence on the extent of functional impairments; however, identification of specific magnetic resonance imaging (MRI) biomarkers that are most reflective of injury severity and functional prognosis remains elusive. Therefore, the objective of this study was to utilize advanced statistical approaches to identify clinically relevant MRI biomarkers and predict functional outcomes using MRI metrics in a translational large animal piglet TBI model. TBI was induced via controlled cortical impact, and multiparametric MRI was performed at 24 hours and 12 weeks post-TBI using T1-weighted, T2-weighted, T2-weighted fluid attenuated inversion recovery, diffusion-weighted imaging, and diffusion tensor imaging. Changes in spatiotemporal gait parameters were also assessed using an automated gait mat at 24 hours and 12 weeks post-TBI. Principal component analysis was performed to determine the MRI metrics and spatiotemporal gait parameters that explain the largest sources of variation within the datasets. We found that linear combinations of lesion size and midline shift acquired using T2-weighted imaging explained most of the variability of the data at both 24 hours and 12 weeks post-TBI. In addition, linear combinations of velocity, cadence, and stride length were found to explain most of the gait data variability at 24 hours and 12 weeks post-TBI. Linear regression analysis was performed to determine whether MRI metrics are predictive of changes in gait. We found that both lesion size and midline shift are significantly correlated with decreases in stride and step length. These results provide an important first step toward identifying relevant MRI and functional biomarkers that are predictive of functional outcomes in a clinically relevant piglet TBI model. This study was approved by the University of Georgia Institutional Animal Care and Use Committee (AUP: A2015 11-001) on December 22, 2015.
We consider a functional partially linear additive model that predicts a functional response from a scalar predictor and functional predictors. A B-spline and eigenbasis least squares estimator is proposed for both the parametric and the nonparametric components. Finally, we obtain the variance decomposition of the model and establish the asymptotic convergence rate of the estimator.
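To address the insufficient accuracy and limited applicability of traditional multivariate linear regression (MLR) models in predicting the calorific value of coal, a prediction model for coal calorific value based on an improved adaptive boosting (AdaBoost) algorithm is proposed. Random forest (RF) is used as the base learner of AdaBoost to improve the model's prediction accuracy and generalization ability for calorific value in proximate coal analysis. Based on 10,000 sets of proximate analysis data for as-fired coal from a power plant, moisture, volatile matter, ash and fixed carbon were selected as model inputs to build a prediction model for the net (lower) calorific value of coal. Compared with the traditional multivariate linear regression equation and other nonlinear models, the model shows higher prediction accuracy and better generalization ability. Experimental results from large-sample testing show that the model has a mean absolute percentage error of 0.5417%, a root mean square error of 0.1304 MJ/kg, and a goodness of fit (R^2) of 0.9799, outperforming the other models in coal calorific value prediction. In addition, simulation validation on 200 sets of real blended-coal proximate analysis data further confirmed the model's superior generalization performance.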
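The three evaluation measures reported for the coal model (mean absolute percentage error, root mean square error and R^2) can be computed as in the minimal sketch below, using their standard definitions. The function names are hypothetical, and the abstract does not say which software the authors used.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs (e.g. MJ/kg)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Each function takes the measured calorific values and the model's predictions for a held-out test set as two equal-length lists.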
Funding (GWAS study): funded by the National Natural Science Foundation of China (81202283, 81473070, 81373102 and 81202267); Key Grant of Natural Science Foundation of the Jiangsu Higher Education Institutions of China (10KJA330034 and 11KJA330001); the Research Fund for the Doctoral Program of Higher Education of China (20113234110002); the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine).
Funding (urban metabolism study): under the auspices of the National Natural Science Foundation of China (No. 41371008, 41101119) and the New Start Academic Research Projects of Beijing Union University (No. ZK201201).
Funding (piglet TBI study): financial support was provided by the University of Georgia Office of the Vice President for Research to FDW.