Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not be fully assigned to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
Particle swarm optimization (PSO) is an optimization algorithm based on the principle of swarm intelligence. In this paper, a modified PSO is applied to kernel principal component analysis (KPCA) to find an optimal kernel function parameter. We first comprehensively consider the within-class scatter and between-class scatter of the sample features. Then, a fitness function for the kernel function parameter is constructed, and the particle swarm optimization algorithm with adaptive acceleration (CPSO) is applied to optimize it. The method is used for gearbox condition recognition, and the result is compared with the recognition results based on principal component analysis (PCA). The results show that KPCA optimized by CPSO can effectively recognize fault conditions of the gearbox while reducing blind set-up of the kernel function parameter, and its fault recognition results outperform those of PCA. We conclude that KPCA based on CPSO has an advantage in nonlinear feature extraction for mechanical failure and is helpful for fault condition recognition in complicated machines.
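The abstract above does not state its fitness function, so the following is only a sketch of one common scatter-based criterion: the ratio of between-class to within-class scatter (as traces) of the extracted features. The name `scatter_ratio` and the toy data are hypothetical. In a setting like the one described, the features would come from KPCA for a candidate kernel parameter, and an optimizer such as PSO would search for the parameter that maximizes this score.

```python
from collections import defaultdict


def scatter_ratio(features, labels):
    """Return trace(S_b) / trace(S_w) for labelled feature vectors.

    Assumed criterion for illustration, not the paper's exact fitness.
    Raises ZeroDivisionError if every class has zero within-class scatter.
    """
    dim = len(features[0])
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[y].append(x)

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    overall = mean(features)
    between = 0.0
    within = 0.0
    for rows in groups.values():
        mu = mean(rows)
        # Between-class scatter: class size times squared distance to the overall mean.
        between += len(rows) * sum((mu[j] - overall[j]) ** 2 for j in range(dim))
        # Within-class scatter: squared distances of samples to their class mean.
        within += sum((r[j] - mu[j]) ** 2 for r in rows for j in range(dim))
    return between / within


# Hypothetical toy features: two classes separated along the first axis.
toy_features = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
score = scatter_ratio(toy_features, toy_labels)
```

A larger score means the classes are better separated relative to their spread, which is why such a ratio can serve as a fitness value to maximize.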
Many advanced mathematical models of biochemical, biophysical and other processes in systems biology can be described by parametrized systems of nonlinear differential equations. Because of the complexity of these models, their simplification has become a problem of great importance. In particular, the rather challenging methods for estimating parameters in these models may require such simplifications. The paper offers a practical way of constructing approximations of nonlinearly parametrized functions by linearly parametrized ones. As the idea of such approximations goes back to Principal Component Analysis, we call the corresponding transformation the Principal Component Transform. We show that this transform possesses the best individual fit property, in the sense that the corresponding approximations preserve most information (in some sense) about the original function. It is also demonstrated how one can estimate the error between the given function and its approximations. In addition, we apply the theory of tensor products of compact operators in Hilbert spaces to justify our method for the case of products of parametrized functions. Finally, we provide several examples that are relevant to systems biology.
In this study, two functional logistic regression models, one with a functional principal component basis (FPCA) and one with a functional partial least squares basis (FPLS), have been developed to distinguish precancerous adenomatous polyps from hyperplastic polyps for the purpose of classification and interpretation. The classification performances of the two functional models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The results indicated that the FPCA and FPLS models outperformed the PCDA and PLSDA models in classification while using a small number of functional basis components. With a substantial reduction in model complexity and an improvement in classification accuracy, the functional approach is particularly helpful for interpreting the complex spectral features related to precancerous colon polyps.
The objective of this paper is to present a review of different calibration and classification methods for functional data in the context of chemometric applications. In chemometrics, it is usual to measure certain parameters in terms of a set of spectrometric curves that are observed at a finite set of points (functional data). Although the predictor variable is clearly functional, this problem is usually solved by using multivariate calibration techniques that treat it as a finite set of variables associated with the observed points (wavelengths or times). But these explanatory variables are highly correlated, and it is therefore more informative to first reconstruct the true functional form of the predictor curves. Although several articles have been published on the implementation of functional data analysis techniques in chemometrics, their power to solve real problems is not yet well known. For this reason, this paper reviews the extension of multivariate calibration techniques (linear regression, principal component regression and partial least squares) and classification methods (linear discriminant analysis and logistic regression) to the functional domain, together with some relevant chemometric applications.
We use functional principal component analysis (FPCA) to model and predict weight growth in children. In particular, we examine how the approach can help discern growth patterns of underweight children relative to their normal counterparts, and whether a commonly used transformation to normality plays any constructive role in a predictive model based on FPCA. Our work supplements the conditional growth charts developed by Wei and He (2006) by constructing a predictive growth model based on a small number of principal component scores on an individual's past.
Traumatic brain injury (TBI) at a young age can lead to the development of long-term functional impairments. Severity of injury is well demonstrated to have a strong influence on the extent of functional impairments; however, identification of specific magnetic resonance imaging (MRI) biomarkers that are most reflective of injury severity and functional prognosis remains elusive. Therefore, the objective of this study was to utilize advanced statistical approaches to identify clinically relevant MRI biomarkers and predict functional outcomes using MRI metrics in a translational large animal piglet TBI model. TBI was induced via controlled cortical impact, and multiparametric MRI was performed at 24 hours and 12 weeks post-TBI using T1-weighted, T2-weighted, T2-weighted fluid attenuated inversion recovery, diffusion-weighted imaging, and diffusion tensor imaging. Changes in spatiotemporal gait parameters were also assessed using an automated gait mat at 24 hours and 12 weeks post-TBI. Principal component analysis was performed to determine the MRI metrics and spatiotemporal gait parameters that explain the largest sources of variation within the datasets. We found that linear combinations of lesion size and midline shift acquired using T2-weighted imaging explained most of the variability of the data at both 24 hours and 12 weeks post-TBI. In addition, linear combinations of velocity, cadence, and stride length were found to explain most of the gait data variability at 24 hours and 12 weeks post-TBI. Linear regression analysis was performed to determine whether MRI metrics are predictive of changes in gait. We found that both lesion size and midline shift are significantly correlated with decreases in stride and step length. The results of this study provide an important first step toward identifying relevant MRI and functional biomarkers that are predictive of functional outcomes in a clinically relevant piglet TBI model. This study was approved by the University of Georgia Institutional Animal Care and Use Committee (AUP: A2015 11-001) on December 22, 2015.
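To make "explains most of the variability" concrete, here is a minimal sketch of PCA on two variables, using the closed-form eigenvalues of the 2x2 sample covariance matrix. The function name and the numbers are hypothetical and are not data from the study. A real analysis would typically standardize the variables first and use more than two metrics.

```python
import math


def pca_2d_explained_variance(xs, ys):
    """Return the fraction of total variance carried by each of the two
    principal components of the paired samples (xs, ys)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    lam1 = half_trace + radius
    lam2 = max(half_trace - radius, 0.0)  # clamp tiny negative round-off
    total = a + c
    return lam1 / total, lam2 / total


# Hypothetical, perfectly correlated measurements: one component carries
# essentially all of the variance.
first, second = pca_2d_explained_variance([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

When two metrics move together, the first component absorbs nearly all the variance; when they are unrelated and equally spread, the split approaches one half each.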
Lipopeptides are currently re-emerging as an interesting subgroup in the peptide research field, having historical applications as antibacterial and antifungal agents and new potential applications as antiviral, antitumor, immune-modulating and cell-penetrating compounds. However, due to their specific structure, chromatographic analysis often requires special buffer systems or the use of trifluoroacetic acid, limiting mass spectrometry detection. Therefore, we used a traditional aqueous/acetonitrile based gradient system, containing 0.1% (m/v) formic acid, to separate four pharmaceutically relevant lipopeptides (polymyxin B1, caspofungin, daptomycin and gramicidin A1), which were selected based upon hierarchical cluster analysis (HCA) and principal component analysis (PCA). In total, the performance of four different C18 columns, including one UPLC column, was evaluated using two parallel approaches. First, a Derringer desirability function was used, whereby six single and multiple chromatographic response values were rescaled into one overall D-value per column. Using this approach, the YMC Pack Pro C18 column was ranked as the best column for general MS-compatible lipopeptide separation. Second, the kinetic plot approach was used to compare the different columns at different flow rate ranges. As the optimal kinetic column performance is obtained at its maximal pressure, the length elongation factor λ (Pmax/Pexp) was used to transform the obtained experimental data (retention times and peak capacities) and construct kinetic performance limit (KPL) curves, allowing a direct visual and unbiased comparison of the selected columns, whereby the YMC Triart C18 UPLC and ACE C18 columns performed best. Finally, differences in column performance and the (dis)advantages of both approaches are discussed.
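The rescaling of several responses into one D-value can be sketched as follows. This assumes the usual Derringer-Suich form: each response is mapped to an individual desirability in [0, 1], and the overall D is their geometric mean. The bounds, weights, and response values are hypothetical, and only the "larger is better" case is shown. The abstract does not list the six responses or their limits.

```python
import math


def desirability_larger_is_better(y, low, target, weight=1.0):
    """Individual desirability: 0 at or below `low`, 1 at or above `target`,
    and a power-law ramp in between."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def overall_desirability(ds):
    """Geometric mean of individual desirabilities; one zero makes D zero."""
    if any(d == 0.0 for d in ds):
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))


# Hypothetical responses for one column, each with assumed (low, target) limits.
responses = [(5.0, 0.0, 10.0), (8.0, 0.0, 8.0)]
d_values = [desirability_larger_is_better(y, lo, hi) for y, lo, hi in responses]
D = overall_desirability(d_values)
```

The geometric mean is the reason a column that fails completely on one response cannot be rescued by good scores on the others.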
[Objective] This study aimed to investigate the effects of transgenic Bt + CpTI cotton cultivation on the functional diversity of microbial communities in rhizosphere soils. [Method] By using the Biolog method, a comparative study was conducted on the utilization level of single carbon sources by microbes in the rhizosphere soils of transgenic Bt + CpTI cotton sGK321 and its parental conventional cotton 'Shiyuan 321' at different growth stages. [Result] The results showed that, compared with the parental conventional cotton, the average well-color development (AWCD) value of microbial communities in rhizosphere soils of transgenic Bt + CpTI cotton was significantly higher (P < 0.05) at the seedling stage and budding stage, while it was significantly lower at the flower and boll stage and the boll opening stage. The Shannon-Wiener diversity index (H) and Simpson dominance index (D) of microbial communities in rhizosphere soils of transgenic cotton and conventional cotton varied with the growth stage, whereas the Shannon-Wiener evenness index (E) showed no significant difference between transgenic cotton and conventional cotton at the four growth stages. Principal component analysis indicated that the patterns of carbon source utilization by microbial communities in rhizosphere soils were similar among transgenic cotton at the seedling stage and the flower and boll stage and parental conventional cotton at the seedling stage and budding stage; they were also similar between transgenic cotton at the budding stage and parental conventional cotton at the flower and boll stage. [Conclusion] Analysis of different carbon sources indicated that the main carbon sources utilized by soil microbes were carbohydrates, amino acids, carboxylic acids and polymers.
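The indices named in this abstract can be computed from plate readings roughly as below. This is a sketch under assumptions: AWCD is taken as the mean control-corrected absorbance with negative values set to zero, H = -Σ p·ln p, D = 1 - Σ p² (conventions for the Simpson index differ between papers), and E = H / ln S with S the number of utilized substrates. The function name and readings are hypothetical.

```python
import math


def biolog_indices(wells, control):
    """Compute AWCD, Shannon-Wiener H, Simpson D and evenness E from
    absorbance readings of substrate wells and a control well."""
    # Control-corrected absorbances; negative values are treated as no utilization.
    corrected = [max(c - control, 0.0) for c in wells]
    total = sum(corrected)
    if total == 0.0:
        raise ValueError("no substrate utilization above the control")
    awcd = total / len(corrected)
    # Relative utilization of each substrate that was actually used.
    p = [c / total for c in corrected if c > 0.0]
    shannon = -sum(pi * math.log(pi) for pi in p)
    simpson = 1.0 - sum(pi * pi for pi in p)
    evenness = shannon / math.log(len(p)) if len(p) > 1 else 0.0
    return {"AWCD": awcd, "H": shannon, "D": simpson, "E": evenness}


# Hypothetical plate with four equally utilized substrates.
result = biolog_indices([1.0, 1.0, 1.0, 1.0], control=0.5)
```

With perfectly even utilization, E equals 1 and H equals ln S, which gives a quick sanity check on any implementation.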
We consider a functional partially linear additive model that predicts a functional response from a scalar predictor and functional predictors. A B-spline and eigenbasis least squares estimator is proposed for both the parametric and the nonparametric components. Finally, we obtain the variance decomposition of the model and establish the asymptotic convergence rate of the estimator.
In recent years, functional data has been widely used in finance, medicine, biology and other fields. Current clustering methods can solve problems in finite-dimensional space, but they are difficult to apply directly to the clustering of functional data. In this paper, we propose a new unsupervised clustering algorithm based on adaptive weights. Without requiring initialization parameters, we use entropy-type penalty terms and a fuzzy partition matrix to find the optimal number of clusters. At the same time, we introduce a measure based on adaptive weights to reflect the difference in information content between different clustering metrics. Simulation experiments show that the proposed algorithm achieves higher purity than some existing algorithms.
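Purity, the evaluation measure mentioned in this abstract, is commonly defined as the fraction of samples that belong to the majority true class of their assigned cluster. The sketch below follows that standard definition. The function name and toy labels are hypothetical, and it does not reproduce the proposed clustering algorithm itself.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of samples that match the majority true label of their cluster."""
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster][label] += 1
    # For each cluster, count only the samples carrying its most common label.
    majority = sum(max(counts.values()) for counts in members.values())
    return majority / len(true_labels)


# Hypothetical assignment: each cluster contains one sample of the other class.
toy_purity = purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "a"])
```

Purity rises trivially as the number of clusters grows, so it is most informative when the compared algorithms produce a similar number of clusters.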
Functional polymer microspheres have broad application prospects in various fields, such as metal ion detection, adsorption, separation, and controlled drug release. However, integrating different functions in a single microsphere system is a significant challenge in this field. In this work, we prepared multicompartmental emulsion droplets utilizing microfluidic technology. Fe3O4 magnetic nanoparticles were added to one of the compartments of the emulsion droplets as functional particles, and Janus microspheres were obtained after curing. Fluorescent probes enter the two compartments of the Janus microspheres by diffusion. The fluorescence changes of the microspheres were observed in situ and captured through a fluorescence microscope. The images are processed by image recognition software and a Python program. The "fingerprint" of the detected metal ions is obtained by dimensionality reduction of the data through Principal Component Analysis. We employ different algorithms to build machine learning models for predicting the metal ion species and concentration. The variation in fluorescence intensity of the three fluorescent probes, the corresponding R, G, and B channel values, and time are used as descriptors. The results show that the Random Forest, K-nearest neighbors (KNN), and Neural Network models gave better predictions, with a coefficient of determination (R²) greater than 0.9 and a smaller root mean square error; among them, the KNN model predicted the most accurate results.
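The two scores quoted in this abstract, R² and root mean square error, follow their textbook definitions and can be computed as in the sketch below. The function names and the concentration values are hypothetical and are not results from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical measured and predicted concentrations.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
r2_value = r_squared(measured, predicted)
rmse_value = rmse(measured, predicted)
```

R² is unitless while RMSE is in the units of the predicted quantity, so the two are usually reported together.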
Funding (KPCA/CPSO gearbox study): Supported by the National Natural Science Foundation under Grant No. 50875247 and the Shanxi Province Natural Science Foundation under Grant No. 2009011026-1.
Funding (FPCA child growth study): Supported by the National Natural Science Foundation of China (Grant No. 10828102), a Changjiang Visiting Professorship, the Training Fund of Northeast Normal University's Scientific Innovation Project (Grant No. NENU-STC07002), and the National Institutes of Health of the USA (Grant No. R01GM080503-01A1).
Funding (piglet TBI study): Financial support was provided by the University of Georgia Office of the Vice President for Research to FDW.
Funding (lipopeptide chromatography study): Funded by PhD grants of the 'Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen)' (Nos. 101529 (MD) and 121512 (BG)) and the Special Research Fund (BOF) of Ghent University (01J22510 (EW) and 01D38811 (SS)).
Funding (transgenic cotton study): Supported by the Major Project for Breeding and Cultivation of Novel GM Varieties (2011ZX08012-005, 2011ZX08011-002) and the Dean Fund of the Chinese Academy of Agricultural Sciences (201020).
Funding (Janus microsphere study): Supported by the National Natural Science Foundation of China (No. 22272017), the Fundamental Research Funds for the Central Universities (Nos. DUT22LAB607, DUT22QN226, DUT22RC(3)036), and the Dalian High-Level Talent Innovation Program (No. 2022RQ005).