The Jianbiannongchang area in Heilongjiang Province is located at the junction of the Great and Lesser Xing’an Ranges. Its complex magmatic and tectonic evolution has produced a complex and diverse geological setting for mineralization. In this study, isometric log-ratio (ILR) transformations were applied to the Au, Cu, Pb, Zn, and Sb contents in the 1:50,000 soil geochemical data of the Jianbiannongchang area. Robust principal component analysis (RPCA) was then conducted on the ILR-transformed data, and the local singularity and spectrum-area (S-A) methods were used to extract mineralization-related anomaly information. The results showed that: (1) The transformation eliminated the closure effect of the original data, and PC1 and PC2 obtained by RPCA reflected ore-producing element anomalies dominated by Au and Cu. (2) The local singularity method can enhance local anomalies, both strong ones and weak, gentle ones. Applied to PC1 and PC2, it yielded local anomalies that reflect the spatial singularity patterns related to Cu and Au mineralization in this area, making it an effective method for capturing ore-producing anomalies. (3) Composite anomaly decomposition of PC1 and PC2 using the S-A method separated anomalous and background fields that reflect the ore-producing anomalies related to Cu and Au mineralization, in agreement with known Cu and Au mineralization. (4) Integrating the ore-producing anomaly information extracted by the local singularity and S-A methods identified geochemical anomalies with mineralization potential outside the known mineralization sites, providing a theoretical basis and direction for future exploration in the study area.
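The ILR step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common pivot-coordinate form of the transformation; the abstract does not say which ILR basis the authors used, and the function name `ilr` and the sample values are hypothetical.

```python
import math


def ilr(parts):
    """Isometric log-ratio (pivot) coordinates of a strictly positive composition.

    A D-part composition maps to D - 1 unconstrained coordinates.
    """
    d = len(parts)
    if d < 2 or any(p <= 0 for p in parts):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(d - 1):
        rest = logs[i + 1:]                  # logs of the parts after part i
        k = len(rest)
        mean_log_rest = sum(rest) / k        # log of their geometric mean
        coords.append(math.sqrt(k / (k + 1)) * (logs[i] - mean_log_rest))
    return coords


# Made-up Au, Cu, Pb, Zn, Sb contents for one soil sample (not data from the study).
sample = [0.002, 25.0, 30.0, 80.0, 1.2]
z = ilr(sample)
```

Because only log-ratios enter the coordinates, multiplying every part by the same constant leaves `z` unchanged, which is why the transformed data no longer carry the closure constraint.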
The discrete element method (DEM) has been widely used to model the mechanical behaviour of granular materials. However, with simplified particle morphology or rheology-based rolling resistance models, DEM fails to describe some responses, such as the particle kinematics at the grain scale and the principal stress ratio against axial strain at the macro scale. This paper adopts a computed tomography (CT)-based DEM technique, comprising particle morphology data acquisition from micro-CT (μCT), particle morphology reconstruction by spherical harmonic-based principal component analysis (SH-PCA), and DEM simulations, to investigate how well DEM with realistic particle morphology models the micro-macro mechanical responses of granular soils, taking into account the initial packing state, the morphological gene mutation degree, and the confining stress condition. It is found that DEM with realistic particle morphology can reasonably reproduce the micro-macro mechanical behaviour of granular materials, including the deviatoric stress-volumetric strain-axial strain response, critical state behaviour, particle kinematics, and shear band evolution. Meanwhile, the role of multiscale particle morphology in granular soils depends on the initial packing state and the confining stress condition. For the same granular soil, rougher particle surfaces combined with a denser initial packing state and a higher confining stress result in a higher degree of shear strain localisation.
This study explored the application of surface-enhanced Raman scattering (SERS) to the rapid diagnosis of gastric cancer. SERS spectra were acquired from 68 serum samples from gastric cancer patients and healthy volunteers. The characteristic ratio method (CRM) and principal component analysis (PCA) were used to differentiate gastric cancer serum from normal serum. Compared with healthy volunteers, the serum SERS intensity of gastric cancer patients was relatively high at 722 cm^(-1) and relatively low at 588, 644, 861, 1008, 1235, 1397, 1445 and 1586 cm^(-1). These results indicate that the relative content of nucleic acids in the serum of gastric cancer patients rises, while the relative content of amino acids and carbohydrates decreases. With PCA, the sensitivity and specificity for discriminating gastric cancer were both 94.1%, with an accuracy of 94.1%. Based on the intensity ratios of four characteristic peaks at 722, 861, 1008 and 1397 cm^(-1), CRM achieved a diagnostic sensitivity of 100%, a specificity of 97.4%, and an accuracy of 98.5%. The three peak intensity ratios I_(722)/I_(861), I_(722)/I_(1008) and I_(722)/I_(1397) can therefore be regarded as biological fingerprint information for gastric cancer diagnosis that rapidly and directly reflects the physiological and pathological changes associated with gastric cancer development. This study provides an important basis and reference standards for the early diagnosis of gastric cancer.
Recovering the low-rank structure of a data matrix from sparse errors is the problem addressed by principal component pursuit (PCP). This paper studies the higher-order generalization of matrix recovery, named higher-order principal component pursuit (HOPCP), which is critical in multi-way data analysis. Unlike the matrix case, where the nuclear norm convexifies the rank function, the tensor nuclear norm is still an open problem. Because existing preliminary work on tensor completion provides a viable way to express a low-complexity estimate of a tensor, the paper focuses on tensors of low multilinear rank and adopts the corresponding convex relaxation to formulate the convex optimization model of HOPCP. The paper further proposes two algorithms for HOPCP based on an alternating minimization scheme: the augmented Lagrangian alternating direction method (ALADM) and its truncated higher-order singular value decomposition (ALADM-THOSVD) version. The former obtains a high-accuracy solution, while the latter handles computationally intractable problems more efficiently. Experimental results on both synthetic data and real magnetic resonance imaging data show the applicability of the algorithms to high-dimensional tensor data processing.
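For orientation, the matrix PCP problem that HOPCP generalizes is usually written as the first convex program below. This is the standard formulation from the robust PCA literature, not a quotation from this abstract. The second program is only a sketch of a commonly used convex surrogate for low multilinear rank, a weighted sum of the nuclear norms of the mode-i unfoldings; the weights \lambda_i and \eta are assumed symbols, and the paper's exact model may differ.

```latex
\[
\min_{L,\,S}\ \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad L + S = M
\]

\[
\min_{\mathcal{L},\,\mathcal{S}}\ \sum_{i=1}^{N} \lambda_i \,\|L_{(i)}\|_{*} + \eta \,\|\mathcal{S}\|_{1}
\quad \text{s.t.} \quad \mathcal{L} + \mathcal{S} = \mathcal{X}
\]
```

Here M is the observed matrix, \mathcal{X} the observed N-way tensor, and L_{(i)} the mode-i unfolding of the low-rank component \mathcal{L}.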
In this paper, overlying sediments from 25 sampling points in the Tonglüshan mining area, Daye City, Hubei Province, China, were tested for heavy metal content to explore the pollution characteristics, pollution sources, and ecological risks of heavy metals in the sediments. The geo-accumulation index method was used to evaluate the degree of heavy metal pollution, and the mean sediment quality guideline quotient was used to evaluate the ecological risk level. Correlation analysis, cluster analysis, and principal component analysis were used for a preliminary analysis of the sources of heavy metals. The results indicated extremely severe heavy metal pollution in the sediment: Cd pollution was extreme, Cu contamination was strong, Zn, As, and Hg contamination was moderate, and Pb, Cr, and Ni contamination was slight. The mean sediment quality guideline quotient also indicated a high ecological risk from heavy metals in the sediment, with 64% of the sample sites showing extremely high potential biotoxic effects. In terms of distribution, contamination in the branches was worse than in the main channel of Daye Dagang, and the deposition of each heavy metal was mainly influenced by the distance from the sample site to the wastewater outlet of a tailings pond. The source analysis showed that the heavy metals in the sediment come from discharges by mining and beneficiation companies, tailings ponds, smelting companies, and transport vehicles. Owing to heavy metal discharges from these sources, the ecotoxicity of heavy metals in the sediment of the study area was extremely high, and Cd was the most toxic pollutant. The research identified the key areas and elements for ecological restoration in the sediment of the Tonglüshan mining area, which can serve as a reference for monitoring and managing heavy metal pollution in the sediments of polymetallic mining areas.
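The geo-accumulation index named in this abstract can be sketched as follows, assuming the usual Müller definition Igeo = log2(C / (1.5 B)) and the usual seven classes (0 to 6). The concentration and background values are hypothetical, and the abstract's own verbal categories (slight, moderate, strong, extreme) may be mapped to these classes differently by the authors.

```python
import math


def igeo(concentration, background):
    """Geo-accumulation index: log2(C_n / (1.5 * B_n))."""
    if concentration <= 0 or background <= 0:
        raise ValueError("concentration and background must be positive")
    return math.log2(concentration / (1.5 * background))


def muller_class(index):
    """Müller class: 0 for Igeo <= 0, then one class per unit, capped at 6 for Igeo > 5."""
    if index <= 0:
        return 0
    return min(6, math.ceil(index))


# Hypothetical Cd concentration and background value in mg/kg (not data from the study).
cd_index = igeo(concentration=12.0, background=0.2)
cd_class = muller_class(cd_index)
```

The factor 1.5 is the conventional allowance for natural variation in the background value.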
Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not belong fully to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
In this paper, a unified matrix recovery model is proposed for diversely corrupted matrices. Owing to the separable structure of the proposed model, the convex optimization problem can be solved efficiently with an inexact augmented Lagrange multiplier (IALM) method. Additionally, a random projection acceleration technique (IALM+RP) is adopted to improve the success rate. Preliminary numerical comparisons indicate that, for the standard robust principal component analysis (PCA) problem, IALM+RP is two to six times faster than IALM with an insignificant reduction in accuracy, and that for the outlier pursuit (OP) problem, IALM+RP is at least 6.9 times faster, and up to 8.3 times faster when the matrix size is 2,000 × 2,000.
To investigate the degree of eutrophication of Yuqiao Reservoir, a hybrid method combining principal component regression (PCR) and an artificial neural network (ANN) was adopted to predict the chlorophyll-a concentration of the reservoir’s outflow. The data were obtained from two sampling sites: site 1 in the reservoir and site 2 near the dam. Seven water variables from January 2000 to September 2002 were used to develop the models: the chlorophyll-a concentration at site 2 at time t, the chlorophyll-a concentration at both sites 10 days before t, total phosphorus (TP), total nitrogen (TN), dissolved oxygen (DO), and temperature. To remove the collinearity between the variables, principal components extracted by principal component analysis were used as predictors. Model performance was assessed by the square of the correlation coefficient, the mean absolute error (MAE), the root mean square error (RMSE), and the average absolute relative error (AARE). The results show that the hybrid method achieves more accurate predictions than either the PCR or the ANN model. Finally, the three models were applied to predict the chlorophyll-a concentration in 2003. The predictions of the hybrid method were consistent with the observed values throughout the year, while those of the PCR and ANN models did not fit well from July to October.
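The four assessment measures listed in this abstract can be computed as in the sketch below. It uses the textbook definitions; the abstract does not spell out its formulas, and the function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Return squared correlation (r2), MAE, RMSE, and AARE for paired series.

    Assumes non-zero observed values (for AARE) and non-constant series (for r2).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    aare = sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov * cov / (var_o * var_p)
    return {"r2": r2, "mae": mae, "rmse": rmse, "aare": aare}


# Made-up chlorophyll-a values (not data from the study).
metrics = error_metrics(observed=[2.0, 4.0, 8.0], predicted=[1.0, 5.0, 8.0])
```

MAE and RMSE are in the units of the concentration, while AARE and the squared correlation are dimensionless, which is why the four are usually reported together.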
Fitting a suitable nonlinear classification model that maps conventional well logs to lithofacies is a key problem for machine learning methods. Kernel methods (e.g., KFD, SVM, MSVM) are effective attempts to solve this problem because kernel functions can handle nonlinear features. However, kernel methods still need to improve their deep mining of the log features that indicate lithofacies. This work therefore uses deep neural networks to enhance the kernel principal component analysis (KPCA) method and proposes a deep kernel method (DKM) for lithofacies identification from well logs. DKM includes a feature extractor and a classifier. The feature extractor consists of a series of KPCA models arranged in a residual network structure. A gradient-free optimization method is introduced to optimize the parameters and structure of DKM automatically, avoiding complex parameter tuning. To validate the proposed DKM for lithofacies identification, an open-source dataset with seven conventional logs (GR, CAL, AC, DEN, CNL, LLD, and LLS) and lithofacies labels from the Daniudi Gas Field in China is used. There are eight lithofacies: clastic rocks (pebbly, coarse, medium, and fine sandstone, siltstone, mudstone), coal, and carbonate rocks. Comparisons between DKM and three commonly used kernel methods (KFD, SVM, MSVM) show that (1) DKM (85.7%) outperforms SVM (77%), KFD (79.5%), and MSVM (82.8%) in lithofacies identification accuracy; (2) DKM is about twice as fast as the multi-kernel method (MSVM) while maintaining good accuracy. The blind well test in Well D13 indicates that, compared with the other three methods, DKM improves accuracy by about 24%, precision by 35%, recall by 41%, and F1 score by 40%. In general, DKM is an effective method for complex lithofacies identification. This work also discusses the optimal structure and classifier for DKM. Experimental results show that (m_(1),m_(2),O) is the optimal model structure and linear SVM is the optimal classifier, where (m_(1),m_(2),O) means that there are m_(1) KPCAs followed by m_(2) residual units. A workflow for determining the optimal classifier in DKM for lithofacies identification is also proposed.
This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually treated as a low-rank matrix completion problem via matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches rely on SVD operations. For an m × n matrix, the time complexity of an SVD is O(mn^2), which is computationally prohibitive for large-scale problems. This paper proposes an efficient and scalable algorithm for tensor PCA, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically. The results show that LADMVTPCA outperforms the matrix factorization-based method.
To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indexes, namely pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen, and Zn, were measured at three monitoring sections of the Fuping section of the Shichuan River and analyzed using the improved Nemerow index method, the comprehensive pollution index method, and principal component analysis. The results show that the surface water in the Shichuan River basin is Grade III or IV water, that is, slightly to moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
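As a point of reference for the index methods named in this abstract, the sketch below computes single-factor indices and the classical Nemerow index. It is not the "improved" variant used in the study, whose modification the abstract does not describe. The measured values and standard limits are hypothetical, and indexes such as pH and DO normally need a different single-factor formula than the plain ratio used here.

```python
import math


def single_factor_indices(measured, standards):
    """Ratio of each measured concentration to its standard limit."""
    if len(measured) != len(standards) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return [c / s for c, s in zip(measured, standards)]


def nemerow_index(indices):
    """Classical Nemerow index: sqrt((P_max^2 + P_avg^2) / 2)."""
    p_max = max(indices)
    p_avg = sum(indices) / len(indices)
    return math.sqrt((p_max ** 2 + p_avg ** 2) / 2)


# Hypothetical COD, total phosphorus, total nitrogen, and Zn values in mg/L
# with hypothetical standard limits (not data from the study).
indices = single_factor_indices([24.0, 0.25, 1.8, 0.4], [20.0, 0.2, 1.0, 1.0])
nemerow = nemerow_index(indices)
```

Because the maximum single-factor index enters directly, the Nemerow index weights the worst pollutant more heavily than a plain average of the ratios does.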
After many years of research, seismologists in China have proposed about 80 earthquake prediction factors that reflect precursory information about earthquakes. How to condense the information carried by these 80 factors, and how to choose the main factors for predicting earthquakes precisely, have become topics of interest in seismology. The principal component-discrimination model consists of principal component analysis, correlation analysis, the weighted method of principal factor coefficients, and Mahalanobis distance discriminant analysis. The model combines the method of maximizing earthquake prediction factor information with the weighted method of principal factor coefficients and correlation analysis to choose earthquake prediction variables, and applies Mahalanobis distance discrimination to establish the earthquake prediction discrimination model. The model was applied to earthquake data from Northern China and obtained good prediction results.
This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, in which orbit performance indices with different units, such as total coverage time, coverage frequency, average time per coverage, and maximum coverage gap, must be optimized simultaneously. An index normalization method converts the performance indices into dimensionless variables in the range [0, 1]. On this basis, a design optimization method based on principal component analysis and cluster analysis is proposed, consisting of the index normalization method, principal component analysis, multiple-level cluster analysis, and a weighted evaluation method. The orbit optimization results for earth observation satellites show that the optimal orbit can be obtained with the proposed method. Principal component analysis reduces the total number of non-independent indices and thus saves computing time. Similarly, multiple-level cluster analysis with parallel computing saves computing time.
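The index normalization step can be illustrated with min-max scaling, which is one common way to map indices with different units onto [0, 1]. The abstract does not give the formula the authors used, so this is an assumption, and the candidate-orbit values are hypothetical.

```python
def normalize(values, larger_is_better=True):
    """Min-max scale values to [0, 1]; flip the scale for cost-type indices."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant index cannot discriminate between candidates.
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    if larger_is_better:
        return scaled
    return [1.0 - s for s in scaled]


# Hypothetical indices for three candidate orbits (not data from the paper).
coverage_time = normalize([1200.0, 1500.0, 1800.0])                   # seconds, benefit-type
coverage_gap = normalize([9.0, 6.0, 12.0], larger_is_better=False)   # hours, cost-type
```

After scaling, 1.0 always marks the best candidate for an index, so benefit-type and cost-type indices can be combined in the later principal component and weighted evaluation steps.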
Funding (Jianbiannongchang soil geochemistry study): supported by the Natural Science Foundation of Liaoning Province (2020-BS-258), the Scientific Research Fund Project of the Educational Department of Liaoning Province (LJ2020JCL010), the Discipline Innovation Team of Liaoning Technical University (LNTU20TD-14), and the Key Research and Development Project of Heilongjiang Province (GA21A204).
Funding (CT-based DEM study): supported by the General Research Fund from the Research Grant Council of the Hong Kong SAR, China (Grant Nos. CityU 11201020 and CityU 11207321), the National Science Foundation of China (Grant No. 42207185), the Contract Research Project from the Geotechnical Engineering Office of the Civil Engineering Development Department of Hong Kong SAR, China (Project Ref. No. CEDD STD-30-2030-1-12R), and the BL13W beamline of Shanghai Synchrotron Radiation Facility (SSRF).
Funding (SERS gastric cancer study): supported by the Natural Science Foundation of Guangdong Province, China (2018 A0303131000), the Project of Academician Workstation of Guangdong Province, China (2014B090905001), the Fundamental Research Funds for the Central Universities, China (21617406), and the Key Project of Scientific and Technological Projects of Guangzhou, China (201604040007, 201604020168).
Funding (HOPCP study): supported by the National Natural Science Foundation of China (51275348).
Funding (Tonglüshan sediment study): jointly supported by the Gansu Provincial Natural Resources Science and Technology Project of the Key Laboratory of Strategic Mineral Resources of the Upper Yellow River, Ministry of Natural Resources (YSJD2022-16), and the survey project initiated by the China Geological Survey (DD20211347).
Funding (unified matrix recovery study): supported by the National Natural Science Foundation of China (No. 51275348) and the College Students Innovation and Entrepreneurship Training Program of Tianjin University (No. 201210056339).
Funding: Supported by the National Natural Science Foundation of China (Grant No. 42002134), the China Postdoctoral Science Foundation (Grant No. 2021T140735), and the Science Foundation of China University of Petroleum, Beijing (Grant Nos. 2462020XKJS02 and 2462020YXZZ004).
Abstract: Fitting a suitable nonlinear classification model that maps conventional well logs to lithofacies is a key problem for machine learning methods. Kernel methods (e.g., KFD, SVM, MSVM) are effective attempts to solve this problem because kernel functions can handle nonlinear features. However, kernel methods still need to improve the deep mining of log features that indicate lithofacies. Hence, this work employs deep neural networks to enhance the kernel principal component analysis (KPCA) method and proposes a deep kernel method (DKM) for lithofacies identification using well logs. DKM includes a feature extractor and a classifier. The feature extractor consists of a series of KPCA models arranged according to a residual network structure. A gradient-free optimization method is introduced to automatically optimize the parameters and structure of DKM, which avoids complex parameter tuning. To validate the proposed DKM for lithofacies identification, an open-source dataset with seven conventional logs (GR, CAL, AC, DEN, CNL, LLD, and LLS) and lithofacies labels from the Daniudi Gas Field in China is used. There are eight lithofacies, namely clastic rocks (pebbly, coarse, medium, and fine sandstone, siltstone, mudstone), coal, and carbonate rocks. Comparisons between DKM and three commonly used kernel methods (KFD, SVM, MSVM) show that (1) DKM (85.7%) outperforms SVM (77%), KFD (79.5%), and MSVM (82.8%) in accuracy of lithofacies identification; (2) DKM is about twice as fast as the multi-kernel method (MSVM) while retaining good accuracy. The blind well test in Well D13 indicates that, compared with the other three methods, DKM improves accuracy by about 24%, precision by 35%, recall by 41%, and F1 score by 40%. In general, DKM is an effective method for complex lithofacies identification. This work also discusses the optimal structure and classifier for DKM. Experimental results show that (m_1, m_2, O) is the optimal model structure and linear SVM is the optimal classifier, where (m_1, m_2, O) means that there are m_1 KPCAs followed by m_2 residual units. A workflow to determine an optimal classifier in DKM for lithofacies identification is also proposed.
Abstract: This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually viewed as a low-rank matrix completion problem solved via matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches are based on SVD operations. For an m × n matrix, the time complexity of the SVD operation is O(mn^2), which is prohibitive for large-scale problems. In this paper, an efficient and scalable algorithm for tensor principal component analysis is proposed, called Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically. The results show that LADMVTPCA outperforms the matrix factorization based method.
Funding: Supported by the National Natural Science Foundation of China (41901012), the Project of Shaanxi Provincial Education Department (21JP040), the Talent Fund Project of Weinan Normal University (2021RC04), and the National Innovation and Entrepreneurship Training Program for College Students (22XK019).
Abstract: To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indexes, namely pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen, and Zn, were measured at three monitoring sections of the Fuping section of the Shichuan River and analyzed using the improved Nemerow index method, the comprehensive pollution index method, and principal component analysis. The results show that the surface water in the Shichuan River basin is Grade Ⅲ or Ⅳ water, that is, slightly to moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
Abstract: After many years of research, seismologists in China have proposed about 80 earthquake prediction factors that reflect precursor information of earthquakes. How to concentrate the information carried by these 80 factors and how to choose the main factors for predicting earthquakes precisely has become a topic in seismology. The principal component-discrimination model consists of principal component analysis, correlation analysis, a weighted method of principal factor coefficients, and Mahalanobis distance discrimination analysis. The model combines maximization of the information in the earthquake prediction factors with the weighted method of principal factor coefficients and correlation analysis to choose the prediction variables, and then applies Mahalanobis distance discrimination to establish the earthquake prediction discrimination model. The model was applied to earthquake data from Northern China and obtained good prediction results.
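The final stage of the model described here, Mahalanobis distance discrimination, can be sketched as below. This is a generic illustration and not the authors' implementation: it assumes the prediction variables have already been selected, estimates one mean and covariance per class, and assigns a sample to the class with the smallest squared Mahalanobis distance. The function names are hypothetical.

```python
import numpy as np

def fit_mahalanobis(X, y):
    """Estimate the mean and inverse covariance of each class."""
    stats = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        cov = np.atleast_2d(np.cov(Xc, rowvar=False))
        # Pseudo-inverse keeps the sketch usable if cov is singular.
        stats[label] = (mean, np.linalg.pinv(cov))
    return stats

def predict_mahalanobis(stats, X):
    """Assign each row of X to the class with the smallest distance."""
    labels = np.array(list(stats))
    # Squared distance (x - mean)^T P (x - mean) for every row and class.
    d2 = np.stack(
        [np.einsum("ij,jk,ik->i", X - mean, P, X - mean)
         for mean, P in stats.values()],
        axis=1,
    )
    return labels[d2.argmin(axis=1)]
```

Using a separate covariance per class is one design choice; a pooled covariance is a common alternative when classes have few samples.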
Funding: Funded by the 973 Program of the Ministry of National Defense of China (Grant No. 613237).
Abstract: This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, in which orbit performance indices with different units, such as total coverage time, coverage frequency, average time per coverage, and maximum coverage gap, must be optimized simultaneously. An index normalization method is introduced to convert the performance indices into dimensionless variables within the range [0, 1]. On this basis, a design optimization method based on principal component analysis and cluster analysis is proposed, which consists of the index normalization method, principal component analysis, multiple-level cluster analysis, and a weighted evaluation method. The results of orbit optimization for earth observation satellites show that the optimal orbit can be obtained with the proposed method. Principal component analysis reduces the total number of non-independent indices and thus saves computing time. Similarly, multiple-level cluster analysis with parallel computing can save computing time.
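The first two stages described in this abstract can be sketched as follows. The abstract does not state which normalization formula is used, so min-max scaling is assumed here as one common way to map indices with different units into [0, 1]; the function names and the SVD-based PCA are likewise illustrative choices, not the authors' implementation.

```python
import numpy as np

def minmax_normalize(X):
    """Scale each column (performance index) into [0, 1] (assumed formula)."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    # Constant columns carry no information; map them to zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def pca(X, n_components):
    """Return component scores and explained-variance ratios via SVD."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return Xc @ Vt[:n_components].T, explained[:n_components]
```

When several indices are strongly correlated, the first few explained-variance ratios sum to nearly 1, which is what allows the number of indices to be reduced before clustering.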