In practical process industries, a variety of online and offline sensors and measuring instruments are used for process control and monitoring, so measurements from different sources are collected at different sampling rates. To build a complete process monitoring strategy, all of these multi-rate measurements should be considered in data-based modeling and monitoring. In this paper, a novel kernel multi-rate probabilistic principal component analysis (K-MPPCA) model is proposed to extract the nonlinear correlations among measurements with different sampling rates. The model parameters are calibrated using the kernel trick and the expectation-maximization (EM) algorithm. Corresponding fault detection methods based on the nonlinear features are also developed. Finally, a simulated nonlinear case and an actual pre-decarburization unit in the ammonia synthesis process are tested to demonstrate the efficiency of the proposed method.
This study aimed to explore the application of surface-enhanced Raman scattering (SERS) in the rapid diagnosis of gastric cancer. The SERS spectra of 68 serum samples from gastric cancer patients and healthy volunteers were acquired. The characteristic ratio method (CRM) and principal component analysis (PCA) were used to differentiate gastric cancer serum from normal serum. Compared with healthy volunteers, the serum SERS intensity of gastric cancer patients was relatively high at 722 cm⁻¹, while it was relatively low at 588, 644, 861, 1008, 1235, 1397, 1445 and 1586 cm⁻¹. These results indicated that the relative content of nucleic acids in the serum of gastric cancer patients rises while the relative content of amino acids and carbohydrates decreases. With PCA, the sensitivity and specificity of discriminating gastric cancer were 94.1% and 94.1%, respectively, with an accuracy of 94.1%. Based on the intensity ratios of four characteristic peaks at 722, 861, 1008 and 1397 cm⁻¹, CRM gave a diagnostic sensitivity and specificity of 100% and 97.4%, respectively, and an accuracy of 98.5%. Therefore, the three peak intensity ratios I_722/I_861, I_722/I_1008 and I_722/I_1397 can be considered biological fingerprint information for gastric cancer diagnosis and can rapidly and directly reflect the physiological and pathological changes associated with gastric cancer development. This study provides an important basis and standards for the early diagnosis of gastric cancer.
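The sensitivity, specificity and accuracy figures quoted above follow from a 2×2 confusion matrix. The sketch below shows the arithmetic only. The abstract does not report the group sizes, so the counts used here (29 patients, 39 healthy volunteers, one false positive) are a hypothetical split of the 68 samples that happens to reproduce the reported CRM percentages; they are not data from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # true-positive rate among patients
    specificity = tn / (tn + fp)              # true-negative rate among healthy controls
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (assumption, not reported in the abstract): 68 samples in total.
sens, spec, acc = diagnostic_metrics(tp=29, fn=0, tn=38, fp=1)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}, accuracy={acc:.1%}")
```

Other splits of the 68 samples could give the same rounded percentages, so the counts should be read as an illustration of the calculation rather than a reconstruction of the study design.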
Regional environmental carrying capacity (ECC) is nonlinear and spatially specific. A hierarchical index system including resource, environmental and socio-economic elements was established using the analytic hierarchy process. Principal component analysis (PCA) was used to estimate the magnitude of, and regional differences in, environmental carrying capacity. The main information of four principal components was extracted: carrying capacity of resources supply, carrying capacity of environmental quality, carrying capacity of social economy and carrying capacity of infrastructure construction. The ECC evaluation value was divided into five levels: lowest, low, medium, high and highest carrying capacity. The results showed that, on the whole, ECC was at the medium carrying capacity level. ECC was generally highest in Guanzhong plain, followed by Loess Plateau, and was lowest in Qiba mountain. The carrying capacity of water resources and environmental quality was relatively low, and the infrastructure carrying capacity was the highest among the four components. The spatio-temporal variation of ECC was closely related to the vulnerability of the natural resources and environment in the regions. Verification showed that PCA is a useful tool for evaluating ECC and for reflecting the spatial distribution of a large number of ECC indices on a large regional scale. This study provides a basis for a comprehensive understanding of resources, environment and management for balanced regional development.
This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually viewed as a low-rank matrix completion problem solved via matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches are based on SVD operations. Given an m × n matrix, the time complexity of an SVD operation is O(mn^2), which leads to prohibitive computational cost in large-scale problems. In this paper, an efficient and scalable algorithm for tensor principal component analysis is proposed, called Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with the matrix factorization method. In the experiments, synthetic tensor data of different orders are used to empirically evaluate LADMVTPCA. The results show that LADMVTPCA outperforms the matrix factorization based method.
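To make the phrase "the tensor as an outer product of vectors" concrete, the following minimal sketch builds a rank-one third-order tensor from three vectors with NumPy. It only illustrates the representation; it is not the LADMVTPCA algorithm, and the function name `rank_one_tensor` and the example vectors are assumptions made for illustration.

```python
import numpy as np


def rank_one_tensor(vectors):
    """Outer product of a list of 1-D arrays, one array per tensor mode."""
    tensor = np.asarray(vectors[0], dtype=float)
    for v in vectors[1:]:
        # Each step appends one mode: shape (..., len(v)).
        tensor = np.multiply.outer(tensor, np.asarray(v, dtype=float))
    return tensor


# Hypothetical factor vectors for a 2 x 3 x 4 tensor.
a = np.array([1.0, 2.0])
b = np.array([0.5, -1.0, 3.0])
c = np.array([2.0, 0.0, 1.0, 4.0])

T = rank_one_tensor([a, b, c])

# The dense tensor stores every entry; the factored form stores only the vectors.
stored_dense = T.size
stored_factored = a.size + b.size + c.size

# Unfolding along the first mode gives a matrix of rank one.
unfolded_rank = np.linalg.matrix_rank(T.reshape(a.size, -1))
```

The storage comparison hints at why working with the factor vectors can be cheaper than running an SVD on an unfolded matrix, although the actual savings depend on the algorithm.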
After 30 years of economic development, the high-tech industry has come to play an important role in China's national economy. The development of high-level technological industry plays a leading role in guiding the transformation of China's economy from "investment-driven" to "technology-driven". The high-tech industry represents the future direction of industrial development and plays a positive role in promoting the transformation of traditional industries. The rapid development of the high-tech industry is the key to social progress. In this paper, the traditional analytical model of statistics is combined with principal component analysis and spatial analysis, and the R language is used to express the analytical results intuitively on a map. Finally, a comprehensive evaluation is established.
To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indexes (pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen and Zn) were measured in three monitoring sections of the Fuping section of the Shichuan River and analyzed using the improved Nemerow index method, the comprehensive pollution index method and principal component analysis. The results show that the surface water in the Shichuan River basin is grade III or IV water, that is, slightly to moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
The water quality of the Litani River has deteriorated due to rapid population growth and industrial and agricultural activity. Multivariate analysis of the spatio-temporal variation of water quality is useful for improving water quality management and treatment projects for the river. In this work, samples from different locations in different seasons were analyzed, and the spatio-temporal variation of the physico-chemical parameters of the water was determined. A total of 11 water quality parameters were monitored over 12 months during 2018 at 3 sites located in different areas of the river. Multivariate statistical techniques were used to study the spatio-temporal evolution of the studied parameters and the correlation between the different factors. Principal Component Analysis (PCA) was applied to the factors responsible for water quality variations during wet and dry periods. Multivariate analysis of variance (MANOVA) was also applied to the same factors and gave the best results for both spatial and temporal analysis. A black spot of agricultural, industrial and sewage water pollution was identified at Jeb-Jennine station from the high concentrations of ammonia, sulfate and phosphate. This difference was confirmed by the major changes in the parameter values from one station to another. Jeb-Jennine represents a main pollution area in the river. The high ammonia, sulfate and phosphate concentrations result from the substantial agricultural, industrial and sewage water pollution in the area. High bacterial activity was indicated at Jeb-Jennine and Quaroun stations by the high nitrite concentrations at the two locations. All parameters are strongly affected by climate factors, especially temperature and precipitation. TDS, salinity, electrical conductivity and the concentrations of all pollutants increase during the wet season because of runoff. Other factors can also affect the water quality of the river, for example the geographical features of the region and seasonal human activity such as tourism. The correlation between different parameters was evaluated using PCA. This correlation is not stable and changes between the wet and dry seasons.
Based on related statistical data for 1980-2014, the pattern of change in the cultivated land pressure level in Guangxi was studied. Taking each municipal administrative division as the evaluation unit, the spatio-temporal trend of the cultivated land pressure level was explored by establishing a cultivated land pressure index model, and principal component analysis was used to explore the driving forces of cultivated land pressure. The results showed that from 1980 to 2014 in Guangxi, cultivated land pressure was at level one in 12 years, level two in 19 years and level three in 4 years. The mean cultivated land pressure in each city during 2005-2014 was taken as the average level of cultivated land pressure in that city; the values for Chongzuo City, Baise City, Laibin City, Liuzhou City, Fangchenggang City, Nanning City, Hechi City and Guigang City were all lower than the Guangxi average over the same period. The driving factors of the cultivated land pressure index mainly included urbanization rate, Engel coefficient of rural households (ECRH), per capita cultivated land area, total population and rural per capita net income (RPFI).
This paper provides a generalizable model for ecological vulnerability evaluation in tourism planning and development in high mountain areas. The Bayi District, located in southeastern Tibet, is taken as a typical town to study the conflict between the protection of the natural ecological environment and the exploitation of tourism resources. Based on the Sensitivity-Recovery-Pressure (SRP) framework, a vulnerability evaluation system for plateau tourism regions was developed. Spatial principal component analysis (SPCA), remote sensing and GIS technologies were integrated for spatial quantification of the evaluation index system. The ecological vulnerability of the Bayi District was divided into five levels: potential, mild, moderate, severe, and extreme. The results showed that severe and extreme vulnerability areas were mainly distributed throughout the southwestern and central northern alpine pasture and glacial zones. Potential and mild vulnerability areas were mainly distributed in the vicinity of the Yarlung Zangbo River tributary basin. Three tourism development and environmental protection zones were then classified, and appropriate protection measures were proposed. The study also provides a reference for the spatial distribution of areas that require different protection measures according to their ecological vulnerability classification.
A chemometric approach based on principal component analysis (PCA) was used to examine the spatial variation of environmental and ecological characteristics in the Zhujiang River (Pearl River) Estuary and adjacent waters (ZREAW) in the South China Sea. The PCA results show that the ZREAW can be divided into different zones according to the principal components and the geographical locations of the study stations, and indicate distinct regional differences in environmental features and in the corresponding phytoplankton biomass and community structures among different areas. The spatial distribution of ecological features appeared to be influenced to varying degrees by different water sources, such as the Pearl River discharge, the coastal current and oceanic water from the South China Sea. The variation of the biomass maximum zone and the complex impacts on the spatial distributions of phytoplankton biomass and production were also evaluated.
GIS and a chemometric approach were applied jointly to identify coastal water pollution sources and their spatial patterns, using a large data set (5 years (2000-2004), 17 parameters) obtained through coastal water monitoring of the Southern Water Control Zone in Hong Kong. According to cluster analysis, the degree of pollution differed significantly between September and the following May (the 1st period) and June to August (the 2nd period). Based on these results, four potential pollution sources (organic/eutrophication pollution, natural pollution, mineral/anthropic pollution and fecal pollution) were identified by factor analysis/principal component analysis. The factor scores of each monitoring site were then analyzed using the inverse distance weighting method, and the results indicated that the degree of influence of the various potential pollution sources differed among the monitoring sites. This study indicated that the hybrid approach was useful and effective for identifying coastal water pollution sources and their spatial patterns.
Fitting an appropriate nonlinear classification model that maps conventional well logs to lithofacies is a key problem for machine learning methods. Kernel methods (e.g., KFD, SVM, MSVM) are effective attempts to solve this issue because kernel functions can handle nonlinear features. However, deep mining of the log features that indicate lithofacies still needs to be improved for kernel methods. Hence, this work employs deep neural networks to enhance the kernel principal component analysis (KPCA) method and proposes a deep kernel method (DKM) for lithofacies identification using well logs. DKM includes a feature extractor and a classifier. The feature extractor consists of a series of KPCA models arranged according to a residual network structure. A gradient-free optimization method is introduced to automatically optimize the parameters and structure of DKM, which avoids complex parameter tuning. To validate the proposed DKM for lithofacies identification, an open-source dataset with seven conventional logs (GR, CAL, AC, DEN, CNL, LLD, and LLS) and lithofacies labels from the Daniudi Gas Field in China is used. There are eight lithofacies, namely clastic rocks (pebbly, coarse, medium, and fine sandstone, siltstone, mudstone), coal, and carbonate rocks. The comparisons between DKM and three commonly used kernel methods (KFD, SVM, MSVM) show that (1) DKM (85.7%) outperforms SVM (77%), KFD (79.5%), and MSVM (82.8%) in accuracy of lithofacies identification; (2) DKM is about twice as fast as the multi-kernel method (MSVM) while maintaining good accuracy. The blind well test in Well D13 indicates that, compared with the other three methods, DKM improves accuracy by about 24%, precision by 35%, recall by 41%, and F1 score by 40%. In general, DKM is an effective method for complex lithofacies identification. This work also discusses the optimal structure and classifier for DKM. Experimental results show that (m_1, m_2, O) is the optimal model structure and linear SVM is the optimal classifier, where (m_1, m_2, O) means there are m_1 KPCAs followed by m_2 residual units. A workflow to determine an optimal classifier in DKM for lithofacies identification is also proposed.
Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not be fully assigned to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
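As background for the statement that KPCA is a nonlinear extension of PCA based on kernel methods, here is a minimal NumPy sketch of standard kernel PCA with an RBF kernel. It is not the fuzzy variants (FPCA/KFPCA) proposed in the paper; the kernel choice, the `gamma` value and the random input data are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Pairwise RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def kernel_pca(X, n_components=2, gamma=1.0):
    """Project the training samples onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix, which centers the data in feature space.
    ones = np.full((n, n), 1.0 / n)
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scores of training points: sqrt(lambda_k) times the unit eigenvector v_k.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 3))   # hypothetical data: 30 samples, 3 features
Z_demo = kernel_pca(X_demo, n_components=2, gamma=0.5)
```

With a linear kernel in place of `rbf_kernel`, the same steps reduce to ordinary PCA scores, which is one way to see the relationship between the two methods.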
In this paper, 25 sampling points of overlying deposits in the Tonglushan mining area, Daye City, Hubei Province, China were tested for heavy metal content to explore the pollution characteristics, pollution sources and ecological risks of heavy metals in sediments. The geo-accumulation index method was used to evaluate the degree of heavy metal pollution in the sediment. The mean sediment quality guideline quotient was used to evaluate the ecological risk level of heavy metals in the sediment. Correlation analysis, clustering analysis, and principal component analysis were used for a preliminary analysis of the sources of heavy metals in the sediment. The results indicated that the sediment was extremely polluted by heavy metals: Cd was extremely polluted, Cu strongly contaminated, Zn, As, and Hg moderately contaminated, and Pb, Cr, and Ni slightly contaminated. The mean sediment quality guideline quotient also indicated a high ecological risk of heavy metals in the sediment, and 64% of the sample sites had extremely high hidden biotoxic effects. In terms of distribution, contamination of the branches was worse than that of the main channel of Daye Dagang, and the deposition of each heavy metal was mainly influenced by the distance from the sample site to the sewage outlet of a tailings pond. The source analysis showed that the heavy metals in the sediment come from pollution discharged by mining and beneficiation companies, tailings ponds, smelting companies, and transport vehicles. In the study area, because of heavy metal discharge from these sources, the ecotoxicity of heavy metals in the sediment was extremely high, and Cd was the most toxic pollutant. The research identified the key areas and elements for ecological restoration of the sediment in the Tonglüshan mining area, which can serve as a reference for monitoring and governance of heavy metal pollution in the sediment of polymetallic mining areas.
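The geo-accumulation index mentioned above is commonly defined (following Müller) as I_geo = log2(C_n / (1.5 × B_n)), where C_n is the measured concentration of an element and B_n is its geochemical background value. The sketch below applies that commonly cited definition and the usual seven-class scale. The abstract does not state which background values or class boundaries the authors used, so the numbers here are hypothetical and the function names are assumptions.

```python
import math


def geoaccumulation_index(concentration, background, factor=1.5):
    """I_geo = log2(C_n / (factor * B_n)); factor 1.5 allows for natural background variation."""
    return math.log2(concentration / (factor * background))


def igeo_class(igeo):
    """Map I_geo to the usual classes 0 (unpolluted) through 6 (extremely polluted)."""
    return min(6, max(0, math.ceil(igeo)))


# Hypothetical values in mg/kg (assumption, not taken from the study).
igeo_example = geoaccumulation_index(concentration=12.0, background=1.0)
class_example = igeo_class(igeo_example)
print(igeo_example, class_example)
```

Because the index is logarithmic, each class step corresponds to a doubling of the concentration relative to the scaled background.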
This paper presents a study of the biotic/abiotic conditions of the São Giácomo sanitary landfill, located near the city of Caxias do Sul, Brazil, through statistical analysis of fourteen physico-chemical data sets for the leachate produced at the landfill site over a period of many years. Different chemometric methods are used in the statistical analysis. For example, the correlations between the variables related to degraded organic matter and to biological activity are determined by means of multivariate methods. The results highlight that BOD, COD, VTS, FTS and TS give information on the anaerobic degradation of the organic matter contained in the cells, and suggest that the greater the contribution of the variables with positive weights in PC1, the greater the level of organic matter degradation. The variables TN, ammoniacal nitrogen (Amon Nit.) and alkalinity are related to biological activity and determine the potency of the variables in relation to time. The greater the contribution of the variables related to organic degradation, the greater the values in PC2 and the lower the potency of these variables, whose influence is greater in the second stage of anaerobic degradation. The PC2 variables are important for assessing contamination of water bodies by the leachate.
Habitat models have been widely used to manage marine species and to analyze the relationship between species distribution and environmental factors. The predictive skill of a habitat model depends on whether the model includes appropriate explanatory variables. Because of limited habitat range, low density, and low detection rate, the number of zero catches can be very large even in favorable habitats. Excessive zeroes increase the bias and uncertainty in habitat estimation. Therefore, appropriate explanatory variables need to be chosen first to avoid underestimating or overestimating species abundance in habitat models. In addition, biotic variables such as prey data and the spatial autocovariate (SAC) of the target species are often ignored in species distribution models. We therefore evaluated the effects of input variables on the performance of generalized additive models (GAMs) under excessive zero catch (>70%). Five types of input variables were selected: (1) abiotic variables, (2) abiotic and biotic variables, (3) abiotic variables and SAC, (4) abiotic and biotic variables and SAC, and (5) principal component analysis (PCA) based abiotic and biotic variables and SAC. Belanger's croaker Johnius belangerii is one of the dominant demersal fish in Haizhou Bay, with a large number of zero catches, and was therefore used for the case study. The results show that the PCA-based GAM incorporating abiotic and biotic variables and SAC was the most appropriate model for quantifying the spatial distribution of the croaker. Biotic variables and SAC were important and should be incorporated among the drivers used to predict species distribution. Our study suggests that the processing of input variables is critical to habitat modelling, which could improve the performance of habitat models and enhance our understanding of the habitat suitability of target species.
The Heilongjiang Jianbiannongchang area is located at the confluence of the Great and Lesser Xing'an Ranges. This area has a complex magmatic and tectonic evolutionary history that has resulted in a complex and diverse geological background for mineralization. In this study, isometric logarithmic ratio (ILR) transformations of Au, Cu, Pb, Zn, and Sb contents were performed on the 1:50,000 soil geochemical data of the Jianbiannongchang area. Robust principal component analysis (RPCA) was conducted on the ILR-transformed data. The local singularity and spectrum-area (S-A) methods were used to extract information on mineralization anomalies. The results showed that: (1) The transformed data eliminated the influence of the closure effect of the original data, and the PC1 and PC2 information obtained by applying RPCA reflected ore-producing element anomalies dominated by Au and Cu. (2) The local singularity method can enhance the information of the local strong and weak slow anomalies. After local singularity analysis of PC1 and PC2, the obtained local anomalies reflected the local singularity spatial anomaly patterns related to Cu and Au mineralization in this area, making it an effective method for capturing ore-producing anomalies. (3) Composite anomaly decomposition of PC1 and PC2 was performed using the S-A method, and the screened anomalous and background fields reflect the ore-producing anomalies related to Cu and Au mineralization. This information agrees with known Cu and Au mineralization. (4) Geochemical anomalies with mineralization potential were obtained outside the known mineralization sites by integrating the ore-producing anomaly information extracted by the local singularity and S-A methods, providing a theoretical basis and direction for future exploration in the study area.
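The ILR transformation referred to above maps a D-part composition (here five elements) to D-1 unconstrained coordinates, which removes the closure effect before PCA. The following sketch uses one standard choice of orthonormal basis, in which coordinate i contrasts the geometric mean of the first i parts with part i+1. The abstract does not say which basis the authors used, and the element concentrations below are hypothetical.

```python
import numpy as np


def clr(x):
    """Centered log-ratio transform: log parts minus their mean log."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)


def ilr(x):
    """Isometric log-ratio coordinates for compositions stored along the last axis."""
    logx = np.log(np.asarray(x, dtype=float))
    n_parts = logx.shape[-1]
    coords = np.empty(logx.shape[:-1] + (n_parts - 1,))
    for i in range(1, n_parts):
        # Log geometric mean of the first i parts versus log of part i + 1.
        mean_log_first = logx[..., :i].mean(axis=-1)
        coords[..., i - 1] = np.sqrt(i / (i + 1)) * (mean_log_first - logx[..., i])
    return coords


# Hypothetical Au, Cu, Pb, Zn, Sb contents for one soil sample (arbitrary common unit).
sample = np.array([0.002, 25.0, 18.0, 60.0, 0.8])
coords_demo = ilr(sample)
```

The coordinates depend only on the ratios between parts, so rescaling the whole sample leaves them unchanged; this is the property that makes them suitable inputs for PCA on compositional data.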
With the continuous development of urbanization in China, the country's growing population brings great challenges to urban development. With a refined spatial distribution of population within administrative units, the quantity and agglomeration of the population can be estimated and visualized, providing a basis for more rational urban planning. This paper takes Beijing as the research area and uses a new high-resolution Luojia 1-01 nighttime light image, land use type data, Points of Interest (POI) data, and other data to construct a population spatial index system, with the index weights established by principal component analysis. The comprehensive weight value of population distribution in the study area was then used to calculate the street-level population distribution of Beijing in 2018, and the population spatial distribution was visualized using GIS technology. In an accuracy assessment comparing the result with the WorldPop data, the accuracy reached 0.74. The proposed method was thus validated as a suitable way to generate population spatial maps. A comparison over local areas shows that Luojia 1-01 data are more suitable for population distribution estimation than NPP/VIIRS (Suomi National Polar-orbiting Partnership/Visible Infrared Imaging Radiometer Suite) nighttime light data. More geospatial big data and mathematical models can be combined to create more accurate population maps in the future.
Funding for the K-MPPCA study: supported by the Zhejiang Provincial Natural Science Foundation of China (LY19F030003), the Key Research and Development Project of Zhejiang Province (2021C04030), the National Natural Science Foundation of China (62003306), and the Educational Commission Research Program of Zhejiang Province (Y202044842).
Funding for the SERS gastric cancer study: This work was supported by the Natural Science Foundation of Guangdong Province, China (2018 A0303131000), the project of the Academician Workstation of Guangdong Province, China (2014B090905001), the Fundamental Research Funds for the Central Universities, China (21617406), and the key project of Scientific and Technological Projects of Guangzhou, China (201604040007, 201604020168).
文摘This study aimed to explore the application of surface-enhanced Raman scattering(SERS)in the rapid diagnosis of gastric cancer.The SERS spectra of 68 serum samples from gastric cancer patients and healthy volunteers were acquired.The characteristic ratio method(CRM)and principal component analysis(PCA)were used to differentiate gastric cancer serum from normal serum.Compared with healthy volunteers,the serum SERS intensity of gastric cancer patients was relatively high at 722 cm^(-1),while it was relatively low at 588,644,861,1008,1235,1397,1445 and 1586 cm^(-1).These results indicated that the relative content of nucleic acids in the serum of gastric cancer patients rises while the relative content of amino acids and carbohydrates decreases.In PCA,the sensitivity and specificity of discriminating gastric cancer were 94.1%and 94.1%,respectively,with the accuracy of 94.1%.Based on the intensity ratios of four characteristic peaks at 722,861,1008 and 1397 cm^(-1),CRM presented the diagnostic sensitivity and specificity of 100%and 97.4%,respectively,and the accuracy of 98.5%.Therefore,the three peak intensity ratios of I_(722)/I_(861),I_(722)/I_(1008)and I_(722)/I_(1397)can be considered as biologicalfingerprint information for gastric cancer diagnosis and can rapidly and directly reflect the physiological and pathological changes associated with gastric cancer development.This study provides an important basis and standards for the early diagnosis of gastric cancer.
Abstract: Regional environmental carrying capacity (ECC) is nonlinear and spatially specific. A hierarchical index system covering resource, environmental and socio-economic elements was established using an analytic hierarchy process. Principal component analysis (PCA) was used to estimate the magnitude of, and regional differences in, environmental carrying capacity. The main information was extracted as four principal components: carrying capacity of resources supply, carrying capacity of environmental quality, carrying capacity of social economy and carrying capacity of infrastructure construction. The ECC evaluation value was divided into five levels: lowest, low, medium, high and highest carrying capacity. The results showed that, on the whole, ECC was at the medium level. ECC was generally highest in Guanzhong plain, followed by Loess Plateau, and lowest in Qiba mountain. The carrying capacity of water resources and environmental quality was relatively low, and the infrastructure carrying capacity was the highest of the four components. The spatio-temporal variation of ECC was closely related to the vulnerability of the natural resources and environment in the regions. Verification showed that PCA is a useful tool for evaluating ECC and for reflecting the spatial distribution of a large number of ECC indices on a large regional scale. This study provides a basis for a comprehensive understanding of resources, environment and management for balanced regional development.
Abstract: This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually viewed as a low-rank matrix completion problem solved via matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches are based on SVD operations. Given an m × n matrix, the time complexity of the SVD operation is O(mn^2), which is computationally prohibitive in large-scale problems. In this paper, an efficient and scalable algorithm for tensor principal component analysis is proposed, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically. The results show that LADMVTPCA outperforms the matrix-factorization-based method.
Abstract: After 30 years of economic development, the high-tech industry has come to play an important role in China's national economy. The development of high-level technological industry plays a leading role in guiding the transformation of China's economy from "investment-driven" to "technology-driven". The high-tech industry represents the future direction of industrial development and plays a positive role in promoting the transformation of traditional industries. The rapid development of the high-tech industry is the key to social progress. In this paper, the traditional statistical analysis model is combined with principal component analysis and spatial analysis, and the R language is used to present the analytical results intuitively on a map. Finally, a comprehensive evaluation is established.
Funding: Supported by the National Natural Science Foundation of China (41901012); the Project of Shaanxi Provincial Education Department (21JP040); the Talent Fund Project of Weinan Normal University (2021RC04); and the National Innovation and Entrepreneurship Training Program for College Students (22XK019).
Abstract: To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indexes (pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen and Zn) were measured at three monitoring sections of the Fuping section of the Shichuan River and analyzed using the improved Nemerow index method, the comprehensive pollution index method and principal component analysis. The results show that the surface water in the Shichuan River basin is grade III or IV water, that is, slightly to moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
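The abstract above does not say how its "improved" Nemerow index is defined. As a point of reference only, the classic Nemerow index combines the maximum and the mean of the single-factor indices P_i = C_i / S_i. A minimal Python sketch follows; the parameter names, concentrations and standard limits are hypothetical, and indicators such as pH and DO, which need their own single-factor formulas, are deliberately left out.

```python
import math


def single_factor_indices(concentrations: dict, standards: dict) -> dict:
    """P_i = C_i / S_i for indicators where a higher concentration is worse."""
    return {name: concentrations[name] / standards[name] for name in concentrations}


def nemerow_index(indices: dict) -> float:
    """Classic Nemerow index: sqrt((P_max^2 + P_avg^2) / 2)."""
    values = list(indices.values())
    if not values:
        raise ValueError("at least one single-factor index is required")
    p_max = max(values)
    p_avg = sum(values) / len(values)
    return math.sqrt((p_max ** 2 + p_avg ** 2) / 2)


# Hypothetical measurements (mg/L) and hypothetical standard limits (mg/L).
measured = {"COD": 30.0, "TN": 1.0, "TP": 0.1}
limits = {"COD": 20.0, "TN": 1.0, "TP": 0.2}
example_indices = single_factor_indices(measured, limits)
example_nemerow = nemerow_index(example_indices)
```

Because the maximum enters the formula directly, a single badly exceeded indicator raises the index even when the average is acceptable, which is the usual motivation for weighting variants.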
Abstract: The water quality of the Litani River has deteriorated because of rapid population growth and industrial and agricultural activity. Multivariate analysis of the spatio-temporal variation of water quality is useful for improving water quality management and treatment projects for the river. In this work, samples from different locations in different seasons were analyzed, and the spatio-temporal variation of the physico-chemical parameters of the water was determined. A total of 11 water quality parameters were monitored over 12 months during 2018 at 3 sites located in different areas of the river. Multivariate statistical techniques were used to study the spatio-temporal evolution of the studied parameters and the correlation between the different factors. Principal Component Analysis (PCA) was applied to the factors responsible for water quality variations during wet and dry periods. Multivariate analysis of variance (MANOVA) was also applied to the same factors and gave the best results for both spatial and temporal analysis. A black point of agricultural, industrial and sewage water pollution was identified at Jeb-Jennine station from the high concentrations of ammonia, sulfate and phosphate, and was confirmed by the major changes in parameter values from one station to the next. Jeb-Jennine therefore represents a main pollution area in the river. High bacterial activity was highlighted at Jeb-Jennine and Quaroun stations because of the high nitrite concentrations at the two locations. All parameters are strongly affected by climate factors, especially temperature and precipitation. TDS, salinity, electrical conductivity and the concentrations of all pollutants increase during the wet season because of runoff. Other factors can also affect the water quality of the river, for example the geographical features of the region and seasonal human activity such as tourism. The correlation between different parameters was evaluated using PCA. This correlation is not stable and changes between the wet and dry seasons.
Funding: Supported by the Public Bidding Project of the Guangxi Department of Land and Resources (GXZC2015-G3-0576-GTZB).
Abstract: Based on statistical data for 1980-2014, the pattern of change in the cultivated land pressure level in Guangxi was studied. Taking each municipal administrative division as an evaluation unit, the temporal-spatial trend of the cultivated land pressure level was explored by establishing a cultivated land pressure index model, and principal component analysis was used to explore the driving forces of cultivated land pressure. The results showed that from 1980 to 2014 cultivated land pressure in Guangxi was at level one in 12 years, level two in 19 years and level three in 4 years. The mean cultivated land pressure of each city during 2005-2014 was taken as that city's average level; the values for Chongzuo City, Baise City, Laibin City, Liuzhou City, Fangchenggang City, Nanning City, Hechi City and Guigang City were all lower than the Guangxi average for the same period. The driving factors of the cultivated land pressure index were mainly urbanization rate, Engel coefficient of rural households (ECRH), per capita cultivated land area, total population and rural per capita net income (RPFI).
Funding: Financially supported by the National Key Technologies R&D Program of China (Grant No. 2014BAL07B02); the International Science and Technology Cooperation Project (Grant No. 2011DFA22070); and the Tourism Youth Expert Training Projects in Sichuan Province, China (Grant No. SCTYETP2017L18).
Abstract: This paper provides a generalizable model for ecological vulnerability evaluation in tourism planning and development in high mountain areas. The Bayi District in southeastern Tibet is taken as a typical town in which to study the conflict between protecting the natural ecological environment and exploiting tourism resources. Based on the Sensitivity-Recovery-Pressure (SRP) framework, a vulnerability evaluation system for plateau tourism regions was developed. Spatial principal component analysis (SPCA), remote sensing and GIS technologies were integrated for spatial quantification of the evaluation index system. The ecological vulnerability of the Bayi District was divided into five levels: potential, mild, moderate, severe, and extreme. The results showed that severe and extreme vulnerability areas were mainly distributed throughout the southwestern and central northern alpine pasture and glacial zones, while potential and mild vulnerability areas were mainly distributed in the vicinity of the Yarlung Zangbo River tributary basin. Three tourism development and environmental protection zones were then classified, and appropriate protection measures were proposed. The study also provides a reference for the spatial distribution of areas that require different protection measures according to their ecological vulnerability classification.
Funding: The Knowledge Innovation Project of the Chinese Academy of Sciences under contract Nos. KZCX2-YW-Q07, KZCX2-YW-T001, KZCX2-YW-213 and SQ200805; the National Natural Science Foundation of China under contract Nos. U0633007, 40906057 and 40531006.
Abstract: A chemometric approach based on principal component analysis (PCA) was used to examine the spatial variation of environmental and ecological characteristics in the Zhujiang River (Pearl River) Estuary and adjacent waters (ZREAW) in the South China Sea. The PCA results show that the ZREAW can be divided into different zones according to the principal components and the geographical locations of the study stations, and indicate distinct regional differences in environmental features and in the corresponding phytoplankton biomass and community structures. The spatial distribution of ecological features appears to be influenced to varying degrees by the different water sources, such as the Pearl River discharge, the coastal current and oceanic water from the South China Sea. The variation of the biomass maximum zone and the complex impacts on the spatial distributions of phytoplankton biomass and production were also evaluated.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2005CB724205) and the China Scholarship Programs of the Ministry of Education of China (No. 2006100766).
Abstract: GIS and a chemometric approach were applied jointly to identify coastal water pollution sources and their spatial patterns, using a large data set (5 years (2000-2004), 17 parameters) obtained through coastal water monitoring of the Southern Water Control Zone in Hong Kong. According to cluster analysis, the degree of pollution differed significantly between September and the following May (the 1st period) and June to August (the 2nd period). Based on these results, four potential pollution sources (organic/eutrophication pollution, natural pollution, mineral/anthropic pollution and fecal pollution) were identified by factor analysis/principal component analysis. The factor scores of each monitoring site were then interpolated using the inverse distance weighting method, and the results indicated that the degree of influence of the various potential pollution sources differed among the monitoring sites. This study indicated that the hybrid approach was useful and effective for identifying coastal water pollution sources and spatial patterns.
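The inverse distance weighting step mentioned in the abstract above can be sketched in a few lines of Python. This is a generic illustration, assuming planar coordinates and a power parameter of 2. The site coordinates, the factor scores and the function name `idw` are hypothetical and are not taken from the study.

```python
import math


def idw(sites, target, power=2.0):
    """Inverse distance weighted estimate at target from (x, y, value) sites."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in sites:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a monitoring site: return its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one site is required")
    return weighted_sum / weight_total


# Hypothetical factor scores at two monitoring sites.
example_sites = [(0.0, 0.0, 1.0), (2.0, 0.0, 3.0)]
example_midpoint = idw(example_sites, (1.0, 0.0))
example_at_site = idw(example_sites, (0.0, 0.0))
```

A larger `power` makes the estimate follow the nearest site more closely, while a smaller one smooths the surface. A GIS package would normally also restrict the search to a neighbourhood of sites.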
Funding: Supported by the National Natural Science Foundation of China (Grant No. 42002134); the China Postdoctoral Science Foundation (Grant No. 2021T140735); and the Science Foundation of China University of Petroleum, Beijing (Grant Nos. 2462020XKJS02 and 2462020YXZZ004).
Abstract: How to fit a suitably nonlinear classification model from conventional well logs to lithofacies is a key problem for machine learning methods. Kernel methods (e.g., KFD, SVM, MSVM) are effective attempts to solve this problem because kernel functions can handle nonlinear features. However, deep mining of the log features that indicate lithofacies still needs to be improved for kernel methods. Hence, this work employs deep neural networks to enhance the kernel principal component analysis (KPCA) method and proposes a deep kernel method (DKM) for lithofacies identification using well logs. DKM includes a feature extractor and a classifier. The feature extractor consists of a series of KPCA models arranged according to a residual network structure. A gradient-free optimization method is introduced to automatically optimize the parameters and structure of DKM, which avoids complex parameter tuning. To validate the proposed DKM for lithofacies identification, an open-source dataset with seven conventional logs (GR, CAL, AC, DEN, CNL, LLD, and LLS) and lithofacies labels from the Daniudi Gas Field in China is used. There are eight lithofacies, namely clastic rocks (pebbly, coarse, medium, and fine sandstone, siltstone, mudstone), coal, and carbonate rocks. Comparisons between DKM and three commonly used kernel methods (KFD, SVM, MSVM) show that (1) DKM (85.7%) outperforms SVM (77%), KFD (79.5%), and MSVM (82.8%) in accuracy of lithofacies identification; and (2) DKM is about twice as fast as the multi-kernel method (MSVM) with good accuracy. The blind well test in Well D13 indicates that, compared with the other three methods, DKM improves accuracy by about 24%, precision by 35%, recall by 41%, and F1 score by 40%. In general, DKM is an effective method for complex lithofacies identification. This work also discusses the optimal structure and classifier for DKM. Experimental results show that (m_(1), m_(2), O) is the optimal model structure and linear SVM is the optimal classifier, where (m_(1), m_(2), O) means there are m_(1) KPCAs followed by m_(2) residual units. A workflow to determine an optimal classifier in DKM for lithofacies identification is also proposed.
Abstract: Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not be fully assigned to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
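To make the "nonlinear extension of PCA based on kernel methods" concrete, the Python sketch below shows the two steps that distinguish standard KPCA from linear PCA: building a kernel matrix and centering it in feature space. It assumes an RBF kernel with a hypothetical `gamma`. The eigendecomposition that follows, and the fuzzy membership weighting proposed in the paper, are not shown.

```python
import math


def rbf_kernel_matrix(points, gamma=1.0):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2) for a list of coordinate tuples."""
    n = len(points)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            squared = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            kernel[i][j] = math.exp(-gamma * squared)
    return kernel


def center_kernel_matrix(kernel):
    """Feature-space centering; assumes a symmetric kernel matrix."""
    n = len(kernel)
    row_means = [sum(row) / n for row in kernel]
    total_mean = sum(row_means) / n
    # For symmetric K the column means equal the row means.
    return [
        [kernel[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


# Hypothetical two-dimensional samples.
example_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
example_kernel = rbf_kernel_matrix(example_points, gamma=0.5)
example_centered = center_kernel_matrix(example_kernel)
```

The leading eigenvectors of the centered matrix then give the nonlinear principal components, in the same way that the eigenvectors of the covariance matrix do for linear PCA.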
Funding: Jointly supported by the Gansu Provincial Natural Resources Science and Technology Project of the Key Laboratory of Strategic Mineral Resources of the Upper Yellow River, Ministry of Natural Resources (YSJD2022-16) and the survey project initiated by the China Geological Survey (DD20211347).
Abstract: In this paper, overlying deposits at 25 sampling points in the Tonglüshan mining area, Daye City, Hubei Province, China were tested for heavy metal content to explore the pollution characteristics, pollution sources and ecological risks of heavy metals in the sediments. The geo-accumulation index method was used to evaluate the degree of heavy metal pollution in the sediment, and the mean sediment quality guideline quotient was used to evaluate the ecological risk level. Correlation analysis, cluster analysis and principal component analysis were used for a preliminary analysis of the sources of heavy metals in the sediment. The results indicated that heavy metal pollution in the sediment was extremely severe: Cd pollution was extreme, Cu pollution was strong, Zn, As, and Hg pollution was moderate, and Pb, Cr, and Ni pollution was slight. The mean sediment quality guideline quotient also indicated a high ecological risk from heavy metals in the sediment, with 64% of the sample sites showing extremely high potential biotoxic effects. In terms of distribution, contamination in the branches was worse than in the main channel of Daye Dagang, and the deposition of each heavy metal was mainly influenced by the distance from the sample site to the sewage outlet of a tailings pond. The source analysis showed that the heavy metals in the sediment come from discharges by mining and beneficiation companies, tailings ponds, smelting companies, and transport vehicles. Because of heavy metal discharges from these sources, the ecotoxicity of heavy metals in the sediment of the study area was extremely high, and Cd was the most toxic pollutant. The research identified the key areas and elements for ecological restoration in the sediment of the Tonglüshan mining area, which can serve as a reference for monitoring and controlling heavy metal pollution in the sediment of polymetallic mining areas.
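The geo-accumulation index used in the abstract above is commonly defined as I_geo = log2(C_n / (k · B_n)), where C_n is the measured concentration, B_n the geochemical background value and k = 1.5 a factor allowing for natural variation in the background. The Python sketch below assumes that common definition and Müller's seven classes (0 to 6). The concentrations and background values are hypothetical, and the abstract does not state which background values or class labels the study used.

```python
import math


def geoaccumulation_index(concentration: float, background: float, k: float = 1.5) -> float:
    """I_geo = log2(C_n / (k * B_n)); both inputs must be positive."""
    if concentration <= 0 or background <= 0:
        raise ValueError("concentration and background must be positive")
    return math.log2(concentration / (k * background))


def muller_class(igeo: float) -> int:
    """Muller class: 0 for I_geo <= 0, then one class per unit, capped at 6."""
    if igeo <= 0:
        return 0
    return min(6, math.ceil(igeo))


# Hypothetical values in mg/kg: measured 3.0 against a background of 1.0.
example_igeo = geoaccumulation_index(3.0, 1.0)
example_class = muller_class(example_igeo)
```

Because the scale is logarithmic, each step up in class corresponds to a doubling of the concentration relative to the adjusted background.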
Abstract: This paper presents a study of the biotic/abiotic conditions of the São Giácomo sanitary landfill, located near the city of Caxias do Sul, Brazil, through statistical analysis of fourteen physicochemical data sets for the leachate produced at the dump site over a period of many years. Different chemometric methods are used in the statistical analysis. For example, the correlations between the variables related to degraded organic matter and biological activity are determined by means of multivariate methods. The results highlight that BOD, COD, VTS, FTS and TS give information on the anaerobic degradation of the organic matter contained in the cells, and suggest that the greater the contribution of the variables with positive weights in PC1, the greater the level of organic matter degradation. The variables TN, Amon Nit. and alkalinity are related to the biological activity and determine the potency of the variables in relation to time. The greater the contribution of the variables related to organic degradation, the greater the values in PC2 and the lesser the potency of these variables, whose influence is greater in the second stage of anaerobic degradation. The variables in PC2 are important indicators of the contamination of water bodies by the leachate.
Funding: Supported by the National Key R&D Program of China (No. 2017YFE0104400); the National Natural Science Foundation of China (Nos. 31772852, 31802301); and the Marine S&T Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) (No. 2018SDKJ0501-2).
Abstract: Habitat models have been widely used to manage marine species and to analyze the relationship between species distribution and environmental factors. The predictive skill of a habitat model depends on whether the model includes appropriate explanatory variables. Because of limited habitat range, low density, and low detection rate, the number of zero catches can be very large even in favorable habitats. Excessive zeroes increase the bias and uncertainty in habitat estimation. Therefore, appropriate explanatory variables need to be chosen first to avoid underestimating or overestimating species abundance in habitat models. In addition, biotic variables such as prey data and the spatial autocovariate (SAC) of the target species are often ignored in species distribution models. We therefore evaluated the effects of input variables on the performance of generalized additive models (GAMs) under excessive zero catch (>70%). Five types of input variables were selected: (1) abiotic variables; (2) abiotic and biotic variables; (3) abiotic variables and SAC; (4) abiotic variables, biotic variables and SAC; and (5) principal component analysis (PCA) based abiotic and biotic variables and SAC. Belanger's croaker Johnius belangerii is one of the dominant demersal fish in Haizhou Bay and has a large number of zero catches, so it was used for the case study. The results show that the PCA-based GAM incorporating abiotic and biotic variables and SAC was the most appropriate model for quantifying the spatial distribution of the croaker. Biotic variables and SAC were important and should be included among the drivers used to predict species distribution. Our study suggests that the processing of input variables is critical to habitat modelling, which could improve the performance of habitat models and enhance our understanding of the habitat suitability of target species.
Funding: Supported by the Project of the Natural Science Foundation of Liaoning Province (2020-BS-258); the Scientific Research Fund Project of the Educational Department of Liaoning Province (LJ2020JCL010); the discipline innovation team of Liaoning Technical University (LNTU20TD-14); and the Key Research and Development Project of Heilongjiang Province (GA21A204).
Abstract: The Heilongjiang Jianbiannongchang area is located at the confluence of the Great and Lesser Xing'an Ranges. This area has a complex magmatic and tectonic evolutionary history that has resulted in a complex and diverse geological background for mineralization. In this study, isometric logarithmic ratio (ILR) transformations of Au, Cu, Pb, Zn, and Sb contents were performed on the 1:50,000 soil geochemical data of the Jianbiannongchang area. Robust principal component analysis (RPCA) was conducted on the ILR-transformed data. The local singularity and spectrum-area (S-A) methods were used to extract information on mineralization anomalies. The results showed that: (1) the transformed data eliminated the influence of the closure effect in the original data, and the PC1 and PC2 information obtained by applying RPCA reflected ore-producing element anomalies dominated by Au and Cu. (2) The local singularity method can enhance the information of local strong anomalies and of weak, gentle anomalies. After local singularity analysis of PC1 and PC2, the obtained local anomalies reflected the local singularity spatial anomaly patterns related to Cu and Au mineralization in this area, making it an effective method for delineating ore-producing anomalies. (3) Furthermore, composite anomaly decomposition of PC1 and PC2 was performed using the S-A method, and the separated anomalous and background fields reflect the ore-producing anomalies related to Cu and Au mineralization. This information agrees with known Cu and Au mineralization. (4) Geochemical anomalies with mineralization potential were obtained outside the known mineralization sites by integrating the ore-producing anomaly information extracted by the local singularity and S-A methods, providing a theoretical basis and direction for future exploration in the study area.
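The ILR transformation mentioned in the abstract above maps a D-part composition to D − 1 unconstrained coordinates, which is what removes the closure effect before PCA. The abstract does not say which ILR basis was used. The Python sketch below assumes one common choice, pivot coordinates, and uses hypothetical element contents.

```python
import math


def ilr_pivot(parts):
    """ILR pivot coordinates of a composition with strictly positive parts.

    z_i = sqrt(r / (r + 1)) * ln(x_i / g_i), where g_i is the geometric mean
    of the r parts that follow x_i.
    """
    count = len(parts)
    if count < 2 or any(p <= 0 for p in parts):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(p) for p in parts]
    coordinates = []
    for i in range(count - 1):
        remaining = logs[i + 1:]
        r = len(remaining)
        mean_log_remaining = sum(remaining) / r  # log of the geometric mean
        coordinates.append(math.sqrt(r / (r + 1)) * (logs[i] - mean_log_remaining))
    return coordinates


# Hypothetical contents for five elements, in arbitrary but common units.
example_sample = [0.002, 25.0, 30.0, 80.0, 1.5]
example_ilr = ilr_pivot(example_sample)
```

The coordinates depend only on ratios between parts, so rescaling the whole sample, for example by changing units, leaves them unchanged.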
Funding: Under the auspices of the Natural Science Foundation of China (Nos. 42071342, 31870713); the Beijing Natural Science Foundation Program (No. 8182038); and the Fundamental Research Funds for the Central Universities (Nos. 2015ZCQ-LX-01, 2018ZY06).
Abstract: With the continuous development of urbanization in China, the country's growing population brings great challenges to urban development. By capturing the fine-scale spatial distribution of population within administrative units, the quantity and agglomeration of the population can be estimated and visualized, providing a basis for more rational urban planning. This paper takes Beijing as the research area and uses a new high-resolution Luojia 1-01 nighttime light image, land use type data, Points of Interest (POI) data, and other data to construct a population spatial index system, with index weights established by principal component analysis. The comprehensive weight value of population distribution in the study area was then used to calculate the street-level population distribution of Beijing in 2018, and the population spatial distribution was visualized using GIS technology. In an accuracy assessment comparing the result with the WorldPop data, the accuracy reached 0.74. The proposed method was thus validated as a suitable method for generating population spatial maps. A comparison of local areas shows that Luojia 1-01 data are more suitable for population distribution estimation than NPP/VIIRS (Suomi National Polar-orbiting Partnership/Visible Infrared Imaging Radiometer Suite) nighttime light data. More geospatial big data and mathematical models can be combined to create more accurate population maps in the future.