Financial time series forecasting can benefit both individual and institutional investors, but the high noise and complexity of financial data make the task extremely challenging. Over the years, many researchers have used support vector regression (SVR) quite successfully to address this challenge. In this paper, an SVR-based forecasting model is proposed that first uses principal component analysis (PCA) to extract low-dimensional, efficient feature information and then uses independent component analysis (ICA) to preprocess the extracted features and nullify the influence of noise in them. Experiments were carried out on 16 years of historical data for three prominent stocks from three different sectors listed on the Dhaka Stock Exchange (DSE), Bangladesh. Predictions were made 1 to 4 days in advance, targeting short-term prediction. For comparison, PCA combined with SVR (PCA-SVR), ICA combined with SVR (ICA-SVR), and SVR alone were applied to evaluate the prediction accuracy of the proposed approach. Experimental results show that the proposed model (PCA-ICA-SVR) outperforms the PCA-SVR, ICA-SVR, and single SVR methods.
A method is described for identifying pulsations in time series of magnetic field data that are simultaneously present in multiple channels of data at one or more sensor locations. Candidate pulsations of interest are first identified in geomagnetic time series by inspection. Time series of these "training events" are represented in matrix form and transpose-multiplied to generate time-domain covariance matrices. The ranked eigenvectors of this matrix are stored as a feature of the pulsation. In the second stage of the algorithm, a sliding window (approximately the width of the training event) is moved across the vector-valued time series comprising the channels on which the training event was observed. At each window position, the data covariance matrix and associated eigenvectors are calculated. We compare the orientation of the dominant eigenvectors of the training data with those of the windowed data and flag windows where the directions of the dominant eigenvectors are similar. This was successful in automatically identifying pulses that share polarization and appear to come from the same source process. We apply the method to a case study of continuously sampled (50 Hz) data from six observatories, each equipped with three-component induction coil magnetometers. We examine a 90-day interval of data associated with a cluster of four observatories located within 50 km of Napa, California, together with two remote reference stations, one 100 km to the north of the cluster and the other 350 km to the south. When the training data contain signals present at the remote reference observatories, we are reliably able to identify and extract global geomagnetic signals such as solar-generated noise. When the training data contain pulsations observed only in the cluster of local observatories, we identify several types of non-plane-wave signals having similar polarization.
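The two-stage matching described in the abstract above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes a channel-by-channel covariance matrix, uses power iteration to find the dominant eigenvector, and compares orientations with the absolute cosine between unit vectors. The function names, the 0.95 threshold, and the synthetic signals are all assumptions introduced here.

```python
import math

def covariance(window):
    """Channel-by-channel covariance of a window (list of samples, each a list of channel values)."""
    n, k = len(window), len(window[0])
    means = [sum(s[c] for s in window) / n for c in range(k)]
    cov = [[0.0] * k for _ in range(k)]
    for s in window:
        for i in range(k):
            for j in range(k):
                cov[i][j] += (s[i] - means[i]) * (s[j] - means[j])
    return [[v / n for v in row] for row in cov]

def dominant_eigenvector(matrix, iterations=200):
    """Unit eigenvector of the largest eigenvalue, found by power iteration."""
    k = len(matrix)
    v = [1.0 + 0.1 * i for i in range(k)]  # asymmetric start to avoid trivial orthogonality
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # flat window: no preferred direction
            return v
        v = [x / norm for x in w]
    return v

def flag_windows(series, template_vec, width, threshold=0.95):
    """Start indices of windows whose dominant eigenvector aligns with the template."""
    flagged = []
    for start in range(len(series) - width + 1):
        vec = dominant_eigenvector(covariance(series[start:start + width]))
        # The sign of an eigenvector is arbitrary, so compare the absolute cosine.
        similarity = abs(sum(a * b for a, b in zip(vec, template_vec)))
        if similarity >= threshold:
            flagged.append(start)
    return flagged

# Synthetic three-channel data (hypothetical): the event is polarized along (1, 0.5, 0),
# the background along (0, 0, 1).
event = [[math.sin(0.5 * t), 0.5 * math.sin(0.5 * t), 0.0] for t in range(20)]
background = [[0.0, 0.0, math.sin(0.5 * t)] for t in range(30)]
template = dominant_eigenvector(covariance(event))
flagged = flag_windows(background + event, template, width=20)
```

In this toy series the window that exactly covers the event is flagged, while windows lying wholly in the background are not. Real data would need a threshold tuned to the noise level.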
This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually viewed as a low-rank matrix completion problem solved via matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches are based on SVD operations. Given an m × n matrix, the time complexity of the SVD operation is O(mn^2), which leads to prohibitive computational cost in large-scale problems. In this paper, an efficient and scalable algorithm for tensor principal component analysis is proposed, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate the proposed LADMVTPCA algorithm empirically. The results show that LADMVTPCA outperforms the matrix-factorization-based method.
Experimental and theoretical studies of the mechanisms of vibration stimulation of oil recovery in watered fields lead to the conclusion that resonance oscillations develop in fractured-block formations. These oscillations, caused by weak but long-lasting and frequency-stable influences, create the conditions for generating ultrasonic waves in the layers, which are capable of destroying thickened oil membranes in reservoir cracks. For fractured-porous reservoirs exploited by high-pressure water displacement of oil, the possibility of intensifying ultrasonic vibrations can have important technological significance. Even very weak ultrasound can destroy, over a long period of time, the viscous oil membranes formed in the cracks between the blocks, which can be the reason for lowering the permeability of the layers and increasing the oil recovery. To describe these effects, it is necessary to consider the wave process in a hierarchically blocky medium and to simulate theoretically the mechanism by which self-oscillations appear under the action of relaxation shear stresses. For analyzing the seismoacoustic response in time over fixed intervals along the borehole, an algorithm based on phase diagrams of the state of a multiphase medium is suggested.
The Heilongjiang Jianbiannongchang area is located at the confluence of the Great and Lesser Xing'an Ranges. This area has a complex magmatic and tectonic evolutionary history that has resulted in a complex and diverse geological background for mineralization. In this study, isometric logarithmic ratio (ILR) transformations of Au, Cu, Pb, Zn, and Sb contents were performed on the 1:50,000 soil geochemical data of the Jianbiannongchang area. Robust principal component analysis (RPCA) was conducted on the ILR-transformed data. The local singularity and spectrum-area (S-A) methods were used to extract information on mineralization-related anomalies. The results showed that: (1) the transformed data eliminated the closure effect of the original data, and the PC1 and PC2 information obtained by applying RPCA reflected ore-producing element anomalies dominated by Au and Cu. (2) The local singularity method can enhance the information of local strong and weak anomalies. After local singularity analysis of PC1 and PC2, the obtained local anomalies reflected the spatial anomaly patterns related to Cu and Au mineralization in this area, which makes it an effective method for delineating ore-producing anomalies. (3) Furthermore, the composite anomalies of PC1 and PC2 were decomposed using the S-A method, and the separated anomalous and background fields reflect the ore-producing anomalies related to Cu and Au mineralization. This information agrees with known Cu and Au mineralization. (4) Geochemical anomalies with mineralization potential were identified outside the known mineralization sites by integrating the ore-producing anomaly information extracted by the local singularity and S-A methods, providing a theoretical basis and direction for future exploration in the study area.
The discrete element method (DEM) has been widely used to model the mechanical behaviour of granular materials. However, with simplified particle morphology or rheology-based rolling resistance models, DEM fails to describe some responses, such as the particle kinematics at the grain scale and the principal stress ratio against axial strain at the macro scale. This paper adopts a computed tomography (CT)-based DEM technique, comprising particle morphology data acquisition from micro-CT (μCT), particle morphology reconstruction using spherical harmonic-based principal component analysis (SH-PCA), and DEM simulations, to investigate the capability of DEM with realistic particle morphology to model the micro-macro mechanical responses of granular soils, taking into account the initial packing state, the morphological gene mutation degree, and the confining stress condition. It is found that DEM with realistic particle morphology can reasonably reproduce the micro-macro mechanical behaviour of granular materials, including the deviatoric stress–volumetric strain–axial strain response, critical state behaviour, particle kinematics, and shear band evolution. Meanwhile, the role of multiscale particle morphology in granular soils depends on the initial packing state and the confining stress condition. For the same granular soil, rougher particle surfaces combined with a denser initial packing state and a higher confining stress result in a higher degree of shear strain localisation.
In practical process industries, a variety of online and offline sensors and measuring instruments are used for process control and monitoring, which means that measurements coming from different sources are collected at different sampling rates. To build a complete process monitoring strategy, all of these multi-rate measurements should be considered in data-based modeling and monitoring. In this paper, a novel kernel multi-rate probabilistic principal component analysis (K-MPPCA) model is proposed to extract the nonlinear correlations among measurements with different sampling rates. In the proposed model, the parameters are calibrated using the kernel trick and the expectation-maximization (EM) algorithm. The corresponding fault detection methods based on the nonlinear features are also developed. Finally, a simulated nonlinear case and an actual pre-decarburization unit in an ammonia synthesis process are tested to demonstrate the efficiency of the proposed method.
To construct the mapping between multiple state parameters and remaining useful life (RUL) directly, and to reduce the interference of random error with prediction accuracy, this paper proposes an RUL prediction model for aeroengines based on principal component analysis (PCA) and a one-dimensional convolutional neural network (1D-CNN). First, multiple state parameters covering a large number of aeroengine cycles are collected and passed to PCA for dimensionality reduction, and the principal components are extracted for subsequent time series prediction. Second, the 1D-CNN model is constructed to learn the mapping between the principal components and RUL directly. Multiple convolution and pooling operations are applied for deep feature extraction, so that end-to-end RUL prediction for the aeroengine can be realized. Experimental results show that PCA can obtain the most effective principal components from the multiple state parameters, and that the 1D-CNN can map long time series of multiple state parameters directly to RUL, improving the efficiency and accuracy of RUL prediction. Compared with other traditional models, the proposed method also has lower prediction error and better robustness.
Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not be fully assigned to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
With machine learning, suitable algorithms can perform advanced time series analysis. This paper proposes a complex k-nearest neighbor (KNN) model for predicting financial time series. The model uses a complex feature extraction process that integrates forward rolling empirical mode decomposition (EMD) for financial time series signal analysis with principal component analysis (PCA) for dimension reduction. The information-rich features are extracted and then input to a weighted KNN classifier in which the features are weighted by their PCA loadings. Finally, the prediction is generated by regression on the selected nearest neighbors. The structure of the model as a whole is original. Test results on real historical data sets confirm the effectiveness of the model for predicting the Chinese stock index, an individual stock, and the EUR/USD exchange rate.
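The final two steps of the model above, a feature-weighted neighbor search followed by regression on the selected neighbors, can be illustrated with a minimal sketch. The abstract does not specify the distance or the regression, so this sketch assumes a weighted Euclidean distance and an inverse-distance weighted average. The weights stand in for PCA loadings, and all names and numbers are hypothetical.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight (standing in for a PCA loading)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_predict(train_x, train_y, query, weights, k=3):
    """Predict a value as the inverse-distance weighted mean of the k nearest targets."""
    scored = sorted(
        (weighted_distance(x, query, weights), y) for x, y in zip(train_x, train_y)
    )
    nearest = scored[:k]
    for distance, target in nearest:
        if distance == 0.0:  # exact match: return its target directly
            return target
    inverse = [1.0 / distance for distance, _ in nearest]
    return sum(w * y for w, (_, y) in zip(inverse, nearest)) / sum(inverse)

# Hypothetical two-feature training set.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
train_y = [1.0, 2.0, 3.0, 10.0]
prediction = knn_predict(train_x, train_y, [0.5, 0.0], weights=[1.0, 1.0], k=2)
```

Here the query sits midway between the first two training points, so the prediction is the plain average of their targets.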
This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, in which orbit performance indices with different units, such as total coverage time, frequency of coverage, average time per coverage, and maximum coverage gap, must be optimized simultaneously. An index normalization method is introduced to convert the performance indices into dimensionless variables within the range [0, 1], and a design optimization method based on principal component analysis and cluster analysis is proposed, consisting of index normalization, principal component analysis, multiple-level cluster analysis, and a weighted evaluation method. The results of orbit optimization for earth observation satellites show that the optimal orbit can be obtained using the proposed method. Principal component analysis reduces the number of mutually dependent indices and thereby saves computing time. Similarly, multiple-level cluster analysis with parallel computing can save computing time.
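The normalization and weighted evaluation steps named in the abstract above can be sketched as follows. The abstract does not give the formulas, so this sketch assumes min-max scaling to [0, 1] (inverted for indices where smaller is better) and a simple weighted sum. The candidate orbits, their values, and the equal weights are invented for illustration.

```python
def normalize(values, larger_is_better=True):
    """Min-max scale an index to [0, 1]; invert it when smaller raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)  # all candidates tie on this index
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if larger_is_better else [1.0 - s for s in scaled]

def weighted_scores(index_columns, weights):
    """Weighted sum of normalized indices for each candidate."""
    count = len(index_columns[0])
    return [
        sum(w * column[i] for w, column in zip(weights, index_columns))
        for i in range(count)
    ]

# Three hypothetical candidate orbits.
coverage_time = normalize([1200.0, 1500.0, 900.0])                 # larger is better
coverage_gap = normalize([40.0, 55.0, 30.0], larger_is_better=False)  # smaller is better
scores = weighted_scores([coverage_time, coverage_gap], weights=[0.5, 0.5])
best = scores.index(max(scores))
```

With these made-up numbers the first candidate wins, because it is reasonable on both indices rather than extreme on one.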
To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indices (pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen, and Zn) were measured in three monitoring sections of the Fuping reach of the Shichuan River and analyzed using the improved Nemerow index method, the comprehensive pollution index method, and principal component analysis. The results show that the surface water in the Shichuan River basin is grade III or IV water, that is, slightly to moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
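For orientation, the classic Nemerow index combines the worst and the average single-factor pollution ratios. The sketch below shows that classic form only: the abstract's "improved" variant is not specified and is not reproduced here. Treating each index as a plain concentration-to-standard ratio is a simplification (pH and dissolved oxygen are normally handled differently), and the input values are hypothetical.

```python
import math

def single_factor_indices(concentrations, standards):
    """Ratio of each measured concentration to its water quality standard."""
    return [c / s for c, s in zip(concentrations, standards)]

def nemerow_index(ratios):
    """Classic Nemerow index: sqrt((max^2 + mean^2) / 2) of the single-factor ratios."""
    worst = max(ratios)
    average = sum(ratios) / len(ratios)
    return math.sqrt((worst ** 2 + average ** 2) / 2)

# Hypothetical single-factor ratios for three indicators.
index = nemerow_index([0.5, 1.0, 2.0])
```

Because the maximum enters directly, one strongly exceeded indicator raises the index even when the others are within their standards.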
A generalized, structural time series modeling framework was developed to analyze monthly records of absolute surface temperature, one of the most important environmental parameters, using a deterministic-stochastic combined (DSC) approach. Although the framework was developed by characterizing the variation patterns of a global dataset, the methodology could be applied to any monthly absolute temperature record. Deterministic processes were used to characterize the global trend and the cyclic oscillations of the temperature signal, using polynomial functions and the Fourier method, respectively, while stochastic processes, in the form of seasonal autoregressive integrated moving average (SARIMA) models, were employed to account for any remaining patterns in the signal. A prediction of the monthly global surface temperature during the second decade of the 21st century using the DSC model shows that the global temperature will likely continue to rise at twice the average rate of the past 150 years. The evaluation of prediction accuracy shows that DSC models perform systematically well against selected models by other authors, suggesting that DSC models, when coupled with other eco-environmental models, can be used as a supplemental tool for short-term (10-year) environmental planning and decision making.
This study aimed to explore the application of surface-enhanced Raman scattering (SERS) in the rapid diagnosis of gastric cancer. The SERS spectra of 68 serum samples from gastric cancer patients and healthy volunteers were acquired. The characteristic ratio method (CRM) and principal component analysis (PCA) were used to differentiate gastric cancer serum from normal serum. Compared with healthy volunteers, the serum SERS intensity of gastric cancer patients was relatively high at 722 cm^-1 and relatively low at 588, 644, 861, 1008, 1235, 1397, 1445, and 1586 cm^-1. These results indicated that the relative content of nucleic acids in the serum of gastric cancer patients rises, while the relative content of amino acids and carbohydrates decreases. With PCA, the sensitivity and specificity for discriminating gastric cancer were 94.1% and 94.1%, respectively, with an accuracy of 94.1%. Based on the intensity ratios of four characteristic peaks at 722, 861, 1008, and 1397 cm^-1, CRM gave a diagnostic sensitivity and specificity of 100% and 97.4%, respectively, and an accuracy of 98.5%. Therefore, the three peak intensity ratios I_722/I_861, I_722/I_1008, and I_722/I_1397 can be considered biological fingerprint information for gastric cancer diagnosis and can rapidly and directly reflect the physiological and pathological changes associated with gastric cancer development. This study provides an important basis and standards for the early diagnosis of gastric cancer.
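The reported sensitivity, specificity, and accuracy follow the usual confusion-matrix definitions, and a characteristic ratio is simply one peak intensity divided by another. The sketch below illustrates both. The counts and intensities are invented for illustration, since the abstract does not report the class split of the 68 samples or any raw intensities, and the function names are hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # diseased samples correctly identified
    specificity = tn / (tn + fp)   # healthy samples correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

def peak_ratio(intensities, numerator=722, denominator=861):
    """Characteristic ratio I_numerator / I_denominator from a {wavenumber: intensity} map."""
    return intensities[numerator] / intensities[denominator]

# Hypothetical counts and intensities, not taken from the study.
sensitivity, specificity, accuracy = diagnostic_metrics(tp=32, fn=2, tn=32, fp=2)
ratio = peak_ratio({722: 3.0, 861: 1.5})
```

A classifier based on such ratios would then compare each ratio against a cutoff chosen on training data.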
In this paper, sediments from 25 sampling points in the Tonglüshan mining area, Daye City, Hubei Province, China, were tested for heavy metal content to explore the pollution characteristics, pollution sources, and ecological risks of heavy metals in the sediments. The geo-accumulation index method was used to evaluate the degree of heavy metal pollution in the sediment. The mean sediment quality guideline quotient was used to evaluate the ecological risk level of heavy metals in the sediment. Correlation analysis, cluster analysis, and principal component analysis were used for a preliminary analysis of the sources of heavy metals in the sediment. The results indicated extreme heavy metal pollution in the sediment: Cd was extremely polluted, Cu strongly contaminated, Zn, As, and Hg moderately contaminated, and Pb, Cr, and Ni slightly contaminated. The mean sediment quality guideline quotient also indicated a high ecological risk from heavy metals in the sediment, and 64% of the sample sites had extremely high potential biotoxic effects. In terms of distribution, contamination of the branches was worse than that of the main channel of Daye Dagang, and the deposition of each heavy metal was mainly influenced by the distance from the sample site to the sewage outlet of a tailings pond. The source analysis showed that the heavy metals in the sediment come from the discharges of mining and beneficiation companies, tailings ponds, smelting companies, and transport vehicles. In the study area, owing to heavy metal discharge from these sources, the ecotoxicity of heavy metals in the sediment was extremely high, and Cd was the most toxic pollutant. The research identified the key areas and elements for ecological restoration in the sediment of the Tonglüshan mining area, which can serve as a reference for monitoring and managing heavy metal pollution in the sediments of polymetallic mining areas.
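The geo-accumulation index used in the abstract above is commonly computed as Igeo = log2(C / (1.5 × B)), where C is the measured concentration and B the geochemical background value. Results are usually binned into seven classes, from 0 (unpolluted) to 6 (extremely polluted). The sketch below assumes that standard form. The abstract does not state the background values or class boundaries the authors used, so the inputs here are hypothetical.

```python
import math

def geo_accumulation_index(concentration, background):
    """Igeo = log2(C / (1.5 * B)); the factor 1.5 allows for natural background variation."""
    return math.log2(concentration / (1.5 * background))

def igeo_class(igeo):
    """Class 0 for Igeo <= 0, then one class per unit of Igeo, capped at class 6 (Igeo > 5)."""
    if igeo <= 0:
        return 0
    return min(6, math.ceil(igeo))

# Hypothetical concentration and background in the same units (e.g. mg/kg).
igeo = geo_accumulation_index(3.0, 1.0)
pollution_class = igeo_class(igeo)
```

A concentration equal to 1.5 times the background gives Igeo = 0, the boundary between the unpolluted class and the first contaminated class.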
Current fault prognosis techniques sometimes perform unsatisfactorily, especially for incipient faults in nonlinear processes, owing to the large time delays and complex internal connections of such processes. To overcome this deficiency, multivariate time delay analysis is incorporated into the highly sensitive local kernel principal component analysis. In this approach, mutual information estimation and the Bayesian information criterion (BIC) are used to obtain the degree of correlation and the time delay of the process variables, respectively. Moreover, to achieve prediction, time series prediction by a back propagation (BP) network is applied, whose input is the multivariate correlated time series rather than the original time series. The multivariate time-delayed series and the future values obtained by time series prediction are then combined to construct the input of the local kernel principal component analysis (LKPCA) model for incipient fault prognosis. The new method is demonstrated on a simple nonlinear process and on the complicated Tennessee Eastman (TE) benchmark process. The results indicate that the new method is superior in fault prognosis sensitivity to other traditional fault prognosis methods.
Background: Breast cancer is the most common female cancer in Pakistan. The incidence of breast cancer in Pakistan is about 2.5 times higher than in the neighboring countries India and Iran. In Karachi, the most populous city of Pakistan, the age-standardized rate of breast cancer was 69.1 per 100,000 women during 1998-2002, which is the highest recorded rate in Asia. Carcinoma of the breast is an enormous public health concern in Pakistan. In this study, we examined recent trends in breast cancer incidence rates among women in Karachi. Methods: We obtained secondary data on breast cancer incidence from several hospitals, including Jinnah Hospital, KIRAN (Karachi Institute of Radiotherapy and Nuclear Medicine), and Civil Hospital, where data were available for the years 2004-2011. A total of 5331 new cases of female breast cancer were registered during this period. We analyzed the data in 5-year age groups: 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, and 75+. Nonparametric smoothing was used to obtain age-specific incidence curves, and the curves were then decomposed using principal components analysis to fit a functional time series (FTS) model. We then used exponential smoothing state space models to forecast the incidence curve and construct prediction intervals. Results: The breast cancer incidence rates in Karachi increased with age in all available years. The rates increased monotonically and relatively sharply from age 15 to age 50 and showed variability after the age of 50. Ten-year forecasts of the female breast cancer incidence rates in Karachi show that future rates are expected to remain stable for the age groups 15-50 years but to increase for women aged 50 years and over. Hence, the number of newly diagnosed breast cancer cases among older women in Karachi is expected to increase. Conclusion: Prediction of age-related changes in breast cancer incidence rates will provide useful information for controlling the overall burden of cancer in Pakistan and will also serve as a resource for health planning in future research. Moreover, these models will be useful for modeling and projecting future trends of other cancers and chronic diseases.
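The forecasting step can be illustrated with simple exponential smoothing, the most basic member of the exponential smoothing family. This is a deliberately reduced sketch: the study fits exponential smoothing state space models to the principal component scores of the smoothed curves and derives prediction intervals, none of which is reproduced here. The series values, the smoothing parameter, and the function name are hypothetical.

```python
def ses_forecast(series, alpha=0.3, horizon=10):
    """Simple exponential smoothing: update a level, then forecast it flat for `horizon` steps."""
    if not series:
        raise ValueError("series must contain at least one observation")
    level = series[0]
    for observation in series[1:]:
        # New level is a weighted blend of the latest observation and the previous level.
        level = alpha * observation + (1 - alpha) * level
    return [level] * horizon

# Hypothetical yearly scores of one principal component.
scores = [0.0, 10.0]
forecast = ses_forecast(scores, alpha=0.5, horizon=3)
```

In a functional time series setting, each forecast score would be multiplied by its principal component curve and added to the mean curve to rebuild a forecast incidence curve.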
How to fit a suitably nonlinear classification model that maps conventional well logs to lithofacies is a key problem for machine learning methods. Kernel methods (e.g., KFD, SVM, MSVM) are effective attempts to solve this problem because they can handle nonlinear features through kernel functions. However, the deep mining of log features that indicate lithofacies still needs to be improved for kernel methods. Hence, this work employs deep neural networks to enhance the kernel principal component analysis (KPCA) method and proposes a deep kernel method (DKM) for lithofacies identification using well logs. DKM includes a feature extractor and a classifier. The feature extractor consists of a series of KPCA models arranged according to a residual network structure. A gradient-free optimization method is introduced to optimize the parameters and structure of DKM automatically, which avoids complex parameter tuning. To validate the proposed DKM for lithofacies identification, an open-source dataset with seven conventional logs (GR, CAL, AC, DEN, CNL, LLD, and LLS) and lithofacies labels from the Daniudi Gas Field in China is used. There are eight lithofacies, namely clastic rocks (pebbly, coarse, medium, and fine sandstone, siltstone, mudstone), coal, and carbonate rocks. The comparisons between DKM and three commonly used kernel methods (KFD, SVM, MSVM) show that (1) DKM (85.7%) outperforms SVM (77%), KFD (79.5%), and MSVM (82.8%) in accuracy of lithofacies identification; (2) DKM is about twice as fast as the multi-kernel method (MSVM) while maintaining good accuracy. The blind well test in Well D13 indicates that, compared with the other three methods, DKM improves accuracy by about 24%, precision by 35%, recall by 41%, and F1 score by 40%. In general, DKM is an effective method for complex lithofacies identification. This work also discusses the optimal structure and classifier for DKM. Experimental results show that (m_1, m_2, O) is the optimal model structure and linear SVM is the optimal classifier, where (m_1, m_2, O) means there are m_1 KPCAs followed by m_2 residual units. A workflow for determining an optimal classifier in DKM for lithofacies identification is also proposed.
文摘Financial time series forecasting could be beneficial for individual as well as institutional investors. But, the high noise and complexity residing in the financial data make this job extremely challenging. Over the years, many researchers have used support vector regression (SVR) quite successfully to conquer this challenge. In this paper, an SVR based forecasting model is proposed which first uses the principal component analysis (PCA) to extract the low-dimensional and efficient feature information, and then uses the independent component analysis (ICA) to preprocess the extracted features to nullify the influence of noise in the features. Experiments were carried out based on 16 years’ historical data of three prominent stocks from three different sectors listed in Dhaka Stock Exchange (DSE), Bangladesh. The predictions were made for 1 to 4 days in advance targeting the short term prediction. For comparison, the integration of PCA with SVR (PCA-SVR), ICA with SVR (ICA-SVR) and single SVR approaches were applied to evaluate the prediction accuracy of the proposed approach. Experimental results show that the proposed model (PCA-ICA-SVR) outperforms the PCA-SVR, ICA-SVR and single SVR methods.
文摘A method for identification of pulsations in time series of magnetic field data which are simultaneously present in multiple channels of data at one or more sensor locations is described. Candidate pulsations of interest are first identified in geomagnetic time series by inspection. Time series of these "training events" are represented in matrix form and transpose-multiplied to generate time- domain covariance matrices. The ranked eigenvectors of this matrix are stored as a feature of the pulsation. In the second stage of the algorithm, a sliding window (approxi- mately the width of the training event) is moved across the vector-valued time-series comprising the channels on which the training event was observed. At each window position, the data covariance matrix and associated eigen- vectors are calculated. We compare the orientation of the dominant eigenvectors of the training data to those from the windowed data and flag windows where the dominant eigenvectors directions are similar. This was successful in automatically identifying pulses which share polarization and appear to be from the same source process. We apply the method to a case study of continuously sampled (50 Hz) data from six observatories, each equipped with three- component induction coil magnetometers. We examine a 90-day interval of data associated with a cluster of four observatories located within 50 km of Napa, California, together with two remote reference stations-one 100 km to the north of the cluster and the other 350 km south. When the training data contains signals present in the remote reference observatories, we are reliably able to identify and extract global geomagnetic signals such as solar-generated noise. When training data contains pulsations only observed in the cluster of local observatories, we identify several types of non-plane wave signals having similar polarization.
Abstract: This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually viewed as a low-rank matrix completion problem solved via matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches are based on SVD operations. Given an m × n matrix, the time complexity of an SVD operation is O(mn^2), which makes the computation prohibitive for large-scale problems. In this paper, an efficient and scalable algorithm for tensor principal component analysis is proposed, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically. The results show that LADMVTPCA outperforms the matrix factorization based method.
Abstract: Experimental and theoretical studies of the mechanisms of vibration stimulation of oil recovery in watered fields lead to the conclusion that resonance oscillations develop in fractured-block formations. These oscillations, caused by weak but long-lasting and frequency-stable influences, create the conditions for the generation of ultrasonic waves in the layers, which are capable of destroying thickened oil membranes in reservoir cracks. For fractured-porous reservoirs exploited by high-pressure water displacement of oil, the possibility of intensifying ultrasonic vibrations can have important technological significance. Even very weak ultrasound can, over a long period of time, destroy the viscous oil membranes that form in the cracks between the blocks and can lower the permeability of the layers; removing them increases oil recovery. To describe these effects, it is necessary to consider the wave process in a hierarchically blocky environment and to simulate theoretically the mechanism by which self-oscillations appear under the action of relaxation shear stresses. For analyzing the seismoacoustic response over time on fixed intervals along the borehole, an algorithm based on phase diagrams of the state of a multiphase medium is proposed.
Funding: Supported by the Project of the Natural Science Foundation of Liaoning Province (2020-BS-258), the Scientific Research Fund Project of the Educational Department of Liaoning Province (LJ2020JCL010), the discipline innovation team of Liaoning Technical University (LNTU20TD-14), and the Key Research and Development Project of Heilongjiang Province (GA21A204).
Abstract: The Heilongjiang Jianbiannongchang area is located at the confluence of the Great and Lesser Xing’an Ranges. This area has a complex magmatic and tectonic evolutionary history that has resulted in a complex and diverse geological background for mineralization. In this study, isometric logarithmic ratio (ILR) transformations of Au, Cu, Pb, Zn, and Sb contents were performed on the 1:50,000 soil geochemical data of the Jianbiannongchang area. Robust principal component analysis (RPCA) was conducted on the ILR-transformed data. The local singularity and spectrum-area (S-A) methods were used to extract information on mineralization anomalies. The results showed that: (1) The transformation eliminated the closure effect of the original data, and the PC1 and PC2 information obtained by applying RPCA reflected ore-producing element anomalies dominated by Au and Cu. (2) The local singularity method can enhance the information of local strong anomalies and of weak, gentle anomalies. After local singularity analysis of PC1 and PC2, the resulting local anomalies reflected the spatial anomaly patterns related to Cu and Au mineralization in this area, which makes it an effective method for delineating ore-producing anomalies. (3) The composite anomalies of PC1 and PC2 were decomposed using the S-A method, and the screened anomalous and background fields reflect the ore-producing anomalies related to Cu and Au mineralization. This information agrees with known Cu and Au mineralization. (4) Geochemical anomalies with mineralization potential were obtained outside the known mineralization sites by integrating the ore-producing anomaly information extracted by the local singularity and S-A methods, providing a theoretical basis and direction for future exploration in the study area.
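The ILR step mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which ILR basis was used, so the code below assumes one common choice, z_i = sqrt(i/(i+1)) · ln(g(x_1..x_i) / x_(i+1)), where g is the geometric mean. The sample values in the usage note are hypothetical.

```python
import math

def closure(parts):
    """Rescale a composition so that its parts sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centred log-ratio: log of each part minus the mean log."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def ilr(parts):
    """ILR coordinates for a D-part composition (returns D-1 values).
    Assumed basis: z_i = sqrt(i/(i+1)) * ln(gmean(x_1..x_i) / x_{i+1})."""
    logs = [math.log(p) for p in parts]
    coords = []
    running = 0.0  # sum of the first i log-parts
    for i in range(1, len(parts)):
        running += logs[i - 1]
        coords.append(math.sqrt(i / (i + 1)) * (running / i - logs[i]))
    return coords
```

For example, `ilr([0.02, 35.0, 20.0, 80.0, 1.5])` maps a hypothetical five-element sample (Au, Cu, Pb, Zn, Sb) to four unconstrained coordinates. Because only ratios enter the formula, rescaling the sample leaves the result unchanged, which is how the transformation removes the closure effect.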
Funding: Supported by the General Research Fund from the Research Grant Council of the Hong Kong SAR, China (Grant Nos. CityU 11201020 and CityU 11207321), the National Science Foundation of China (Grant No. 42207185), the Contract Research Project from the Geotechnical Engineering Office of the Civil Engineering Development Department of Hong Kong SAR, China (Project Ref. No. CEDD STD-30-2030-1-12R), and the BL13W beamline of Shanghai Synchrotron Radiation Facility (SSRF).
Abstract: The discrete element method (DEM) has been widely used to model the mechanical behaviour of granular materials. However, with simplified particle morphology or rheology-based rolling resistance models, DEM fails to describe some responses, such as the particle kinematics at the grain scale and the principal stress ratio against axial strain at the macro scale. This paper adopts a computed tomography (CT)-based DEM technique, comprising particle morphology data acquisition from micro-CT (μCT), particle morphology reconstruction by spherical harmonic-based principal component analysis (SH-PCA), and DEM simulations, to investigate the capability of DEM with realistic particle morphology for modelling the micro-macro mechanical responses of granular soils, with consideration of the initial packing state, the morphological gene mutation degree, and the confining stress condition. It is found that DEM with realistic particle morphology can reasonably reproduce the micro-macro mechanical behaviour of granular materials, including the deviatoric stress-volumetric strain-axial strain response, critical state behaviour, particle kinematics, and shear band evolution. Meanwhile, the role of multiscale particle morphology in granular soils depends on the initial packing state and the confining stress condition. For the same granular soil, rougher particle surfaces combined with a denser initial packing state and a higher confining stress result in a higher degree of shear strain localisation.
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (LY19F030003), the Key Research and Development Project of Zhejiang Province (2021C04030), the National Natural Science Foundation of China (62003306), and the Educational Commission Research Program of Zhejiang Province (Y202044842).
Abstract: In practical process industries, a variety of online and offline sensors and measuring instruments are used for process control and monitoring, which means that measurements from different sources are collected at different sampling rates. To build a complete process monitoring strategy, all of these multi-rate measurements should be considered in data-based modeling and monitoring. In this paper, a novel kernel multi-rate probabilistic principal component analysis (K-MPPCA) model is proposed to extract the nonlinear correlations among variables with different sampling rates. In the proposed model, the parameters are calibrated using the kernel trick and the expectation-maximization (EM) algorithm. The corresponding fault detection methods based on the nonlinear features are also developed. Finally, a simulated nonlinear case and an actual pre-decarburization unit in the ammonia synthesis process are tested to demonstrate the efficiency of the proposed method.
Funding: Supported by the Jiangsu Social Science Foundation (No. 20GLD008), the Science and Technology Projects of Jiangsu Provincial Department of Communications (No. 2020Y14), and the Joint Fund for Civil Aviation Research (No. U1933202).
Abstract: To construct the mapping between multiple state parameters and remaining useful life (RUL) directly, and to reduce the interference of random error on prediction accuracy, this paper proposes an aeroengine RUL prediction model based on principal component analysis (PCA) and a one-dimensional convolutional neural network (1D-CNN). First, multiple state parameters covering a large number of aeroengine cycles are collected and passed to PCA for dimensionality reduction, and the principal components are extracted for subsequent time series prediction. Second, the 1D-CNN model is constructed to learn the mapping between principal components and RUL directly. Multiple convolution and pooling operations are applied for deep feature extraction, so that end-to-end RUL prediction of the aeroengine can be realized. Experimental results show that PCA can obtain the most effective principal components from the multiple state parameters, and that the 1D-CNN can map the long time series of multiple state parameters directly to RUL, improving the efficiency and accuracy of RUL prediction. Compared with other traditional models, the proposed method also has lower prediction error and better robustness.
Abstract: Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not belong fully to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
Funding: Supported by the Social Science Foundation of China under Grant No. 17BGL231.
Abstract: With machine learning, suitable algorithms can perform advanced time series analysis. This paper proposes a complex k-nearest neighbor (KNN) model for predicting financial time series. The model uses a complex feature extraction process that integrates a forward rolling empirical mode decomposition (EMD) for financial time series signal analysis and principal component analysis (PCA) for dimension reduction. The information-rich features are extracted and then input to a weighted KNN classifier, in which the features are weighted by their PCA loadings. Finally, the prediction is generated via regression on the selected nearest neighbors. The structure of the model as a whole is original. Test results on real historical data sets confirm the effectiveness of the model for predicting the Chinese stock index, an individual stock, and the EUR/USD exchange rate.
Funding: Funded by the 973 Program of the Ministry of National Defense of China (Grant No. 613237).
Abstract: This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, in which orbit performance indices with different units, such as total coverage time, coverage frequency, average time per coverage, and maximum coverage gap, must be optimized simultaneously. An index normalization method is introduced to convert the performance indices into dimensionless variables within the range [0, 1]. On this basis, a design optimization method based on principal component analysis and cluster analysis is proposed, which consists of index normalization, principal component analysis, multiple-level cluster analysis, and weighted evaluation. The results of orbit optimization for earth observation satellites show that the optimal orbit can be obtained with the proposed method. Principal component analysis reduces the total number of indices that are not independent of one another, which saves computing time. Similarly, multiple-level cluster analysis with parallel computing saves computing time.
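The index normalization step can be sketched as follows. The abstract states only that indices are converted to dimensionless values in [0, 1], so the min-max rule used here is an assumption. So is the `larger_is_better` flag, which inverts indices such as maximum coverage gap where a smaller value is preferable.

```python
def normalize_index(values, larger_is_better=True):
    """Min-max scale one performance index across candidate orbits to [0, 1].
    Assumed rule, not taken from the paper; `larger_is_better=False` flips
    the scale so that 1.0 always marks the most desirable value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant index carries no ranking information.
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    if larger_is_better:
        return scaled
    return [1.0 - s for s in scaled]
```

With hypothetical coverage times of 10, 20, and 30 minutes, `normalize_index([10, 20, 30])` yields 0.0, 0.5, and 1.0, so indices measured in different units become directly comparable before PCA and clustering.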
Funding: Supported by the National Natural Science Foundation of China (41901012), the Project of Shaanxi Provincial Education Department (21JP040), the Talent Fund Project of Weinan Normal University (2021RC04), and the National Innovation and Entrepreneurship Training Program for College Students (22XK019).
Abstract: To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indexes (pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen, and Zn) were measured at three monitoring sections of the Fuping reach of the Shichuan River and analyzed using the improved Nemerow index method, the comprehensive pollution index method, and principal component analysis. The results show that the surface water in the Shichuan River basin is of grade Ⅲ or Ⅳ, that is, slightly to moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
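The two index methods named in this abstract can be illustrated with a short sketch. It shows the classic Nemerow index rather than the improved variant used in the paper, whose exact form the abstract does not give. Each parameter is first turned into a single-factor index P_i = C_i / S_i (measured concentration over its limit). Parameters such as pH and DO need special handling and are left out. All names are hypothetical.

```python
import math

def single_factor_indices(concentrations, limits):
    """P_i = C_i / S_i for each parameter present in both dictionaries."""
    return {name: concentrations[name] / limits[name] for name in concentrations}

def comprehensive_pollution_index(indices):
    """Arithmetic mean of the single-factor indices."""
    values = list(indices.values())
    return sum(values) / len(values)

def nemerow_index(indices):
    """Classic Nemerow index: sqrt((P_max^2 + P_mean^2) / 2).
    It weights the worst parameter more heavily than a plain mean does."""
    values = list(indices.values())
    p_max = max(values)
    p_mean = sum(values) / len(values)
    return math.sqrt((p_max ** 2 + p_mean ** 2) / 2)
```

For two placeholder parameters with indices 2.0 and 1.0, the comprehensive index is 1.5 while the Nemerow index is about 1.77, showing how the latter emphasizes the most polluted parameter.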
Funding: This research was supported by the Ministry of Science and Technology of China, National Basic Research Program of China (Grant No. 2010CB951504). The authors acknowledge support from the Flemish Interuniversity Council and the Ghent University Laboratory of Soil Science for the writing of this paper.
Abstract: A generalized, structural time series modeling framework was developed to analyze monthly records of absolute surface temperature, one of the most important environmental parameters, using a deterministic-stochastic combined (DSC) approach. Although the framework was developed by characterizing the variation patterns of a global dataset, the methodology can be applied to any monthly absolute temperature record. Deterministic processes were used to characterize the global trend and the cyclic oscillations of the temperature signal, using polynomial functions and the Fourier method, respectively, while stochastic processes, in the form of seasonal autoregressive integrated moving average (SARIMA) models, were used to account for any remaining patterns in the temperature signal. A prediction of the monthly global surface temperature during the second decade of the 21st century using the DSC model shows that the global temperature will likely continue to rise at twice the average rate of the past 150 years. The evaluation of prediction accuracy shows that DSC models perform systematically well against selected models of other authors, suggesting that DSC models, when coupled with other eco-environmental models, can be used as a supplemental tool for short-term (10-year) environmental planning and decision making.
Funding: This work was supported by the Natural Science Foundation of Guangdong Province, China (2018 A0303131000), the project of the Academician Workstation of Guangdong Province, China (2014B090905001), the Fundamental Research Funds for the Central Universities, China (21617406), and the key project of the Scientific and Technological Projects of Guangzhou, China (201604040007, 201604020168).
Abstract: This study aimed to explore the application of surface-enhanced Raman scattering (SERS) to the rapid diagnosis of gastric cancer. The SERS spectra of 68 serum samples from gastric cancer patients and healthy volunteers were acquired. The characteristic ratio method (CRM) and principal component analysis (PCA) were used to differentiate gastric cancer serum from normal serum. Compared with healthy volunteers, the serum SERS intensity of gastric cancer patients was relatively high at 722 cm^(-1) and relatively low at 588, 644, 861, 1008, 1235, 1397, 1445 and 1586 cm^(-1). These results indicate that the relative content of nucleic acids in the serum of gastric cancer patients rises while the relative content of amino acids and carbohydrates decreases. With PCA, the sensitivity and specificity of discriminating gastric cancer were 94.1% and 94.1%, respectively, with an accuracy of 94.1%. Based on the intensity ratios of four characteristic peaks at 722, 861, 1008 and 1397 cm^(-1), CRM gave a diagnostic sensitivity and specificity of 100% and 97.4%, respectively, and an accuracy of 98.5%. Therefore, the three peak intensity ratios I_(722)/I_(861), I_(722)/I_(1008) and I_(722)/I_(1397) can be considered biological fingerprint information for gastric cancer diagnosis and can rapidly and directly reflect the physiological and pathological changes associated with gastric cancer development. This study provides an important basis and standards for the early diagnosis of gastric cancer.
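The sensitivity, specificity, and accuracy figures quoted in this abstract follow from confusion-matrix counts, as the sketch below shows. The abstract reports only the total of 68 samples, so the counts in the usage note are hypothetical and chosen purely to illustrate the arithmetic.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.
    tp/fn: patients classified correctly / missed;
    tn/fp: healthy volunteers classified correctly / wrongly flagged."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy
```

Assuming, for illustration only, 34 patients and 34 healthy volunteers with 32 of each classified correctly, `diagnostic_metrics(32, 2, 32, 2)` gives 32/34 for both sensitivity and specificity and 64/68 for accuracy, each of which rounds to 94.1%.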
Funding: Jointly supported by the Gansu Provincial Natural Resources Science and Technology Project of the Key Laboratory of Strategic Mineral Resources of the Upper Yellow River, Ministry of Natural Resources (YSJD2022-16), and the survey project initiated by the China Geological Survey (DD20211347).
Abstract: In this paper, overlying sediments at 25 sampling points in the Tonglüshan mining area, Daye City, Hubei Province, China were tested for heavy metal content to explore the pollution characteristics, pollution sources and ecological risks of heavy metals in the sediments. The geo-accumulation index method was used to evaluate the degree of heavy metal pollution in the sediment. The mean sediment quality guideline quotient was used to evaluate the ecological risk level of heavy metals in the sediment. Correlation analysis, cluster analysis, and principal component analysis were used for a preliminary analysis of the sources of heavy metals in the sediment. The results indicated extremely severe heavy metal pollution in the sediment: Cd was extremely polluted, Cu strongly contaminated, Zn, As, and Hg moderately contaminated, and Pb, Cr, and Ni slightly contaminated. The mean sediment quality guideline quotient also indicated a high ecological risk from heavy metals in the sediment, and 64% of the sample sites had an extremely high potential for biotoxic effects. In terms of distribution, the contamination of the branches was worse than that of the main channel of Daye Dagang, and the deposition of each heavy metal was mainly influenced by the distance from the sample site to the sewage outlet of a tailings pond. The source analysis showed that the heavy metals in the sediment come from the discharges of mining and beneficiation companies, tailings ponds, smelting companies, and transport vehicles. In the study area, owing to heavy metal discharges from these sources, the ecotoxicity of heavy metals in the sediment was extremely high, and Cd was the most toxic pollutant. The research identified the key areas and elements for ecological restoration in the sediment of the Tonglüshan mining area, which can serve as a reference for monitoring and controlling heavy metal pollution in the sediments of polymetallic mining areas.
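The geo-accumulation index used in this abstract is commonly defined as I_geo = log2(C_n / (1.5 · B_n)), where C_n is the measured concentration and B_n the geochemical background value. The sketch below assumes that standard definition and the usual seven-class scale (0 for unpolluted up to 6 for extremely polluted). The abstract gives neither the background values nor the class boundaries the authors used, so the function names and the boundary handling are assumptions.

```python
import math

def geoaccumulation_index(concentration, background):
    """I_geo = log2(C_n / (1.5 * B_n)); the factor 1.5 allows for natural
    variation in the background value."""
    return math.log2(concentration / (1.5 * background))

def igeo_class(igeo):
    """Map I_geo to the usual seven classes: 0 when I_geo <= 0, then one
    class per unit of I_geo, capped at 6 for values above 5."""
    if igeo <= 0:
        return 0
    return min(6, math.ceil(igeo))
```

For instance, a hypothetical concentration six times the background gives I_geo = log2(4) = 2, which falls in class 2 (moderately polluted) on this scale.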
Funding: Supported by the National Natural Science Foundation of China (61573051, 61472021), the Natural Science Foundation of Beijing (4142039), the Open Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2015KF-01), and the Fundamental Research Funds for the Central Universities (PT1613-05).
Abstract: Current fault prognosis techniques sometimes perform unsatisfactorily, especially for incipient faults in nonlinear processes, owing to the large time delays and complex internal connections of such processes. To overcome this deficiency, multivariate time delay analysis is incorporated into the highly sensitive local kernel principal component analysis. In this approach, mutual information estimation and the Bayesian information criterion (BIC) are used, respectively, to obtain the degree of correlation and the time delay of the process variables. To achieve prediction, time series prediction by a back propagation (BP) network is applied, whose input is the multivariate correlated time series rather than the original time series. The multivariate time-delayed series and the future values obtained by time series prediction are then combined to construct the input of the local kernel principal component analysis (LKPCA) model for incipient fault prognosis. The new method has been demonstrated on a simple nonlinear process and on the complicated Tennessee Eastman (TE) benchmark process. The results indicate that the new method is superior in fault prognosis sensitivity to other traditional fault prognosis methods.
Abstract: Background: Breast cancer is the most common female cancer in Pakistan. The incidence of breast cancer in Pakistan is about 2.5 times higher than that in the neighboring countries India and Iran. In Karachi, the most populated city of Pakistan, the age-standardized rate of breast cancer was 69.1 per 100,000 women during 1998-2002, which is the highest recorded rate in Asia. Breast carcinoma in Pakistan is an enormous public health concern. In this study, we examined the recent trends in breast cancer incidence rates among women in Karachi. Methods: We obtained secondary data on breast cancer incidence from various hospitals, including Jinnah Hospital, KIRAN (Karachi Institute of Radiotherapy and Nuclear Medicine), and Civil Hospital, where data were available for the years 2004-2011. A total of 5331 new cases of female breast cancer were registered during this period. We analyzed the data in 5-year age groups: 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75+. Nonparametric smoothing was used to obtain age-specific incidence curves, and the curves were then decomposed using principal components analysis to fit a functional time series (FTS) model. We then used exponential smoothing state space models to forecast the incidence curve and construct prediction intervals. Results: Breast cancer incidence rates in Karachi increased with age in all available years. The rates increased monotonically and relatively sharply with age from 15 to 50 years and then showed variability after the age of 50 years. Ten-year forecasts of female breast cancer incidence rates in Karachi show that future rates are expected to remain stable for the age groups 15-50 years but to increase for women aged 50 years and over. Hence, newly diagnosed breast cancer cases among older women in Karachi are expected to increase in the future.
Conclusion: Prediction of age-related changes in breast cancer incidence rates will provide useful information for controlling the overall burden of cancer in Pakistan and will also serve as a resource for health planning in future research. Moreover, these models will be useful for modeling and projecting future trends of other cancers and chronic diseases.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 42002134), the China Postdoctoral Science Foundation (Grant No. 2021T140735), and the Science Foundation of China University of Petroleum, Beijing (Grant Nos. 2462020XKJS02 and 2462020YXZZ004).
Abstract: How to fit a suitable nonlinear classification model that maps conventional well logs to lithofacies is a key problem for machine learning methods. Kernel methods (e.g., KFD, SVM, MSVM) are effective attempts to solve this problem because kernel functions can handle nonlinear features. However, the deep mining of log features that indicate lithofacies still needs to be improved for kernel methods. Hence, this work employs deep neural networks to enhance the kernel principal component analysis (KPCA) method and proposes a deep kernel method (DKM) for lithofacies identification using well logs. DKM includes a feature extractor and a classifier. The feature extractor consists of a series of KPCA models arranged according to a residual network structure. A gradient-free optimization method is introduced to optimize the parameters and structure of DKM automatically, which avoids complex parameter tuning. To validate the proposed DKM for lithofacies identification, an open-source dataset with seven conventional logs (GR, CAL, AC, DEN, CNL, LLD, and LLS) and lithofacies labels from the Daniudi Gas Field in China is used. There are eight lithofacies, namely clastic rocks (pebbly, coarse, medium, and fine sandstone, siltstone, mudstone), coal, and carbonate rocks. Comparisons between DKM and three commonly used kernel methods (KFD, SVM, MSVM) show that (1) DKM (85.7%) outperforms SVM (77%), KFD (79.5%), and MSVM (82.8%) in accuracy of lithofacies identification; (2) DKM is about twice as fast as the multi-kernel method (MSVM) while maintaining good accuracy. The blind well test in Well D13 indicates that, compared with the other three methods, DKM improves accuracy by about 24%, precision by 35%, recall by 41%, and F1 score by 40%. In general, DKM is an effective method for complex lithofacies identification. This work also discusses the optimal structure and classifier for DKM. Experimental results show that (m_(1), m_(2), O) is the optimal model structure and linear SVM is the optimal classifier. Here (m_(1), m_(2), O) means there are m_(1) KPCAs followed by m_(2) residual units. A workflow for determining an optimal classifier in DKM for lithofacies identification is also proposed.