We investigated the parametric optimization of incremental sheet forming of stainless steel using Grey Relational Analysis (GRA) coupled with Principal Component Analysis (PCA). AISI 316L stainless steel sheets were formed into a double-wall-angle pyramid with a tungsten carbide tool. GRA coupled with PCA was used to plan the experimental conditions. The effects of the control factors Tool Diameter (TD), Step Depth (SD), Bottom Wall Angle (BWA), Feed Rate (FR) and Spindle Speed (SS) on the Top Wall Angle (TWA) and the Top Wall Angle Surface Roughness (TWASR) were studied. The wall angle increases with increasing tool diameter because of the larger contact area between tool and workpiece. As the step depth, feed rate and spindle speed increase, TWASR decreases with increasing tool diameter. As the step depth increases, the hydrostatic stress rises, causing severe cracks in the deformed surface. It was concluded that the proposed hybrid method is suitable for optimizing the factors and responses.
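The grey relational step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the textbook grey relational grade, not the authors' procedure: the response values, the choice of which response is "larger is better", the equal weights, and the distinguishing coefficient `zeta = 0.5` are all assumptions. In PCA-coupled variants the weights are usually derived from principal component loadings instead of being set by hand.

```python
def normalize(column, larger_is_better=True):
    """Scale one response column to [0, 1] (grey relational generation)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [1.0 for _ in column]
    if larger_is_better:
        return [(x - lo) / (hi - lo) for x in column]
    return [(hi - x) / (hi - lo) for x in column]


def grey_relational_grades(responses, larger_is_better, weights, zeta=0.5):
    """Return one weighted grey relational grade per experimental run.

    responses: list of columns, one column per response variable.
    """
    grades = [0.0] * len(responses[0])
    for column, better, w in zip(responses, larger_is_better, weights):
        # Deviation of each run from the ideal normalized value of 1.
        deltas = [1.0 - v for v in normalize(column, better)]
        d_min, d_max = min(deltas), max(deltas)
        for i, d in enumerate(deltas):
            # A constant column carries no ranking information.
            coeff = 1.0 if d_max == 0 else (d_min + zeta * d_max) / (d + zeta * d_max)
            grades[i] += w * coeff
    return grades


# Hypothetical data for three runs (not taken from the paper).
twa = [60.0, 65.0, 70.0]        # wall angle, treated here as larger-is-better
roughness = [2.0, 1.0, 3.0]     # surface roughness, smaller-is-better
grades = grey_relational_grades([twa, roughness], [True, False], [0.5, 0.5])
best_run = max(range(len(grades)), key=lambda i: grades[i])
print(grades, best_run)
```

The run with the highest grade is taken as the best compromise across the responses.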
Ore production at open-pit mines is usually affected by multiple inputs. Nevertheless, the complex nonlinear relationships between these inputs and ore production remain unclear. This becomes even more challenging when the training data (e.g., truck haulage information and weather conditions) are massive. Among machine learning (ML) algorithms, the deep neural network (DNN) is a superior method for processing nonlinear and massive data by adjusting the number of neurons and hidden layers. This study adopted a DNN to forecast ore production using truck haulage information and weather conditions at open-pit mines as training data. Before the prediction models were built, principal component analysis (PCA) was employed to reduce the data dimensionality and eliminate the multicollinearity among highly correlated input variables. To verify the superiority of the DNN, three ANNs containing only one hidden layer and six traditional ML models were established as benchmark models. The DNN model with multiple hidden layers performed better than the ANN models with a single hidden layer, and it outperformed the extensively applied benchmark models in predicting ore production. This can provide engineers and researchers with an accurate method to forecast ore production, which supports sound budgetary decisions and mine planning at open-pit mines.
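How PCA removes multicollinearity before a model is trained can be illustrated with a small sketch. It uses only the Python standard library and finds the leading principal component by power iteration. The two perfectly collinear input columns are made-up data, and the function names are illustrative, not from the study.

```python
import math


def center(rows):
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of the columns."""
    x = center(rows)
    n, d = len(x), len(x[0])
    return [[sum(r[i] * r[j] for r in x) / (n - 1) for j in range(d)] for i in range(d)]


def leading_component(cov, iterations=200):
    """Power iteration: returns (unit eigenvector, eigenvalue) of the largest eigenpair.

    Assumes a nonzero covariance matrix and a start vector that is not
    orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, eigenvalue


# Hypothetical inputs: the second column is exactly twice the first.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov = covariance(rows)
direction, eigenvalue = leading_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
# Scores on the first component replace the two correlated inputs.
scores = [sum(a * b for a, b in zip(r, direction)) for r in center(rows)]
print(direction, explained, scores)
```

Because the two columns carry the same information, a single component explains all of the variance, and the one-dimensional scores can be fed to the downstream model in place of the correlated inputs.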
The safety and integrity requirements of aerospace composite structures necessitate real-time health monitoring throughout their service life. To this end, distributed optical fiber sensors based on Rayleigh backscattering have been extensively deployed in structural health monitoring because of advantages such as light weight and ease of embedding. However, identifying the precise location of damage from the optical fiber signals remains a critical challenge. In this paper, a novel approach, namely Modified Sliding Window Principal Component Analysis (MSWPCA), is proposed to facilitate automatic damage identification and localization via distributed optical fiber sensors. The proposed method can extract signal characteristics obscured by measurement noise, improving the accuracy of damage detection. Specifically, we applied the MSWPCA method to monitor and analyze the debonding propagation process in honeycomb sandwich panel structures. Our findings demonstrate that the trained model exhibits high precision in detecting the location and size of honeycomb debonding, thereby facilitating reliable and efficient online assessment of the structural health state.
Principal Component Analysis (PCA) is a widely used technique for data analysis and dimensionality reduction, but its sensitivity to feature scale and outliers limits its applicability. Robust Principal Component Analysis (RPCA) addresses these limitations by decomposing data into a low-rank matrix capturing the underlying structure and a sparse matrix identifying outliers, enhancing robustness against noise and outliers. This paper introduces a novel RPCA variant, Robust PCA Integrating Sparse and Low-rank Priors (RPCA-SL). Each prior targets a specific aspect of the data's underlying structure, and their combination allows for a more nuanced and accurate separation of the main data components from outliers and noise. RPCA-SL is then solved with a proximal gradient algorithm for improved anomaly detection and data decomposition. Experimental results on simulated and real data demonstrate significant advancements.
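For readers unfamiliar with proximal methods, the sparse part of an RPCA-style decomposition is commonly updated with element-wise soft thresholding, which is the proximal operator of the L1 norm. The sketch below shows only that generic building block. It is not the RPCA-SL algorithm, and the residual matrix and threshold `lam` are made-up values.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * |x|: shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def sparse_step(residual, lam):
    """Apply soft thresholding to every entry of a residual matrix."""
    return [[soft_threshold(v, lam) for v in row] for row in residual]


# Hypothetical residual (data minus current low-rank estimate).
residual = [[0.2, 5.0], [-4.0, 0.1]]
sparse = sparse_step(residual, 1.0)
print(sparse)
```

Small entries, which look like noise, are set to zero, while large entries survive (shrunk by `lam`) and are attributed to the sparse outlier component.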
The composition control of molten steel is one of the main functions of the ladle furnace (LF) refining process. In this study, a feasible model was established to predict the alloying element yield using principal component analysis (PCA) and a deep neural network (DNN). PCA was used to eliminate collinearity and reduce the dimension of the input variables, and the PCA-processed data were then used to establish the DNN model. In the PCA-DNN model, the prediction hit ratios for the Si element yield in the error ranges of ±1%, ±3%, and ±5% are 54.0%, 93.8%, and 98.8%, respectively, whereas those for the Mn element yield in the error ranges of ±1%, ±2%, and ±3% are 77.0%, 96.3%, and 99.5%, respectively. The results demonstrate that the PCA-DNN model performs better than known models such as the reference heat method, multiple linear regression, modified backpropagation, and the DNN model. Accurate prediction of the alloying element yield can greatly contribute to realizing "narrow window" control of the composition of molten steel. The prediction model for the element yield can also provide a reference for the development of an alloying control model in LF intelligent refining in the modern iron and steel industry.
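The hit ratio reported above can be read as the share of heats whose predicted yield falls within a given error band of the measured yield. A minimal sketch under that assumption follows. The yields are invented numbers, and the error is taken as an absolute difference in percentage points, which the abstract does not state explicitly.

```python
def hit_ratio(predicted, actual, tolerance):
    """Fraction of predictions within +/- tolerance of the measured value."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(actual)


# Hypothetical element yields in percent (not data from the study).
predicted = [90.0, 92.5, 88.0, 95.0]
actual = [90.5, 95.0, 88.2, 91.0]
ratios = {tol: hit_ratio(predicted, actual, tol) for tol in (1.0, 3.0, 5.0)}
print(ratios)
```

Widening the tolerance can only keep or raise the hit ratio, which matches the pattern of the reported figures.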
The large blast furnace is essential equipment in iron and steel manufacturing. Because of the complex operating process and frequent fluctuations of variables, conventional monitoring methods often raise false alarms. To address this problem, an ensemble of greedy dynamic principal component analysis-Gaussian mixture model (EGDPCA-GMM) is proposed in this paper. First, PCA-GMM is introduced to deal with the collinearity and the non-Gaussian distribution of blast furnace data. Second, to account for the dynamics of the data, a greedy algorithm is used to determine the extended variables and their corresponding time lags, so as to avoid introducing unnecessary noise. Then a bagging ensemble is combined with the greedy extension to eliminate the randomness introduced by the greedy algorithm and further reduce the false alarm rate (FAR) of the monitoring results. Finally, the algorithm is applied to the blast furnace of a large iron and steel group in South China to verify its performance. Compared with the basic algorithms, the proposed method achieves the lowest FAR while keeping the missed alarm rate (MAR) stable.
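Dynamic PCA handles process dynamics by augmenting each sample with time-lagged copies of selected variables before applying ordinary PCA. The sketch below shows only that augmentation step for a given set of lags. The greedy selection of variables and lags described in the abstract is not implemented, and the variable names and values are hypothetical.

```python
def lagged_matrix(series, lags):
    """Build the time-lagged data matrix used by dynamic PCA.

    series: dict mapping a variable name to its list of samples.
    lags:   dict mapping a variable name to the non-negative lags to include
            (0 means the current sample).
    """
    max_lag = max(lag for lag_list in lags.values() for lag in lag_list)
    n = len(next(iter(series.values())))
    rows = []
    for t in range(max_lag, n):
        row = []
        for name, lag_list in lags.items():
            for lag in lag_list:
                row.append(series[name][t - lag])
        rows.append(row)
    return rows


# Hypothetical measurements: temperature is extended with two lags,
# pressure is used without extension.
series = {"temperature": [1, 2, 3, 4, 5], "pressure": [10, 20, 30, 40, 50]}
lags = {"temperature": [0, 1, 2], "pressure": [0]}
augmented = lagged_matrix(series, lags)
print(augmented)
```

Extending only the variables that need it keeps the augmented matrix small, which is the motivation the abstract gives for choosing lags selectively.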
Total organic carbon (TOC) content is one of the most important parameters for characterizing the quality of source rocks and assessing the hydrocarbon-generating potential of shales. The Lucaogou Formation shale reservoirs in the Jimusaer Sag, Junggar Basin, NW China, are characterized by extremely complex lithology and a wide variety of mineral compositions, with source rocks consisting mainly of carbonaceous mudstone and dolomitic mudstone. The logging responses of organic matter in the shale reservoirs are quite different from those in conventional reservoirs. Analyses show that the traditional ΔlogR method is not suitable for evaluating the TOC content in the study area. Analysis of the sensitivity of well logs to TOC content reveals that the TOC content correlates well with the degree of separation of the porosity logs. After dimension reduction by principal component analysis, the principal components are determined through correlation analysis of the porosity logs. The results show that the TOC values obtained by the new method are in good agreement with those measured by core analysis. The average absolute error of the new method is only 0.555, much lower than the 1.222 of the traditional ΔlogR method. The proposed method can be used to produce more accurate TOC estimates, thus providing a reliable basis for source rock mapping.
Principal component analysis (PCA) was employed to determine the implications of geochemical and isotopic data from Cenozoic volcanic activities in the Southeast Asian region, including China (South China Sea (SCS), Hainan Island, Fujian-Zhejiang coast, Taiwan Island) and parts of Vietnam and Thailand. We analyzed 15 trace element indicators and 5 isotopic indicators for 623 volcanic rock samples collected from the study region. Two principal components (PCs) were extracted by PCA based on the trace elements and Sr-Nd-Pb isotopic ratios, which probably indicate an enriched oceanic island basalt-type mantle plume and a depleted mid-ocean ridge basalt-type spreading ridge. The results show that the influence of the Hainan mantle plume on younger volcanic activities (<13 Ma) is stronger than that on older ones (>13 Ma) at the same location in the Southeast Asian region. PCA was employed to verify the mantle-plume-ridge interaction model of volcanic activities beneath the expansion center of the SCS and to refute the hypothesis that the tension of the SCS is triggered by the Hainan plume. This study reveals the efficiency and applicability of PCA in discussing mantle sources of volcanic activities; thus, PCA is a suitable research method for analyzing geochemical data.
In this research, an integrated classification method based on principal component analysis-simulated annealing genetic algorithm-fuzzy cluster means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data were used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of the reservoir in the different categories were analyzed and compared with the lithological profiles. The analysis of numerical simulations and actual logging data shows that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of the integrated reservoir classification match the lithologic profile well, which demonstrates the reliability of the classification method.
This work uses the statistical approach of Principal Component Analysis (PCA) for the detection of methane (CH₄)-carbon monoxide (CO) poisoning occurring in coal mines, forest fires, drainage systems, etc., where CH₄ and CO emissions are very high in closed buildings or confined spaces during oxidation processes. Both methane and carbon monoxide are highly toxic, colorless and odorless gases, and each has its own toxic level to be detected. When both are present, however, the toxicity of either one may go unidentified because of its low level, which may lead to an explosion. PCA is used to correlate the CO and CH₄ data, and by identifying the areas of high correlation (along the principal component axis), explosion suppression can be triggered earlier, avoiding the adverse effects of massive explosions. A wireless sensor network is deployed, and simulations are carried out with heterogeneous sensors (carbon monoxide and methane sensors) in the NS-2 Mannasim framework. A rise in the CO value even when CH₄ is below the toxic level may become hazardous to people nearby. The proposed methodology therefore detects the combined presence of both gases (CH₄ and CO) and provides an early warning in order to avoid human losses or toxic effects.
Machine learning algorithms (MLs) can potentially improve disease diagnostics, leading to early detection and treatment. As a malignant tumor whose primary focus is located in the bronchial mucosal epithelium, lung cancer has the highest mortality and morbidity among cancer types, threatening the health and life of patients. Machine learning algorithms such as Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Naïve Bayes (NB) have been used for lung cancer prediction. However, they still face challenges such as high dimensionality of the feature space, over-fitting, high computational complexity, noise and missing data, low accuracy, low precision and high error rates. Ensemble learning, which combines classifiers, may help to boost prediction on new data. However, current ensemble ML techniques rarely use comprehensive evaluation metrics to assess the performance of individual classifiers. The main purpose of this study was to develop an ensemble classifier that improves lung cancer prediction. An ensemble machine learning algorithm is developed based on RF, SVM, NB, and KNN. Feature selection is based on Principal Component Analysis (PCA) and Analysis of Variance (ANOVA). The algorithm is then executed on lung cancer data and evaluated using execution time, true positives (TP), true negatives (TN), false positives (FP), false negatives (FN), false positive rate (FPR), recall (R), precision (P) and F-measure (FM). Experimental results show that the proposed ensemble classifier has the best classification of 0.9825% with the lowest error rate of 0.0193. It is followed by SVM, for which the probability of having the best classification is 0.9652% at an error rate of 0.0206. NB had the worst performance, with 0.8475% classification at an error rate of 0.0738.
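The evaluation metrics listed in this abstract all follow from the four confusion-matrix counts. A small sketch with the standard definitions is given below. The counts are invented, and the function assumes that none of the denominators is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard metrics derived from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "fpr": fp / (fp + tn),
        "f_measure": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for 100 test cases (not results from the study).
metrics = classification_metrics(tp=40, tn=50, fp=10, fn=0)
print(metrics)
```

Reporting precision, recall and the false positive rate side by side shows trade-offs that a single accuracy figure hides, which is the kind of comprehensive evaluation the abstract argues for.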
Principal component analysis (PCA) was employed to examine the nutritional and bioactive compounds of legume milk chocolate, as well as its sensory attributes, in order to document the extent of the variations and their significance with respect to plant source. PCA identified eight significant principal components and reduced the variables to one principal component that explains 73.5% of the total variability in the physicochemical analysis and 78.6% of the total variability in the sensory evaluation. The score plot indicates that Double Bean milk chocolate incorporated with MOL and CML shows high positive correlations in the nutritional profile. In the nutritional evaluation, carbohydrate and fat contents show negative or minimal correlations, whereas no negative correlations were found in the sensory evaluation, which implies that every sensory variable was highly correlated with the others.
Due to the scarcity of Ziziphi spinosae semen (ZSS) resources, many inferior goods and even adulterants are found in medicine markets. To strengthen quality control, the HPLC fingerprint common pattern established in this paper shows three main bioactive compounds simultaneously in one chromatogram. Principal component analysis based on DAD signals could discriminate adulterants and inferior samples. It indicated that all samples could be regrouped mainly into two clusters according to the first principal component (PC1, redefined as Vicenin II) and the second principal component (PC2, redefined as zizyphusine). PC1 and PC2 could explain 91.42% of the variance. The content of zizyphusine fluctuated more than that of spinosin, and this was confirmed by the HPTLC result. Samples with a low content of jujubosides and two common adulterants could not be used equivalently with authenticated ones in the clinic, while one reference standard extract could substitute for the crude drug in pharmaceutical production. With special consideration given to the well-known bioactive saponins, which have a low response in end absorption, a fast and cheap HPTLC method for quality control of ZSS was developed, and its results agreed well with those of the HPLC analysis. Samples with fingerprints similar to the HPTLC common pattern targeting saponins could be regarded as authenticated. This work provides a faster and cheaper way to control the quality of ZSS and lays a foundation for establishing a more effective quality control method for ZSS.
The texture quality of cooked rice comprises many indexes with complex relationships, so cooked rice is difficult to analyze and evaluate. In this paper, the related indexes of texture properties were converted into independent principal component indexes by principal component analysis. The results showed that the rice kernel type influenced the meaning of the principal component indexes. For both long and short rice, the first principal component was a comprehensive index. The second principal component, however, was springiness for short rice and adhesiveness for long rice. Therefore, the first principal component can be used to express the quality of cooked rice with a few indexes, and the rice type can be recognized from the second principal component.
In industrial processes, principal component analysis (PCA) is a general method for data reconciliation. However, PCA is sometimes unsuitable for nonlinear feature analysis and is limited in its application to nonlinear industrial processes. Kernel PCA (KPCA) is an extension of PCA that can be used for nonlinear feature analysis. A nonlinear data reconciliation method based on KPCA is proposed. The basic idea of this method is that the original data are first mapped to a high-dimensional feature space by a nonlinear function, and PCA is carried out in the feature space. Nonlinear feature analysis is then performed, and the data are reconstructed by using the kernel. The data reconciliation method based on KPCA is applied to a ternary distillation column. Simulation results show that this method can filter the noise in measurements of a nonlinear process and that the reconciled data can represent the true information of the nonlinear process.
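The mapping to a feature space is never computed explicitly in KPCA. Instead, a kernel matrix of pairwise similarities is built and centered, and its eigenvectors play the role of principal components. The sketch below covers only those two preparatory steps. The Gaussian (RBF) kernel and the value of `gamma` are assumptions, since the abstract does not name the kernel, and the sample points are invented.

```python
import math


def rbf_kernel_matrix(points, gamma):
    """K[i][j] = exp(-gamma * squared distance between points i and j)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
        for p in points
    ]


def center_kernel(k):
    """Center the kernel matrix, i.e. center the data in feature space."""
    n = len(k)
    row_means = [sum(row) / n for row in k]   # equal to column means (k is symmetric)
    total_mean = sum(row_means) / n
    return [
        [k[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


# Hypothetical two-dimensional measurements.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
kernel = rbf_kernel_matrix(points, gamma=0.5)
centered = center_kernel(kernel)
print(centered)
```

An eigendecomposition of the centered matrix would then yield the nonlinear components; that step is omitted here to keep the example dependency-free.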
Laser-induced breakdown spectroscopy (LIBS) is a versatile tool for both qualitative and quantitative analysis. In this paper, LIBS combined with principal component analysis (PCA) and a support vector machine (SVM) is applied to rock analysis. Fourteen emission lines, including Fe, Mg, Ca, Al, Si, and Ti, are selected as analysis lines. A good accuracy (91.38% for the real rock) is achieved by using the SVM to analyze the spectroscopic peak area data processed by PCA. Combining PCA and SVM not only reduces the noise and dimensionality, which improves the efficiency of the program, but also solves the problem of linear inseparability. This method validates the ability of LIBS to classify rock.
An updated approach to refining the core indicators of pulverized coal used for blast furnace injection, based on principal component analysis, is proposed in view of the disadvantages of the existing performance indicator system for pulverized coal used in blast furnaces. The method takes into account all the performance indicators of pulverized coal injection, including calorific value, ignition point, combustibility, reactivity, flowability, grindability, etc. Four core indicators of pulverized coal injection are selected and studied by principal component analysis, namely comprehensive combustibility, comprehensive reactivity, comprehensive flowability, and comprehensive grindability. The newly established core index system not only narrows down the current evaluation indices but also avoids the previous overlap among indicators through a mutually independent index design. Furthermore, a comprehensive property indicator is introduced on the basis of the four core indicators, so that the injection properties of pulverized coal can be evaluated as a whole.
To overcome the overly fine granularity of multivariate grey incidence analysis and to explore a comprehensive incidence analysis model, three multivariate grey incidence degree models based on principal component analysis (PCA) are proposed. First, the PCA method is introduced to extract the feature sequences of a behavioral matrix. Then, the grey incidence analysis between two behavioral matrices is transformed into the similarity and nearness measure between their feature sequences. Based on classic grey incidence analysis theory, absolute and relative incidence degree models for feature sequences are constructed, and a comprehensive grey incidence model is proposed. Furthermore, the properties of the models are studied. It is proved that the proposed models satisfy translation invariance, multiple transformation invariance, and the axioms of grey incidence analysis, respectively. Finally, a case is studied. The results illustrate that the model is more effective than other multivariate grey incidence analysis models.
Continued innovation in screening methodologies remains important for the discovery of high-quality multiactive fungi, which have been of great significance to the development of new drugs. Mangrove-derived fungi, which are well recognized as prolific sources of natural products, are worth sustained attention and further study. In this study, 118 fungi, which mainly included Aspergillus spp. (34.62%) and Penicillium spp. (15.38%), were isolated from the mangrove ecosystem of the Maowei Sea, and 83.1% of the cultured fungi showed at least one bioactivity in four antibacterial and three antioxidant assays. To accurately evaluate the fungal bioactivities, the fungi with multiple bioactivities were evaluated and screened by principal component analysis (PCA), and this analysis provided a dataset for comparing and selecting multibioactive fungi. Among the 118 mangrove-derived fungi tested in this study, Aspergillus spp. showed the best comprehensive activity. Fungi such as A. clavatonanicus, A. flavipes and A. citrinoterreus, which exhibited high comprehensive bioactivity as determined by the PCA, have great potential in the exploitation of natural products and the development of new drugs. This study demonstrated the first use of PCA as a time-saving, scientific method with a strong ability to evaluate and screen multiactive fungi, which indicated that this method can affect the discovery and development of new drugs.
The accuracy of power load forecasting is related to the development of the power system. Many factors influence the power load, but their effects differ, and which factors play a leading role cannot be determined empirically. Based on principal component analysis, the paper forecast power load demand with a multivariate linear regression model. Taking a rural power grid load as an example, the paper analyzed the impacts of different factors on the power load, selected the forecast methods appropriate for this area, forecast its 2014-2018 electricity load, and provided a reliable basis for grid planning.
Funding (ore production forecasting study): This work was supported by the Pilot Seed Grant (Grant No. RES0049944) and the Collaborative Research Project (Grant No. RES0043251) from the University of Alberta.
Funding (MSWPCA structural health monitoring study): Supported by the National Key Research and Development Program of China (No. 2018YFA0702800), the National Natural Science Foundation of China (No. 12072056), and the National Defense Fundamental Scientific Research Project (XXXX2018204BXXX).
Funding: Supported by the National Natural Science Foundation of China (No. 51974023) and the State Key Laboratory of Advanced Metallurgy, University of Science and Technology Beijing (No. 41621005).
Abstract: The composition control of molten steel is one of the main functions of the ladle furnace (LF) refining process. In this study, a feasible model was established to predict the alloying element yield using principal component analysis (PCA) and a deep neural network (DNN). PCA was used to eliminate collinearity and reduce the dimension of the input variables, and the PCA-processed data were then used to establish the DNN model. In the PCA-DNN model, the prediction hit ratios for the Si element yield in the error ranges of ±1%, ±3%, and ±5% are 54.0%, 93.8%, and 98.8%, respectively, whereas those for the Mn element yield in the error ranges of ±1%, ±2%, and ±3% are 77.0%, 96.3%, and 99.5%, respectively. The results demonstrate that the PCA-DNN model performs better than the known models, such as the reference heat method, multiple linear regression, modified backpropagation, and the DNN model. Meanwhile, the accurate prediction of the alloying element yield can greatly contribute to realizing a “narrow window” control of composition in molten steel. The construction of the prediction model for the element yield can also provide a reference for the development of an alloying control model in LF intelligent refining in the modern iron and steel industry.
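The preprocessing step described here, using PCA to remove collinearity before fitting a downstream model, can be sketched as follows. This is an illustration only: the three input columns are synthetic stand-ins, the 95% variance threshold is an assumption, and the DNN stage is omitted.

```python
import numpy as np

def pca_fit(x, var_keep=0.95):
    """Standardize x and keep the leading components covering var_keep of the variance."""
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=1)
    z = (x - mean) / std
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)
    k = int(np.searchsorted(np.cumsum(ratio), var_keep)) + 1
    k = min(k, len(s))
    return mean, std, vt[:k].T, ratio

# Synthetic inputs: the third column is almost a copy of the first (collinear).
rng = np.random.default_rng(2)
a = rng.standard_normal(200)
b = rng.standard_normal(200)
x = np.column_stack([a, b, a + 0.01 * rng.standard_normal(200)])

mean, std, components, ratio = pca_fit(x)
scores = ((x - mean) / std) @ components  # decorrelated inputs for a downstream model
score_cov = np.cov(scores, rowvar=False)
print("components kept:", components.shape[1])
print("explained variance ratio:", np.round(ratio, 4))
```

The off-diagonal entries of `score_cov` are zero up to rounding, so the retained scores carry no linear collinearity into whatever model is trained on them.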
Funding: Supported by the National Natural Science Foundation of China (61903326, 61933015).
Abstract: The large blast furnace is essential equipment in iron and steel manufacturing. Because of the complex operating process and frequent fluctuations of variables, conventional monitoring methods often produce false alarms. To address this problem, an ensemble of greedy dynamic principal component analysis-Gaussian mixture model (EGDPCA-GMM) is proposed in this paper. First, PCA-GMM is introduced to deal with the collinearity and the non-Gaussian distribution of blast furnace data. Second, to capture the dynamics of the data, a greedy algorithm is used to determine the extended variables and their corresponding time lags, so as to avoid introducing unnecessary noise. A bagging ensemble is then combined with the greedy extension to eliminate the randomness introduced by the greedy algorithm and further reduce the false alarm rate (FAR) of the monitoring results. Finally, the algorithm is applied to the blast furnace of a large iron and steel group in South China to verify its performance. Compared with the basic algorithms, the proposed method achieves the lowest FAR while keeping the missed alarm rate (MAR) stable.
Funding: This research was funded by the National Natural Science Foundation of China (Grant No. 41504103).
Abstract: Total organic carbon (TOC) content is one of the most important parameters for characterizing the quality of source rocks and assessing the hydrocarbon-generating potential of shales. The Lucaogou Formation shale reservoirs in the Jimusaer Sag, Junggar Basin, NW China, are characterized by extremely complex lithology and a wide variety of mineral compositions, with source rocks mainly consisting of carbonaceous mudstone and dolomitic mudstone. The logging responses of organic matter in the shale reservoirs are quite different from those in conventional reservoirs. Analyses show that the traditional ΔlogR method is not suitable for evaluating the TOC content in the study area. Analysis of the sensitivity of well logs to TOC content reveals that the TOC content correlates well with the degree of separation of the porosity logs. After dimension reduction by principal component analysis, the principal components are determined through correlation analysis of the porosity logs. The results show that the TOC values obtained by the new method are in good agreement with those measured by core analysis. The average absolute error of the new method is only 0.555, much lower than the 1.222 obtained with the traditional ΔlogR method. The proposed method can be used to produce more accurate TOC estimates, thus providing a reliable basis for source rock mapping.
Funding: Supported by the State Key Laboratory of Marine Environmental Science Visiting Fellowship (No. MELRS2233), the State Key Laboratory of Marine Geology, Tongji University (No. MGK202302), the Innovation Group Project of Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (No. 311021003), the Zhujiang Talent Project Foundation of Guangdong Province (No. 2017ZT07Z066), the Fundamental Research Funds for the Central Universities, Sun Yat-sen University (Nos. 22qntd2101, 2021qntd23), the Major Projects of the National Natural Science Foundation of China (Nos. 41790465, 41590863), and the National Natural Science Foundation of China (Nos. 42102333, 41806077, 41904045).
Abstract: Principal component analysis (PCA) was employed to determine the implications of geochemical and isotopic data from Cenozoic volcanic activities in the Southeast Asian region, including China (South China Sea (SCS), Hainan Island, Fujian-Zhejiang coast, Taiwan Island) and parts of Vietnam and Thailand. We analyzed 15 trace element indicators and 5 isotopic indicators for 623 volcanic rock samples collected from the study region. Two principal components (PCs) were extracted by PCA based on the trace elements and Sr-Nd-Pb isotopic ratios, which probably indicate an enriched oceanic island basalt-type mantle plume and a depleted mid-ocean ridge basalt-type spreading ridge. The results show that the influence of the Hainan mantle plume on younger volcanic activities (<13 Ma) is stronger than that on older ones (>13 Ma) at the same location in the Southeast Asian region. PCA was employed to verify the mantle-plume-ridge interaction model of volcanic activities beneath the expansion center of the SCS and to refute the hypothesis that the tension of the SCS is triggered by the Hainan plume. This study reveals the efficiency and applicability of PCA in discussing mantle sources of volcanic activities; thus, PCA is a suitable research method for analyzing geochemical data.
Funding: Funded by the National Natural Science Foundation of China (42174131) and the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-03).
Abstract: In this research, an integrated classification method based on principal component analysis-simulated annealing genetic algorithm-fuzzy c-means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data were used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of the different reservoir categories were analyzed and compared with the lithological profiles. The analysis results of numerical simulation and actual logging data show that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of the integrated reservoir classification match the lithologic profile well, which demonstrates the reliability of the classification method.
Abstract: This work uses the statistical approach of Principal Component Analysis (PCA) to detect combined methane (CH₄) and carbon monoxide (CO) poisoning in coal mines, forest fires, drainage systems, and similar settings, where CH₄ and CO emissions are very high in closed buildings or confined spaces during oxidation processes. Both methane and carbon monoxide are highly toxic, colorless, and odorless gases, and each has its own toxic level to be detected. When the two are present together, however, the toxicity of either one may go unidentified, possibly because of its low level, which may lead to an explosion. PCA is used to correlate the CO and CH₄ data, and by identifying the areas of high correlation (along the principal component axis) the explosion suppression action can be triggered earlier, thus avoiding the adverse effects of massive explosions. A wireless sensor network is deployed, and simulations are carried out with heterogeneous sensors (carbon monoxide and methane sensors) in the NS-2 Mannasim framework. A rise in the value of CO, even when CH₄ is below the toxic level, may become hazardous to the people nearby. The proposed methodology therefore detects the combined presence of both gases (CH₄ and CO) and provides an early warning to avoid human losses or toxic effects.
Abstract: Machine learning algorithms (MLs) can potentially improve disease diagnostics, leading to early detection and treatment. As a malignant tumor whose primary focus is located in the bronchial mucosal epithelium, lung cancer has the highest mortality and morbidity among cancer types, threatening the health and life of patients. Machine learning algorithms such as Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Naïve Bayes (NB) have been used for lung cancer prediction. However, they still face challenges such as high dimensionality of the feature space, over-fitting, high computational complexity, noise and missing data, low accuracy, low precision, and high error rates. Ensemble learning, which combines classifiers, may help to boost prediction on new data. However, current ensemble ML techniques rarely use comprehensive evaluation metrics to assess the performance of individual classifiers. The main purpose of this study was to develop an ensemble classifier that improves lung cancer prediction. An ensemble machine learning algorithm is developed based on RF, SVM, NB, and KNN. Feature selection is based on Principal Component Analysis (PCA) and Analysis of Variance (ANOVA). The algorithm is then executed on lung cancer data and evaluated using execution time, true positives (TP), true negatives (TN), false positives (FP), false negatives (FN), false positive rate (FPR), recall (R), precision (P), and F-measure (FM). Experimental results show that the proposed ensemble classifier has the best classification of 0.9825% with the lowest error rate of 0.0193. This is followed by SVM, for which the probability of having the best classification is 0.9652% at an error rate of 0.0206. NB had the worst performance, with 0.8475% classification at an error rate of 0.0738.
Abstract: Principal component analysis (PCA) was employed to examine the nutritional and bioactive compounds and the sensory attributes of legume milk chocolate, in order to document the extent of their variation and its significance across plant sources. PCA identified eight significant principal components and reduced the variables to one principal component that explained 73.5% of the total variability in the physicochemical analysis and 78.6% of the total variability in the sensory evaluation. The score plot indicates that Double Bean milk chocolates incorporated with MOL and CML have high positive correlations in their nutritional profiles. In the nutritional evaluation, carbohydrate and fat content showed negative or minimal correlations, whereas no negative correlations were found in the sensory evaluation, which implies that every sensory variable was highly correlated with the others.
Abstract: Because resources of Ziziphi spinosae semen (ZSS) are scarce, many inferior goods and even adulterants are found in medicine markets. To strengthen quality control, the HPLC fingerprint common pattern established in this paper showed three main bioactive compounds simultaneously in one chromatogram. Principal component analysis based on DAD signals could discriminate adulterants and inferior samples. Principal component analysis indicated that all samples could be regrouped into two main clusters according to the first principal component (PC1, redefined as Vicenin II) and the second principal component (PC2, redefined as zizyphusine). PC1 and PC2 together explained 91.42% of the variance. The content of zizyphusine fluctuated more than that of spinosin, and this result was also confirmed by the HPTLC result. Samples with a low content of jujubosides and two common adulterants could not be used equivalently to authenticated ones in clinical practice, while one reference standard extract could substitute for the crude drug in pharmaceutical production. Giving special consideration to the well-known bioactive saponins, which have a low response under end absorption, a fast and cheap HPTLC method for quality control of ZSS was developed, and its results agreed well with those of the HPLC analysis. Samples whose fingerprints are similar to the HPTLC common pattern targeting saponins could be regarded as authenticated ones. This work provides a faster and cheaper way to control the quality of ZSS and lays a foundation for establishing a more effective quality control method for ZSS.
Funding: Education Department of Heilongjiang Province in China for the Oversea Researcher Projects (1151HZ01, 10531002).
Abstract: The texture quality of cooked rice comprises many indexes with complex relationships, which makes cooked rice difficult to analyze and evaluate. In this paper, the correlated texture indexes were converted into independent principal component indexes using principal component analysis. The results showed that the rice kernel type influenced the meaning of the principal component indexes. For both long and short rice, the first principal component was a comprehensive index. The second principal component, however, was springiness for short rice and adhesiveness for long rice. Therefore, the first principal component can be used to express the quality of cooked rice with a few indexes, and the rice type can be recognized from the second principal component.
Funding: This project is supported by the Special Foundation for Major State Basic Research of China (Project 973, No. G1998030415).
Abstract: In industrial processes, principal component analysis (PCA) is a general method for data reconciliation. However, PCA is sometimes unsuitable for nonlinear feature analysis, which limits its application to nonlinear industrial processes. Kernel PCA (KPCA) is an extension of PCA that can be used for nonlinear feature analysis. A nonlinear data reconciliation method based on KPCA is proposed. The basic idea of this method is that the original data are first mapped to a high-dimensional feature space by a nonlinear function, and PCA is implemented in the feature space. Nonlinear feature analysis is then carried out and the data are reconstructed using the kernel. The KPCA-based data reconciliation method is applied to a ternary distillation column. Simulation results show that this method can filter the noise in measurements of a nonlinear process and that the reconciled data represent the true information of the nonlinear process.
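The feature-extraction half of KPCA (implicit mapping through a kernel, then PCA in the feature space) can be sketched as below. The RBF kernel, its `gamma` value, and the noisy-circle data are assumptions made for illustration. The reconstruction (pre-image) step that the reconciliation method relies on is not shown, because the abstract does not describe how it is done.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and b."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_pca(x, n_components=2, gamma=1.0):
    n = x.shape[0]
    k = rbf_kernel(x, x, gamma)
    # Center the data in feature space without ever forming the mapping.
    one = np.full((n, n), 1.0 / n)
    kc = k - one @ k - k @ one + one @ k @ one
    vals, vecs = np.linalg.eigh(kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Scale eigenvectors so the feature-space principal axes have unit length.
    alphas = vecs / np.sqrt(vals)
    scores = kc @ alphas  # projections of the training points
    return scores, vals

# Synthetic nonlinear data: noisy points on a circle (made-up example).
rng = np.random.default_rng(3)
angle = rng.uniform(0.0, 2.0 * np.pi, size=60)
x = np.column_stack([np.cos(angle), np.sin(angle)]) + 0.05 * rng.standard_normal((60, 2))

scores, eigenvalues = kernel_pca(x, n_components=2, gamma=1.0)
print("leading kernel eigenvalues:", np.round(eigenvalues, 3))
```

Centering is applied to the kernel matrix rather than to the data, because the mapped features are never computed explicitly.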
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 11075184), the Knowledge Innovation Program of the Chinese Academy of Sciences (CAS) (Grant No. Y03RC21124), and the CAS President’s International Fellowship Initiative Foundation (Grant No. 2015VMA007).
Abstract: Laser-induced breakdown spectroscopy (LIBS) is a versatile tool for both qualitative and quantitative analysis. In this paper, LIBS combined with principal component analysis (PCA) and a support vector machine (SVM) is applied to rock analysis. Fourteen emission lines, including Fe, Mg, Ca, Al, Si, and Ti, are selected as analysis lines. Good accuracy (91.38% for the real rock) is achieved by using SVM to analyze the spectroscopic peak area data processed by PCA. Combining PCA and SVM not only reduces noise and dimensionality, which improves the efficiency of the program, but also solves the problem of linear inseparability. This method validates the ability of LIBS to classify rock.
Funding: Financially supported by the Young Talent Cultivation Fund in Universities (No. FRF-TP-12-020A) and the National Natural Science Foundation of China (Nos. 51204013 and 51174023).
Abstract: An updated approach to refining the core indicators of pulverized coal used for blast furnace injection, based on principal component analysis, is proposed in view of the disadvantages of the existing performance indicator system for pulverized coal used in blast furnaces. The presented method takes into account all the performance indicators of pulverized coal injection, including calorific value, igniting point, combustibility, reactivity, flowability, and grindability. Four core indicators of pulverized coal injection are selected and studied using principal component analysis, namely, comprehensive combustibility, comprehensive reactivity, comprehensive flowability, and comprehensive grindability. The newly established core index system not only narrows down the current evaluation indices but also avoids the previous overlap among indicators through a mutually independent index design. Furthermore, a comprehensive property indicator is introduced on the basis of the four core indicators, so that the injection properties of pulverized coal can be evaluated as a whole.
Funding: Supported by the National Natural Science Foundation of China (71401052), the Key Project of National Social Science Fund of China (12AZD108), the Doctoral Fund of Ministry of Education (20120094120024), the Philosophy and Social Science Fund of Jiangsu Province Universities (2013SJD630073), and the Central University Basic Service Project Fee of Hohai University (2011B09914).
Abstract: To overcome the overly fine granularity of multivariate grey incidence analysis and to explore a comprehensive incidence analysis model, three multivariate grey incidence degree models based on principal component analysis (PCA) are proposed. First, the PCA method is introduced to extract the feature sequences of a behavioral matrix. Then, the grey incidence analysis between two behavioral matrices is transformed into a similarity and nearness measure between their feature sequences. Based on the classic grey incidence analysis theory, absolute and relative incidence degree models for feature sequences are constructed, and a comprehensive grey incidence model is proposed. Furthermore, the properties of the models are studied. It is proved that the proposed models satisfy translation invariance, multiple transformation invariance, and the axioms of grey incidence analysis. Finally, a case is studied. The results illustrate that the model is more effective than other multivariate grey incidence analysis models.
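For reference, the classic absolute degree of grey incidence between two equal-length sequences, on which such models build, can be sketched as follows. This is the textbook single-sequence form, not the PCA-based multivariate models proposed in the paper. The function names and the sample sequences are assumptions.

```python
import numpy as np

def _signed_area(seq):
    """Area term s of the zero-starting-point image of a sequence."""
    x = np.asarray(seq, dtype=float)
    x0 = x - x[0]  # shift so the sequence starts at zero
    return x0[1:-1].sum() + 0.5 * x0[-1]

def absolute_incidence(x, y):
    """Absolute degree of grey incidence: (1+|sx|+|sy|) / (1+|sx|+|sy|+|sx-sy|)."""
    sx, sy = _signed_area(x), _signed_area(y)
    base = 1.0 + abs(sx) + abs(sy)
    return base / (base + abs(sx - sy))

# Made-up sequences for illustration.
x = np.array([10.0, 12.0, 15.0, 19.0, 24.0])
y = np.array([3.0, 4.0, 6.0, 7.0, 11.0])

degree = absolute_incidence(x, y)
shifted = absolute_incidence(x + 5.0, y - 3.0)
print("absolute incidence degree:", round(degree, 4))
```

Because each sequence is shifted to start at zero, adding a constant to either sequence leaves the degree unchanged, which is the translation invariance property mentioned in the abstract.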
Funding: Supported by the Key R&D Program of Shandong Province (No. 2020CXGC010703) and the Key Project of the Natural Science Foundation of Shandong Province (No. ZR2020 KB021).
Abstract: Continued innovation in screening methodologies remains important for the discovery of high-quality multiactive fungi, which have been of great significance to the development of new drugs. Mangrove-derived fungi, which are well recognized as prolific sources of natural products, are worth sustained attention and further study. In this study, 118 fungi, which mainly included Aspergillus spp. (34.62%) and Penicillium spp. (15.38%), were isolated from the mangrove ecosystem of the Maowei Sea, and 83.1% of the cultured fungi showed at least one bioactivity in four antibacterial and three antioxidant assays. To accurately evaluate the fungal bioactivities, the fungi with multiple bioactivities were evaluated and screened by principal component analysis (PCA), and this analysis provided a dataset for comparing and selecting multibioactive fungi. Among the 118 mangrove-derived fungi tested in this study, Aspergillus spp. showed the best comprehensive activity. Fungi such as A. clavatonanicus, A. flavipes, and A. citrinoterreus, which exhibited high comprehensive bioactivity as determined by the PCA, have great potential in the exploitation of natural products and the development of new drugs. This study demonstrated the first use of PCA as a time-saving, scientific method with a strong ability to evaluate and screen multiactive fungi, which indicates that this method can affect the discovery and development of new drugs.
Funding: Supported by the Science and Technology Research Project Fund of Provincial Department of Education (12531004) and the Project of Heilongjiang Leading Talent Echelon Talented (2012).
Abstract: The accuracy of power load forecasting is related to the development of the power system. Many factors influence the power load, but their effects differ, and which factors play a leading role cannot be determined empirically. Based on principal component analysis, this paper forecasts power load demand with a multivariate linear regression model. Taking a rural power grid load as an example, the paper analyzes the impacts of different factors on power load, selects the forecasting methods appropriate for the area, forecasts its 2014-2018 electricity load, and provides a reliable basis for grid planning.
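Combining principal components with multivariate linear regression is commonly done as principal component regression, sketched below. This is an illustration of that general technique rather than the paper's exact procedure. The three driver columns and the load values are synthetic, and the function names are assumptions.

```python
import numpy as np

def pcr_fit(x, y, n_components):
    """Regress y on the leading principal component scores of x."""
    x_mean, y_mean = x.mean(axis=0), y.mean()
    _, _, vt = np.linalg.svd(x - x_mean, full_matrices=False)
    loadings = vt[:n_components].T
    scores = (x - x_mean) @ loadings
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = loadings @ gamma  # coefficients expressed on the original variables
    return x_mean, y_mean, beta

def pcr_predict(model, x):
    x_mean, y_mean, beta = model
    return (x - x_mean) @ beta + y_mean

# Synthetic drivers (hypothetical stand-ins for factors that influence load).
rng = np.random.default_rng(4)
f1 = rng.standard_normal(40)
f2 = rng.standard_normal(40)
f3 = 0.8 * f1 + 0.1 * rng.standard_normal(40)  # correlated with f1
x = np.column_stack([f1, f2, f3])
y = 50.0 + 3.0 * f1 - 2.0 * f2 + 0.5 * rng.standard_normal(40)

pred_two = pcr_predict(pcr_fit(x, y, n_components=2), x)  # reduced model
pred_all = pcr_predict(pcr_fit(x, y, n_components=3), x)  # keeps every component

# Ordinary least squares with an intercept, for comparison.
design = np.column_stack([np.ones(len(y)), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
pred_ols = design @ coef
print("RMSE with 2 components:", round(float(np.sqrt(np.mean((pred_two - y) ** 2))), 3))
```

When every component is kept, the fit coincides with ordinary least squares, so dropping the trailing components is what trades a little bias for stability under correlated drivers.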