The diameter distribution function (DDF) is a crucial tool for accurately predicting stand carbon storage (CS). The current key issue, however, is how to construct a high-precision DDF based on stand factors, site quality, and aridity index to predict stand CS in multi-species mixed forests with complex structures. This study used data from 70 survey plots for mixed broadleaf Populus davidiana and Betula platyphylla forests in the Mulan Rangeland State Forest, Hebei Province, China, to construct the DDF based on maximum likelihood estimation (MLE) and the finite mixture model (FMM). Ordinary least squares (OLS), linear seemingly unrelated regression (LSUR), and back propagation neural network (BPNN) were used to investigate the influences of stand factors, site quality, and aridity index on the shape and scale parameters of the DDF and predicted stand CS of mixed broadleaf forests. The results showed that the FMM accurately described the stand-level diameter distribution of the mixed P. davidiana and B. platyphylla forests, whereas the Weibull function constructed by MLE was more accurate in describing species-level diameter distribution. The combined variable of quadratic mean diameter (Dq), stand basal area (BA), and site quality improved the accuracy of the shape parameter models of the FMM; the combined variable of Dq, BA, and De Martonne aridity index improved the accuracy of the scale parameter models. Compared to OLS and LSUR, the BPNN had higher accuracy in the re-parameterization process of the FMM. OLS, LSUR, and BPNN overestimated the CS of P. davidiana but underestimated the CS of B. platyphylla in the large diameter classes (DBH ≥ 18 cm). BPNN accurately estimated stand- and species-level CS, but it was more suitable for estimating stand-level CS than species-level CS, thereby providing a scientific basis for the optimization of stand structure and assessment of carbon sequestration capacity in mixed broadleaf forests.
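As a worked form of the finite mixture idea in the abstract above, a K-component mixture density for tree diameter d can be written as follows. This is only a sketch: the abstract does not state the component family or the number of components used in the FMM, so the two-parameter Weibull components and the symbols K, π_k (mixing weight), b_k (scale), and c_k (shape) are assumptions made for illustration.

```latex
f(d) = \sum_{k=1}^{K} \pi_k \, \frac{c_k}{b_k}
       \left(\frac{d}{b_k}\right)^{c_k - 1}
       \exp\!\left[-\left(\frac{d}{b_k}\right)^{c_k}\right],
\qquad \pi_k \ge 0,\quad \sum_{k=1}^{K} \pi_k = 1,\quad d > 0
```

With K = 1 this reduces to the single Weibull density that the abstract reports as more accurate at the species level.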
Classical survival analysis assumes all subjects will experience the event of interest, but in some cases, a portion of the population may never encounter the event. These survival methods further assume independent survival times, which is not valid for honey bees, which live in nests. The study introduces a semi-parametric marginal proportional hazards mixture cure (PHMC) model with exchangeable correlation structure, using generalized estimating equations for survival data analysis. The model was tested on clustered right-censored bee survival data with a cured fraction, where two bee species were subjected to different entomopathogens to test the effect of the entomopathogens on the survival of the bee species. The Expectation-Solution algorithm is used to estimate the parameters. The study notes a weak positive association between cure statuses (ρ1=0.0007) and survival times for uncured bees (ρ2=0.0890), emphasizing their importance. The odds of being uncured for A. mellifera are higher than the odds for M. ferruginea. The bee species A. mellifera is more susceptible to entomopathogens icipe 7, icipe 20, and icipe 69. The Cox-Snell residuals show that the proposed semi-parametric PH model generally fits the data well compared to a model that assumes an independence correlation structure. Thus, the semi-parametric marginal proportional hazards mixture cure model is a parsimonious model for correlated bee survival data.
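The structure of a proportional hazards mixture cure model can be summarized by its population survival function. The form below is the standard textbook one, given as a sketch rather than the authors' exact specification; the symbols π (probability of being uncured), γ, β, S_0, and the covariate vectors x and z are notation assumed here.

```latex
S_{\mathrm{pop}}(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x),
\qquad
\pi(z) = \frac{\exp(\gamma^{\top} z)}{1 + \exp(\gamma^{\top} z)},
\qquad
S_u(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)}
```

The logistic part models the odds of being uncured, and the proportional hazards part models survival times among the uncured; as t grows, the population survival levels off at the cured fraction 1 − π(z).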
The large blast furnace is essential equipment in the process of iron and steel manufacturing. Due to the complex operation process and frequent fluctuations of variables, conventional monitoring methods often produce false alarms. To address this problem, an ensemble of greedy dynamic principal component analysis-Gaussian mixture model (EGDPCA-GMM) is proposed in this paper. First, PCA-GMM is introduced to deal with the collinearity and the non-Gaussian distribution of blast furnace data. Second, in order to explain the dynamics of the data, the greedy algorithm is used to determine the extended variables and their corresponding time lags, so as to avoid introducing unnecessary noise. Then the bagging ensemble is adopted to cooperate with the greedy extension to eliminate the randomness brought by the greedy algorithm and further reduce the false alarm rate (FAR) of monitoring results. Finally, the algorithm is applied to the blast furnace of a large iron and steel group in South China to verify performance. Compared with the basic algorithms, the proposed method achieves the lowest FAR while keeping the missed alarm rate (MAR) stable.
The relationship between the water content or saturation of unsaturated soils and their matric suction is commonly described by the soil-water characteristic curve (SWCC). Currently, studies of SWCC models focus on fine-grained soils such as clay and silty soils, whereas the SWCC model for grinding soil-rock mixture (SRM) is less studied. Considering that the SRM is in a certain compaction state in actual projects, this study established a surface model coupling three variables, compaction degree, matric suction, and moisture content, based on the Cavalcante-Zornberg soil-water characteristic curve model. Then, the influence of each fitting parameter on the curve was analyzed. For the common SRM, the soil-water characteristic test was conducted, and the experimental measurements exhibit remarkable consistency with the model surface. The analysis shows that the surface model intuitively describes the soil-water characteristics of grinding SRM and can provide the SWCC of soils with bimodal pore characteristics under specific compaction degrees. Furthermore, it can reflect the influence of compaction degree on the SWCC of rock-soil mass and has a certain predictive capability. The SWCCs of SRM with various soil-rock ratios have a double-step shape. With the increase in compaction degree, the curves as a whole tend toward decreasing mass moisture content. The curve changes are mainly concentrated in the large pore section.
Objective: Body fluid mixtures are complex biological samples that frequently occur at crime scenes and can provide important clues for criminal case analysis. DNA methylation assay has been applied in the identification of human body fluids and has exhibited excellent performance in predicting single-source body fluids. The present study aims to develop a methylation SNaPshot multiplex system for body fluid identification and to accurately predict mixture samples. In addition, the value of DNA methylation in the prediction of body fluid mixtures was further explored. Methods: In the present study, 420 samples of body fluid mixtures and 250 samples of single body fluids were tested using an optimized multiplex methylation system. Each kind of body fluid sample presented a specific methylation profile of the 10 markers. Results: Significant differences in methylation levels were observed between the mixtures and single body fluids. For all kinds of mixtures, Spearman's correlation analysis revealed a significantly strong correlation between the methylation levels and component proportions (1:20, 1:10, 1:5, 1:1, 5:1, 10:1 and 20:1). Two random forest classification models were trained, one for the prediction of mixture types and one for the prediction of the mixture proportion of 2 components, based on the methylation levels of the 10 markers. For the mixture prediction, Model-1 presented outstanding prediction accuracy, which reached up to 99.3% in 427 training samples, and had a remarkable accuracy of 100% in 243 independent test samples. For the mixture proportion prediction, Model-2 demonstrated an excellent accuracy of 98.8% in 252 training samples, and 98.2% in 168 independent test samples. The total prediction accuracy reached 99.3% for body fluid mixtures and 98.6% for the mixture proportions. Conclusion: These results indicate the excellent capability and powerful value of the multiplex methylation system in the identification of forensic body fluid mixtures.
Background: Workplace violence (WV) towards psychiatric staff has commonly been associated with Posttraumatic Stress Disorder (PTSD). However, prospective studies have shown that not all psychiatric staff who experience workplace violence experience post-traumatic stress. Purpose: We aim to examine the longitudinal trajectories of PTSD in this population to identify possible subgroups that might be more at risk. Furthermore, we investigate whether certain risk factors of PTSD might identify membership in the subgroups. Method: In a sample of psychiatric staff from 18 psychiatric wards in Denmark who had reported an incident of WV, we used Latent Growth Mixture Modelling (LGMM) followed by logistic regression analysis. Results: We found three separate PTSD trajectories: a recovering, a delayed-onset, and a moderate-stable trajectory. Higher social support and negative cognitive appraisals about oneself, the world and self-blame predicted membership in the delayed-onset trajectory, while higher social support and lower acceptance coping predicted membership in the delayed-onset trajectory. Conclusion: Although most psychiatric staff go through a natural recovery, it is important to be aware of and identify staff members who might be struggling long-term. More focus on the factors that might predict these groups should be an important task for psychiatric departments to prevent work-related posttraumatic symptomatology.
Cyber losses, measured as the number of records breached in cyber incidents, commonly feature a significant portion of zeros and distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses using a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that can simultaneously model zeros, moderate losses, and large losses, and that considers heterogeneous effects in the mixture components. To apply our proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis, and utilize a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses, capturing the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. Combining this with our frequency model (a generalized linear mixed model) for data breaches, aggregate loss distributions are investigated, and applications to cyber insurance pricing and risk management are discussed.
Highly versatile machines, such as wheel loaders, forklifts, and mining haulers, are subject to many kinds of working conditions, as well as indefinite factors that lead to the complexity of the load. The load probability distribution function (PDF) of transmission gears has many distribution centers; thus, it cannot be well represented by a single-peak function. To represent the distribution characteristics of this complicated phenomenon accurately, this paper proposes a novel method to establish a mixture model. Based on linear regression models and correlation coefficients, the proposed method can automatically select the best-fitting function in the mixture model. The coefficient of determination, the mean square error, and the maximum deviation are used as judging criteria to describe the fitting precision between the theoretical distribution and the corresponding histogram of the available load data. The applicability of this modeling method is illustrated with field testing data from a wheel loader. The load spectra based on the mixture model are also compiled. The comparison results show that the mixture model is more suitable for describing the load-distribution characteristics. The proposed method improves the flexibility and intelligence of modeling, reduces the statistical error, and enhances the fitting accuracy, and the load spectra compiled by this method better reflect the actual load characteristics of the gear component.
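The three judging criteria named in the abstract above (coefficient of determination, mean square error, maximum deviation) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `fit_criteria` and its arguments are hypothetical, and it assumes `observed` holds histogram densities and `fitted` holds the mixture PDF evaluated at the same bin centers.

```python
def fit_criteria(observed, fitted):
    """Return (R^2, mean square error, maximum absolute deviation).

    observed: histogram densities of the load data (assumed input).
    fitted:   model PDF values at the same bin centers (assumed input).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    mse = ss_res / n
    max_dev = max(abs(o - f) for o, f in zip(observed, fitted))
    return r2, mse, max_dev
```

A candidate mixture with a higher R² and lower mean square error and maximum deviation would be preferred under these criteria.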
The cluster-based channel model is the mainstream for fifth generation mobile communications; thus, the accuracy of the clustering algorithm is important. The traditional Gaussian mixture model (GMM) does not consider power information, which is important for channel multipath clustering. In this paper, a normalized power weighted GMM (PGMM) is introduced to model the channel multipath components (MPCs). With MPC power as a weighting factor, the PGMM can fit the MPCs in accordance with the cluster-based channel models. First, the expectation maximization (EM) algorithm is employed to optimize the PGMM parameters. Then, to further increase the searching ability of EM and choose the optimal number of components without resorting to cross-validation, variational Bayesian (VB) inference is employed. Finally, 28 GHz indoor channel measurement data are used to demonstrate the effectiveness of the PGMM clustering algorithm.
The mixture model based image segmentation method, which assumes that image pixels are independent and does not consider the positional relationship between pixels, is not robust to noise and usually leads to misclassification. A new segmentation method, called the multi-resolution Gaussian mixture model method, is proposed. First, an image pyramid is constructed and a son-father link relationship is built between the levels of the pyramid. Then the mixture model segmentation method is applied to the top level. The segmentation result on the top level is passed top-down to the bottom level according to the son-father link relationship between levels. The proposed method considers not only local but also global information of the image; it overcomes the effect of noise and obtains a better segmentation result. Experimental results demonstrate its effectiveness.
The stability of soil-rock mixtures (SRMs), which are widely distributed in slopes, is of significant concern for slope safety evaluation and disaster prevention. The failure behavior of SRM slopes under surface loading conditions was investigated through a series of centrifuge model tests considering various volumetric gravel contents. The displacement field of the slope was determined with an image-based displacement system to observe the deformation of the soil and the movement of the blocks during loading in the tests. The test results showed that the ultimate bearing capacity and the stiffness of SRM slopes increased evidently when the volumetric block content exceeded a threshold value. Moreover, there were more evident slips around the blocks in the SRM slope. The microscopic analysis of the block motion showed that the rotation of the blocks could aggravate the deformation localization and facilitate the development of the slip surface. The high correlation between the rotation of the key blocks and the slope failure indicated that the blocks became the dominant load-bearing medium that influenced the slope failure. The blocks in the sliding body formed a chain to bear the load and changed the displacement distribution of the adjacent matrix sand through block rotation.
The currently prevalent machine performance degradation assessment techniques estimate a machine's current condition by recognizing indications of failure features, which requires complete data collected under different conditions. However, failure data are always hard to acquire, making those techniques hard to apply. In this paper, a novel method that does not need failure history data is introduced. Wavelet packet decomposition (WPD) is used to extract features from raw signals, principal component analysis (PCA) is utilized to reduce feature dimensions, and a Gaussian mixture model (GMM) is then applied to approximate the feature space distributions. The single-channel confidence value (SCV) is calculated from the overlap between the GMM of the monitored condition and that of the normal condition, and indicates the performance of a single channel. Furthermore, the multi-channel confidence value (MCV), which can be deemed the overall multi-channel performance index, is calculated via logistic regression (LR), which also completes the task of decision-level sensor fusion. Both SCV and MCV can serve as the basis on which proactive maintenance measures are taken, thus preventing machine breakdown. The method has been adopted to assess the performance of the turbine of a centrifugal compressor in a factory of Petro-China, and the result shows that it can effectively complete this task. The proposed method has engineering significance for machine performance degradation assessment.
An improved approach for J-value segmentation (JSEG) is presented for unsupervised color image segmentation. Instead of a color quantization algorithm, an automatic classification method based on adaptive mean shift (AMS) clustering is used for nonparametric clustering of the image data set. The clustering results are used to construct a Gaussian mixture model (GMM) of the image data for the calculation of the soft J value. The region growing algorithm used in JSEG is then applied to segment the image based on the multiscale soft J-images. Experiments show that the synergism of JSEG and the soft classification based on AMS clustering and GMM successfully overcomes the limitations of JSEG and is more robust.
Actual engineering systems are inevitably affected by uncertain factors. Thus, Reliability-Based Multidisciplinary Design Optimization (RBMDO) has become a hotspot for recent research and application in complex engineering system design. The Second-Order and First-Order Mean-Value Saddlepoint Approximation (SOMVSA/FOMVSA) methods are two popular reliability analysis strategies that are widely used in RBMDO. However, the SOMVSA method can only be used efficiently when the input variables follow a Gaussian distribution, which significantly limits its application. In this study, the Gaussian Mixture Model-based Second-Order Mean-Value Saddlepoint Approximation (GMM-SOMVSA) is introduced to tackle this problem. It is integrated with the Collaborative Optimization (CO) method to solve RBMDO problems. Furthermore, the formula and procedure of RBMDO using GMM-SOMVSA-based CO (GMM-SOMVSA-CO) are proposed. Finally, an engineering example is given to show the application of the GMM-SOMVSA-CO method.
The EM algorithm is a very popular maximum likelihood estimation method: an iterative algorithm for computing the maximum likelihood estimator when the observed data are incomplete, and also a very effective algorithm for estimating the parameters of finite mixture models. However, the EM algorithm cannot guarantee finding the global optimum and often falls into a local optimum, so it is sensitive to the choice of initial values for the iteration. The traditional EM algorithm selects the initial values at random; we propose an improved method for selecting them. First, we use the k-nearest-neighbor method to delete outliers. Second, we use k-means to initialize the EM algorithm. Comparing this method with the original random initialization, numerical experiments show that the parameter estimates of the EM algorithm with the proposed initialization are significantly better than those of the original EM algorithm.
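The two-step initialization described in the abstract above (k-nearest-neighbor outlier removal, then k-means to seed EM) can be sketched as follows for a one-dimensional Gaussian mixture. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the outlier rule (drop points whose k-th-neighbor distance exceeds `factor` times the median), and the quantile seeding of k-means are all hypothetical choices.

```python
import math
import random


def drop_knn_outliers(xs, k=3, factor=10.0):
    """Drop points whose distance to their k-th nearest neighbour is
    more than `factor` times the median such distance (assumed rule)."""
    kth = []
    for i, x in enumerate(xs):
        d = sorted(abs(x - y) for j, y in enumerate(xs) if j != i)
        kth.append(d[k - 1])
    med = sorted(kth)[len(kth) // 2]
    return [x for x, d in zip(xs, kth) if d <= factor * med]


def kmeans_1d(xs, k, iters=20):
    """Lloyd's k-means on scalars; centers start at evenly spaced quantiles."""
    s = sorted(xs)
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[min(range(k), key=lambda c: abs(x - centers[c]))].append(x)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers, groups


def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(xs, k=2, iters=100):
    """EM for a 1-D Gaussian mixture, seeded from k-means instead of at random."""
    n = len(xs)
    means, groups = kmeans_1d(xs, k)
    weights = [max(len(g), 1) / n for g in groups]
    variances = [max(sum((x - m) ** 2 for x in g) / max(len(g), 1), 1e-6)
                 for g, m in zip(groups, means)]
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                1e-6)
    return weights, means, variances
```

Calling `em_gmm_1d(drop_knn_outliers(data))` on data drawn from two well-separated normal distributions plus a stray extreme value should recover means close to the two true centers, because the extreme value is removed before k-means places the starting centers.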
Normal mixture regression models are one of the most important statistical tools for analyzing data from a heterogeneous population. Over the last two decades, the skew-normal distribution has been shown to be beneficial for data sets involving asymmetric outcomes in various theoretical and applied problems. In this paper, we propose and study a novel class of models: a skew-normal mixture of joint location, scale and skewness models to analyze heteroscedastic skew-normal data coming from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, an Expectation-Maximization (EM) algorithm for estimating the model parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments. Results from the analysis of a real Body Mass Index (BMI) data set are presented.
A three-dimensional (3D) lattice model for predicting the rheological behavior of asphalt mixtures is presented. In this model, asphalt mixtures are described as a two-phase composite material consisting of asphalt sand and randomly distributed coarse aggregates. Asphalt sand is regarded as a viscoelastic material and aggregates as an elastic material. The rheological response of the asphalt mixture subjected to different constant stresses was simulated. The calibrated overall creep strain shows a good approximation to experimental results.
The appearance of a face is severely altered by illumination conditions, which makes automatic face recognition a challenging task. In this paper we propose a Gaussian Mixture Model (GMM)-based human face identification technique built in the Fourier or frequency domain that is robust to illumination changes and, unlike many existing methods, does not require “illumination normalization” (removal of illumination effects) prior to application. The importance of the Fourier domain phase in human face identification is a well-established fact in signal processing. A maximum a posteriori (MAP) estimate based on the posterior likelihood is used to perform identification, achieving misclassification error rates as low as 2% on a database that contains images of 65 individuals under 21 different illumination conditions. Furthermore, a misclassification rate of 3.5% is observed on the Yale database with 10 people and 64 different illumination conditions. Both sets of results are significantly better than those obtained from traditional PCA and LDA classifiers. Statistical analysis pertaining to model selection is also presented.
MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard the Terra (EOS AM) and Aqua (EOS PM) satellites. Linear spectral mixture models are applied to MODIS data for the sub-pixel classification of land covers. Shaoxing county of Zhejiang Province in China was chosen as the study site and early rice was selected as the study crop. The proportions of land covers derived from MODIS pixels using linear spectral mixture models were compared with an unsupervised classification derived from TM data acquired on the same day. The comparison implies that MODIS data could be used as a satellite data source for rice cultivation area estimation, and possibly for rice growth monitoring and yield forecasting, on the regional scale.
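A linear spectral mixture model treats each pixel's reflectance as a weighted sum of pure land-cover spectra (endmembers), with the weights being the cover fractions. The sketch below shows the simplest two-endmember case solved by least squares. It is an illustration only: the function name, the two-endmember restriction, and the clipping to [0, 1] are assumptions made here, not details given in the abstract.

```python
def unmix_two_endmembers(pixel, end_a, end_b):
    """Least-squares fraction f of endmember A in a pixel, assuming
    pixel ~= f * end_a + (1 - f) * end_b across the spectral bands."""
    diff = [a - b for a, b in zip(end_a, end_b)]
    resid = [p - b for p, b in zip(pixel, end_b)]
    # Project (pixel - end_b) onto (end_a - end_b).
    f = sum(d * r for d, r in zip(diff, resid)) / sum(d * d for d in diff)
    # Fractions are physically constrained to lie in [0, 1].
    return min(1.0, max(0.0, f))
```

For a pixel built as 30% of one spectrum and 70% of another, the function returns a fraction of about 0.3 for the first endmember; with more than two endmembers the same idea becomes a constrained least-squares problem over all fractions.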
Reliable process monitoring is important for ensuring process safety and product quality. A production process is generally characterized by multiple operation modes, and monitoring these multimodal processes is challenging. Most multimodal monitoring methods rely on the assumption that the modes are independent of each other, which may not be appropriate for practical application. This study proposes a transition-constrained Gaussian mixture model method for efficient multimodal process monitoring. This technique can reduce falsely and frequently occurring mode transitions by considering the time series information in the mode identification of historical and online data. This process enables the identified modes to reflect the stability of actual working conditions, improves mode identification accuracy, and enhances monitoring reliability in cases of mode overlap. Case studies on a numerical simulation example and a simulation of the penicillin fermentation process are provided to verify the effectiveness of the proposed approach in multimodal process monitoring with mode overlap.
Funding (diameter distribution function study): National Key Research and Development Program of China (No. 2022YFD2200503-02).
Funding (blast furnace monitoring study): National Natural Science Foundation of China (61903326, 61933015).
Funding: Funded by the Science and Technology Research Program of Chongqing Municipal Education Commission (grant number KJZD-K202100705) and the Talents Program Supply System of Chongqing (grant number cstc2022ycjhbgzxm0080).
Abstract: The relationship between the water content or saturation of an unsaturated soil and its matric suction is commonly described by the soil-water characteristic curve (SWCC). Current research on SWCC models focuses on fine-grained soils such as clay and silty soils, whereas the SWCC model for grinding soil-rock mixture (SRM) is less studied. Considering that the SRM is in a certain compaction state in actual projects, this study established a three-variable surface model coupling compaction degree, matric suction, and moisture content, based on the Cavalcante-Zornberg soil-water characteristic curve model. The influence of each fitting parameter on the curve was then analyzed. A soil-water characteristic test was conducted on a common SRM, and the experimental measurements are highly consistent with the model surface. The analysis shows that the surface model intuitively describes the soil-water characteristics of grinding SRM and can provide the SWCC of soils with bimodal pore characteristics under specific compaction degrees. Furthermore, it reflects the influence of compaction degree on the SWCC of rock-soil mass and has some predictive ability. The SWCCs of SRMs with various soil-rock ratios have a double-step shape. As the compaction degree increases, the curves as a whole shift toward lower mass moisture content. The changes in the curves are mainly concentrated in the large-pore section.
Funding: Supported by grants from the Natural Science Foundation of Hubei Province (No. 2020CFB780) and the Fundamental Research Funds for the Central Universities (No. 2017KFYXJJ020).
Abstract: Objective: Body fluid mixtures are complex biological samples that frequently occur at crime scenes and can provide important clues for criminal case analysis. DNA methylation assays have been applied to the identification of human body fluids and have performed very well in predicting single-source body fluids. The present study aims to develop a methylation SNaPshot multiplex system for body fluid identification and to accurately predict mixture samples. In addition, the value of DNA methylation in the prediction of body fluid mixtures was further explored. Methods: In the present study, 420 samples of body fluid mixtures and 250 samples of single body fluids were tested using an optimized multiplex methylation system. Each kind of body fluid sample presented a specific methylation profile across the 10 markers. Results: Significant differences in methylation levels were observed between the mixtures and single body fluids. For all kinds of mixtures, Spearman’s correlation analysis revealed a significant, strong correlation between the methylation levels and component proportions (1:20, 1:10, 1:5, 1:1, 5:1, 10:1 and 20:1). Two random forest classification models were trained, one for the prediction of mixture types and one for the prediction of the mixture proportion of 2 components, based on the methylation levels of the 10 markers. For the mixture prediction, Model-1 reached an accuracy of 99.3% in 427 training samples and 100% in 243 independent test samples. For the mixture proportion prediction, Model-2 reached an accuracy of 98.8% in 252 training samples and 98.2% in 168 independent test samples. The total prediction accuracy reached 99.3% for body fluid mixtures and 98.6% for the mixture proportions. Conclusion: These results indicate the excellent capability and value of the multiplex methylation system in the identification of forensic body fluid mixtures.
Abstract: Background: Workplace violence (WV) towards psychiatric staff has commonly been associated with Posttraumatic Stress Disorder (PTSD). However, prospective studies have shown that not all psychiatric staff who experience workplace violence develop post-traumatic stress. Purpose: We examine the longitudinal trajectories of PTSD in this population to identify possible subgroups that might be more at risk. Furthermore, we investigate whether certain risk factors of PTSD predict membership in the subgroups. Method: In a sample of psychiatric staff from 18 psychiatric wards in Denmark who had reported an incident of WV, we used Latent Growth Mixture Modelling (LGMM) followed by logistic regression analysis. Results: We found three separate PTSD trajectories: a recovering, a delayed-onset, and a moderate-stable trajectory. Higher social support and negative cognitive appraisals about oneself, the world and self-blame predicted membership in the delayed-onset trajectory, while higher social support and lower accept coping predicted membership in the delayed-onset trajectory. Conclusion: Although most psychiatric staff go through a natural recovery, it is important to be aware of and identify staff members who might be struggling long-term. Greater focus on the factors that might predict these groups should be an important task for psychiatric departments in order to prevent work-related posttraumatic symptomatology.
Abstract: Cyber losses, measured as the number of records breached in cyber incidents, commonly feature a significant portion of zeros together with distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses with a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that can simultaneously model zeros, moderate losses and large losses and account for heterogeneous effects in the mixture components. To apply the proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis, and use a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses to capture the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. In combination with our frequency model (a generalized linear mixed model) for data breaches, aggregate loss distributions are investigated, and applications to cyber insurance pricing and risk management are discussed.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 50805065, 51075179).
Abstract: Highly versatile machines, such as wheel loaders, forklifts, and mining haulers, are subject to many kinds of working conditions, as well as indefinite factors that make the load complex. The load probability distribution function (PDF) of transmission gears has many distribution centers; thus, it cannot be well represented by a single-peak function. To represent the distribution characteristics of this complicated phenomenon accurately, this paper proposes a novel method for establishing a mixture model. Based on linear regression models and correlation coefficients, the proposed method can automatically select the best-fitting function in the mixture model. The coefficient of determination, the mean square error, and the maximum deviation are used as criteria to describe the fitting precision between the theoretical distribution and the corresponding histogram of the available load data. The applicability of this modeling method is illustrated with field testing data from a wheel loader, and the load spectra based on the mixture model are compiled. The comparison results show that the mixture model is more suitable for describing the load-distribution characteristics. The proposed method improves the flexibility and intelligence of modeling, reduces the statistical error, and enhances the fitting accuracy, and the load spectra compiled by this method better reflect the actual load characteristics of the gear component.
Funding: Supported by the National Science and Technology Major Program of the Ministry of Science and Technology (No. 2018ZX03001031), the Key Program of Beijing Municipal Natural Science Foundation (No. L172030), the Beijing Municipal Science & Technology Commission Project (No. Z171100005217001), the Key Project of State Key Lab of Networking and Switching Technology (NST20170205), and the National Key Technology Research and Development Program of the Ministry of Science and Technology of China (No. 2012BAF14B01).
Abstract: The cluster-based channel model is the mainstream model for fifth-generation mobile communications, so the accuracy of the clustering algorithm is important. The traditional Gaussian mixture model (GMM) does not consider power information, which is important for channel multipath clustering. In this paper, a normalized power-weighted GMM (PGMM) is introduced to model the channel multipath components (MPCs). With MPC power as a weighting factor, the PGMM can fit the MPCs in accordance with cluster-based channel models. First, the expectation maximization (EM) algorithm is employed to optimize the PGMM parameters. Then, to further increase the search ability of EM and choose the optimal number of components without resorting to cross-validation, variational Bayesian (VB) inference is employed. Finally, 28 GHz indoor channel measurement data are used to demonstrate the effectiveness of the PGMM clustering algorithm.
Funding: This project was supported by the National Natural Science Foundation of China (60404022) and the Foundation of Department of Education of Hebei Province (2002209).
Abstract: Mixture-model-based image segmentation, which assumes that image pixels are independent and does not consider the positional relationship between pixels, is not robust to noise and usually leads to misclassification. A new segmentation method, called the multi-resolution Gaussian mixture model method, is proposed. First, an image pyramid is constructed and a son-father link relationship is built between the levels of the pyramid. Then the mixture model segmentation method is applied to the top level. The segmentation result at the top level is passed top-down to the bottom level according to the son-father link relationship between levels. Because the proposed method considers both local and global image information, it overcomes the effect of noise and obtains better segmentation results. Experimental results demonstrate its effectiveness.
Funding: Supported by the National Key R&D Program of China (2018YFC1508503).
Abstract: The stability of soil-rock mixtures (SRMs), which are widely distributed in slopes, is of significant concern for slope safety evaluation and disaster prevention. The failure behavior of SRM slopes under surface loading was investigated through a series of centrifuge model tests considering various volumetric gravel contents. The displacement field of the slope was determined with an image-based displacement system to observe the deformation of the soil and the movement of the blocks during loading. The test results showed that the ultimate bearing capacity and the stiffness of SRM slopes increased markedly when the volumetric block content exceeded a threshold value. Moreover, there were more evident slips around the blocks in the SRM slope. The microscopic analysis of block motion showed that the rotation of the blocks could aggravate deformation localization and facilitate the development of the slip surface. The high correlation between the rotation of the key blocks and the slope failure indicated that the blocks became the dominant load-bearing medium influencing the slope failure. The blocks in the sliding body formed a chain that bore the load and changed the displacement distribution of the adjacent matrix sand through block rotation.
Funding: Supported by the National Key Natural Science Foundation of China (Grant No. 50635010).
Abstract: The currently prevalent machine performance degradation assessment techniques estimate a machine's current condition by recognizing indications of failure features, which requires complete data collected under different conditions. However, failure data are always hard to acquire, which makes those techniques hard to apply. In this paper, a novel method that does not need failure history data is introduced. Wavelet packet decomposition (WPD) is used to extract features from raw signals, principal component analysis (PCA) is used to reduce feature dimensions, and a Gaussian mixture model (GMM) is then applied to approximate the feature space distributions. A single-channel confidence value (SCV) is calculated from the overlap between the GMM of the monitored condition and that of the normal condition, and indicates the performance of a single channel. Furthermore, a multi-channel confidence value (MCV), which can be regarded as the overall performance index across channels, is calculated via logistic regression (LR), which also accomplishes decision-level sensor fusion. Both SCV and MCV can serve as a basis for proactive maintenance measures, thus preventing machine breakdown. The method has been used to assess the performance of the turbine of a centrifugal compressor in a Petro-China factory, and the results show that it completes this task effectively. The proposed method has engineering significance for machine performance degradation assessment.
Abstract: An improved approach to J-value segmentation (JSEG) is presented for unsupervised color image segmentation. Instead of a color quantization algorithm, an automatic classification method based on adaptive mean shift (AMS) clustering is used for nonparametric clustering of the image data set. The clustering results are used to construct a Gaussian mixture model (GMM) of the image data for the calculation of the soft J value. The region growing algorithm used in JSEG is then applied to segment the image based on the multiscale soft J-images. Experiments show that combining JSEG with soft classification based on AMS clustering and GMM successfully overcomes the limitations of JSEG and is more robust.
Funding: Support from the National Natural Science Foundation of China (Grant No. 52175130), the Sichuan Science and Technology Program (Grant No. 2021YFS0336), the China Postdoctoral Science Foundation (Grant No. 2021M700693), the 2021 Open Project of Failure Mechanics and Engineering Disaster Prevention, Key Lab of Sichuan Province (Grant No. FMEDP202104), the Fundamental Research Funds for the Central Universities (Grant No. ZYGX2019J035), the Sichuan Science and Technology Innovation Seedling Project Funding Project (Grant No. 2021112), and the Sichuan Special Equipment Inspection and Research Institute (YNJD-02-2020) is gratefully acknowledged.
Abstract: Actual engineering systems are inevitably affected by uncertain factors. Thus, Reliability-Based Multidisciplinary Design Optimization (RBMDO) has become a hotspot of recent research and application in complex engineering system design. The Second-Order and First-Order Mean-Value Saddlepoint Approximation (SOMVSA/FOMVSA) are two popular reliability analysis strategies that are widely used in RBMDO. However, the SOMVSA method can only be used efficiently when the input variables follow a Gaussian distribution, which significantly limits its application. In this study, the Gaussian Mixture Model-based Second-Order Mean-Value Saddlepoint Approximation (GMM-SOMVSA) is introduced to tackle this problem. It is integrated with the Collaborative Optimization (CO) method to solve RBMDO problems. Furthermore, the formula and procedure of RBMDO using GMM-SOMVSA-based CO (GMM-SOMVSA-CO) are proposed. Finally, an engineering example is given to show the application of the GMM-SOMVSA-CO method.
Abstract: The EM algorithm is a very popular maximum likelihood estimation method: an iterative algorithm for computing the maximum likelihood estimator when the observed data are incomplete, and a very effective algorithm for estimating the parameters of finite mixture models. However, the EM algorithm is not guaranteed to find the global optimum and often falls into a local optimum, so it is sensitive to the choice of initial values for the iteration. The traditional EM algorithm selects the initial values at random; we propose an improved method for selecting them. First, we use the k-nearest-neighbor method to delete outliers. Second, we use k-means to initialize the EM algorithm. Numerical experiments comparing this method with the original random initialization show that parameter estimation with the proposed initialization is significantly better than with the original EM algorithm.
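The k-means initialization described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it fits a one-dimensional Gaussian mixture, omits the k-nearest-neighbor outlier deletion step, and the function names (`kmeans_1d`, `em_gmm_1d`) and the synthetic sample are hypothetical.

```python
import math
import random


def kmeans_1d(data, k, iters=50):
    """Plain Lloyd's k-means on scalars; returns one cluster label per point."""
    srt = sorted(data)
    # Deterministic start: spread the initial centers over the sorted sample.
    centers = [srt[(2 * j + 1) * len(srt) // (2 * k)] for j in range(k)]
    labels = [0] * len(data)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in data]
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, k=2, iters=200, tol=1e-8):
    """EM for a 1-D Gaussian mixture, initialised from k-means clusters."""
    n = len(data)
    labels = kmeans_1d(data, k)
    weights, means, variances = [], [], []
    for j in range(k):
        # Fall back to the whole sample if a cluster happens to be empty.
        members = [x for x, lab in zip(data, labels) if lab == j] or list(data)
        mu = sum(members) / len(members)
        var = sum((x - mu) ** 2 for x in members) / len(members)
        weights.append(len(members) / n)
        means.append(mu)
        variances.append(max(var, 1e-6))
    total_w = sum(weights)
    weights = [w / total_w for w in weights]

    prev_ll = -math.inf
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp, ll = [], 0.0
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
            ll += math.log(total)
        # M-step: responsibility-weighted updates of the parameters.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)
            weights[j] = nj / n
        # Stop once the log-likelihood no longer improves noticeably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, variances


# Hypothetical synthetic sample: two well-separated Gaussian groups.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(6.0, 1.0) for _ in range(300)]
print(em_gmm_1d(sample, k=2))
```

Starting EM from the k-means clusters gives each component a sensible initial mean, variance and weight, which is the motivation the abstract gives for replacing random initial values.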
Funding: Supported by the National Natural Science Foundation of China (11261025, 11561075), the Natural Science Foundation of Yunnan Province (2016FB005), and the Program for Middle-aged Backbone Teacher, Yunnan University.
Abstract: Normal mixture regression models are among the most important statistical tools for analyzing data from a heterogeneous population. Over the last two decades, the skew-normal distribution has proved beneficial for dealing with asymmetric data in various theoretical and applied problems, which is relevant when the data set under consideration involves asymmetric outcomes. In this paper, we propose and study a novel class of models: a skew-normal mixture of joint location, scale and skewness models for analyzing heteroscedastic skew-normal data from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, an Expectation-Maximization (EM) algorithm for estimating the model parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments. Results from the analysis of a real Body Mass Index (BMI) data set are presented.
Funding: Project (10672063) supported by the National Natural Science Foundation of China.
Abstract: A three-dimensional (3D) lattice model for predicting the rheological behavior of asphalt mixtures is presented. In this model, asphalt mixtures are described as a two-phase composite material consisting of asphalt sand and randomly distributed coarse aggregates. Asphalt sand is regarded as a viscoelastic material and the aggregates as an elastic material. The rheological response of the asphalt mixture under different constant stresses was simulated. The calibrated overall creep strain is a good approximation to the experimental results.
Abstract: The appearance of a face is severely altered by illumination conditions, which makes automatic face recognition a challenging task. In this paper we propose a Gaussian Mixture Model (GMM)-based human face identification technique built in the Fourier (frequency) domain that is robust to illumination changes and, unlike many existing methods, does not require “illumination normalization” (removal of illumination effects) prior to application. The importance of the Fourier domain phase in human face identification is a well-established fact in signal processing. A maximum a posteriori (MAP) estimate based on the posterior likelihood is used to perform identification, achieving misclassification error rates as low as 2% on a database that contains images of 65 individuals under 21 different illumination conditions. Furthermore, a misclassification rate of 3.5% is observed on the Yale database with 10 people and 64 different illumination conditions. Both sets of results are significantly better than those obtained from traditional PCA and LDA classifiers. Statistical analysis pertaining to model selection is also presented.
Abstract: MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard the Terra (EOS AM) and Aqua (EOS PM) satellites. Linear spectral mixture models were applied to MODIS data for the sub-pixel classification of land covers. Shaoxing County in Zhejiang Province, China, was chosen as the study site and early rice as the study crop. The land cover proportions derived from MODIS pixels using linear spectral mixture models were compared with an unsupervised classification derived from TM data acquired on the same day. The comparison implies that MODIS data could be used as a satellite data source for estimating rice cultivation area, and possibly for rice growth monitoring and yield forecasting at the regional scale.
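As a rough illustration of what a linear spectral mixture model computes, the sketch below unmixes a pixel into two land-cover fractions. It assumes the simplest case, which the abstract does not specify: two endmembers, fractions constrained to sum to one and lie in [0, 1], and a least-squares fit. The function name `unmix_two_endmembers` and the reflectance values are hypothetical, chosen only for the example.

```python
def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Least-squares fractions (f_a, f_b) with f_a + f_b = 1 and 0 <= f <= 1.

    Model: pixel = f_a * endmember_a + (1 - f_a) * endmember_b + error,
    where each argument is a list of reflectances, one value per band.
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmembers must differ in at least one band")
    # Project (pixel - endmember_b) onto (endmember_a - endmember_b).
    f_a = sum((p - b) * d for p, b, d in zip(pixel, endmember_b, diff)) / denom
    # Clip to the physically meaningful range of area fractions.
    f_a = min(1.0, max(0.0, f_a))
    return f_a, 1.0 - f_a


# Hypothetical per-band reflectances for two land covers (illustrative only).
rice = [0.05, 0.08, 0.04, 0.45]
other = [0.10, 0.12, 0.14, 0.20]
# A synthetic mixed pixel that is 30% rice and 70% other cover.
mixed = [0.3 * r + 0.7 * o for r, o in zip(rice, other)]
print(unmix_two_endmembers(mixed, rice, other))
```

With more than two land covers the same idea becomes a constrained least-squares problem over all endmember fractions, which is usually solved numerically rather than in closed form.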
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61973119 and 61603138, in part by the Shanghai Rising-Star Program under Grant 20QA1402600, in part by the Open Funding from Shandong Key Laboratory of Big-data Driven Safety Control Technology for Complex Systems under Grant SKDN202001, and in part by the Programme of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017.
Abstract: Reliable process monitoring is important for ensuring process safety and product quality. A production process is generally characterized by multiple operation modes, and monitoring these multimodal processes is challenging. Most multimodal monitoring methods rely on the assumption that the modes are independent of each other, which may not be appropriate in practical applications. This study proposes a transition-constrained Gaussian mixture model method for efficient multimodal process monitoring. The technique reduces false and overly frequent mode transitions by considering time series information in the mode identification of historical and online data. This enables the identified modes to reflect the stability of actual working conditions, improves mode identification accuracy, and enhances monitoring reliability in cases of mode overlap. Case studies on a numerical simulation example and a simulation of the penicillin fermentation process are provided to verify the effectiveness of the proposed approach in multimodal process monitoring with mode overlap.