Funding: The National Natural Science Foundation of China (No. 61105048, 60972165); the Doctoral Fund of Ministry of Education of China (No. 20110092120034); the Natural Science Foundation of Jiangsu Province (No. BK2010240); the Technology Foundation for Selected Overseas Chinese Scholar, Ministry of Human Resources and Social Security of China (No. 6722000008); the Open Fund of Jiangsu Province Key Laboratory for Remote Measuring and Control (No. YCCK201005).
Abstract: An improved Gaussian mixture model (GMM)-based clustering method is proposed for the difficult case where the true distribution of the data deviates from the assumed GMM. First, an improved model selection criterion, the completed likelihood minimum message length criterion, is derived. It measures both the goodness-of-fit of the candidate GMM to the data and the goodness-of-partition of the data. Second, with the proposed criterion as the clustering objective function, an improved expectation-maximization (EM) algorithm is developed, which avoids the poor local optima that the standard EM algorithm can reach when estimating the model parameters. The experimental results demonstrate that the proposed method rectifies the over-fitting tendency of representative GMM-based clustering approaches and robustly provides more accurate clustering results.
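For orientation, the standard EM baseline that the abstract compares against can be sketched as below. This is not the proposed completed likelihood minimum message length method, whose criterion is not given in the abstract. The 1-D setting, the quantile-based initialization, the variance floor, and the function names `em_gmm_1d` and `responsibilities` are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibilities(data, weights, means, variances):
    """E-step: posterior probability of each component for each sample."""
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint) or 1e-300  # guard against underflow to zero
        resp.append([j / total for j in joint])
    return resp


def em_gmm_1d(data, k, iters=50):
    """Plain EM for a k-component 1-D GMM (hypothetical helper, baseline only)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles of the data.
    means = [ordered[int((2 * j + 1) * n / (2 * k))] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        resp = responsibilities(data, weights, means, variances)
        # M-step: re-estimate weights, means, and variances from soft counts.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # variance floor (assumption)

    # Hard cluster labels from the final responsibilities.
    final = responsibilities(data, weights, means, variances)
    labels = [max(range(k), key=lambda j: r[j]) for r in final]
    return weights, means, variances, labels
```

A model selection criterion such as the one the abstract proposes would sit on top of a routine like this, comparing fits for different values of `k`.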
Funding: Supported by the National Natural Science Foundation of China (61374140); Shanghai Postdoctoral Sustentation Fund (12R21412600); the Fundamental Research Funds for the Central Universities (WH1214039); Shanghai Pujiang Program (12PJ1402200).
Abstract: Complex industrial processes often have multiple operating modes and present time-varying behavior. The data in one mode may follow specific Gaussian or non-Gaussian distributions. In this paper, a numerically efficient moving-window local outlier probability algorithm is proposed. Its key feature is the capability to handle complex data distributions and incursive operating condition changes, including slow dynamic variations and instant mode shifts. First, a two-step adaptation approach is introduced, and designed updating rules are applied to keep the monitoring model up to date. Then, a semi-supervised monitoring strategy is developed with an updating switch rule to deal with mode changes. Based on local probability models, the algorithm has a superior ability to detect faulty conditions and to adapt quickly to slow variations and new operating modes. Finally, the utility of the proposed method is demonstrated with a numerical example and a non-isothermal continuous stirred tank reactor.
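The idea of scoring each new sample against a moving window with a local outlier probability can be sketched as follows. The scoring uses the commonly cited LoOP formulation (probabilistic set distance, PLOF, and an error-function normalization). The abstract does not specify the two-step adaptation or the updating switch rule, so the simple rule used here (only unflagged samples enter the window) is an assumption, as are the names `loop_scores` and `monitor` and all parameter defaults.

```python
import math
from collections import deque


def loop_scores(points, k=5, lam=3.0):
    """Local outlier probability in [0, 1] for every point (needs len(points) > k)."""
    n = len(points)
    # k nearest neighbours of each point, excluding the point itself.
    neigh = []
    for i in range(n):
        d = sorted((math.dist(points[i], p), j) for j, p in enumerate(points) if j != i)
        neigh.append([j for _, j in d[:k]])
    # Probabilistic set distance of each point to its neighbourhood.
    pdist = [
        lam * math.sqrt(sum(math.dist(points[i], points[j]) ** 2 for j in neigh[i]) / k)
        for i in range(n)
    ]
    # PLOF: how much larger a point's pdist is than that of its neighbours.
    plof = []
    for i in range(n):
        expected = sum(pdist[j] for j in neigh[i]) / k
        plof.append(pdist[i] / max(expected, 1e-12) - 1.0)
    # Normalization factor aggregated over the whole window.
    nplof = max(lam * math.sqrt(sum(v * v for v in plof) / n), 1e-12)
    return [max(0.0, math.erf(v / (nplof * math.sqrt(2.0)))) for v in plof]


def monitor(stream, window=50, k=5, lam=3.0, threshold=0.9):
    """Flag samples whose LoOP score against the current window exceeds threshold."""
    buf = deque(maxlen=window)
    flags = []
    for x in stream:
        if len(buf) <= k:
            # Warm-up: not enough reference samples to score yet.
            buf.append(x)
            flags.append(False)
            continue
        score = loop_scores(list(buf) + [x], k, lam)[-1]
        is_fault = score > threshold
        flags.append(is_fault)
        if not is_fault:
            buf.append(x)  # assumed rule: only normal samples update the model
    return flags
```

Because the normalization is computed over the window, the highest attainable score depends on the window length; with `lam=3.0`, short windows cannot reach a threshold of 0.9, which is one reason the window size and threshold need to be chosen together.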
Funding: Project supported by the Office of Naval Research (ONR) Grants (No. ONR DRI N00014-10-1-0554) and the DOD-MURI award "Physics Constrained Stochastic-Statistical Models for Extended Range Environmental Prediction".
Abstract: Turbulent dynamical systems involve dynamics with both a large-dimensional phase space and a large number of positive Lyapunov exponents. Such systems are ubiquitous in applications in contemporary science and engineering, where statistical ensemble prediction and real-time filtering/state estimation are needed despite the underlying complexity of the system. Statistically exactly solvable test models play a crucial role in providing firm mathematical underpinning or new algorithms for vastly more complex scientific phenomena. Here, a class of statistically exactly solvable non-Gaussian test models is introduced, where a generalized Feynman-Kac formulation reduces the exact behavior of conditional statistical moments to the solution of inhomogeneous Fokker-Planck equations modified by linear lower-order coupling and source terms. This procedure is applied to a test model with hidden instabilities and is combined with information theory to address two important issues in the contemporary statistical prediction of turbulent dynamical systems: coarse-grained ensemble prediction in a perfect model and improving long-range forecasting in imperfect models. The models discussed here should be useful for many other applications and algorithms for real-time prediction and state estimation.